Description
Static Cache Wrangler – Headless Assistant
Static Cache Wrangler – Headless Assistant is a companion plugin for Static Cache Wrangler that converts cached HTML files into formats that headless CMS platforms can import. It requires WP-CLI and is intended for developers and administrators who are comfortable working in shell environments with traditional Linux stream-processing commands (e.g. sed, grep, awk, sort) and are willing to explore the WordPress command-line interface. The plugin provides composable command-line tooling and IS NOT a point-and-click solution.
Testing Results
Tested on cachewrangler.com (15-page WordPress site):
- ☑ 74% semantic conversion rate – 564 blocks converted to structured content
- ☑ 15 pages converted successfully to Sanity format
- ☑ 763 links preserved with proper structure and references
- ☑ 36 images tracked with migration metadata
- ☑ 14 accordions converted semantically
- ☑ 13 tables converted semantically
Top Conversion Rates Achieved:
* Simple pages: 86% semantic conversion
* Complex pages with mixed content: 74-82%
* Ultimate test page (141 blocks, 23 different Kadence block types): 52%
What Falls Back to HTML (requires custom hooks):
* Navigation menus (by design – preserves styling)
* Advanced Kadence blocks (countdown, forms, testimonials, maps, etc.)
* Premium block libraries (Otter, Spectra – requires additional detectors)
Extensible Architecture
Unlike hardcoded solutions, this plugin uses a pluggable engine system where CMS targets can be registered via filters. Ships with Sanity® CMS support (unofficial) out of the box. It is technically feasible to target Contentful, Strapi, and others via extensions.
Sanity® is a registered trademark of Sanity.io. This project is not affiliated with or endorsed by Sanity.io.
Features
Direct Sanity CMS Conversion:
* WordPress → Sanity NDJSON export
* Pattern detection for Gutenberg blocks
* Schema generation for Sanity Studio
* Asset tracking and manifest
* Command: wp scw-headless convert --cms=sanity
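Because the export is newline-delimited JSON, each line of the exported file is one document, so ordinary stream tools can sanity-check an export before it is imported. The sketch below counts documents per `_type`. The sample records, the `page` type, and the helper names `sample_ndjson` and `count_types` are made up for illustration, and the script assumes the document-level `_type` is the first `_type` key on each line, which a real export may not guarantee.

```shell
# Hypothetical sample records standing in for a real export file.
sample_ndjson() {
cat <<'EOF'
{"_id":"page-home","_type":"page","title":"Home"}
{"_id":"page-about","_type":"page","title":"About"}
{"_id":"post-hello","_type":"post","title":"Hello"}
EOF
}

# Count documents per type, using the first "_type" key found on each line.
# Reads the files given as arguments, or stdin when none are given.
count_types() {
  awk '
    match($0, /"_type"[ \t]*:[ \t]*"[^"]*"/) {
      field = substr($0, RSTART, RLENGTH)        # e.g. "_type":"page"
      sub(/^"_type"[ \t]*:[ \t]*"/, "", field)   # strip the key and opening quote
      sub(/"$/, "", field)                       # strip the closing quote
      count[field]++
    }
    END { for (t in count) print count[t], t }
  ' "$@" | LC_ALL=C sort -k1,1nr -k2,2
}

sample_ndjson | count_types
# prints:
# 2 page
# 1 post
```

Against a real export, the same function could be pointed at the NDJSON file directly, for example `count_types data.ndjson`.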
Smart Pattern Detection:
* 12 core Gutenberg patterns
* 28 Kadence Blocks patterns
* XPath-based detection with confidence scoring
* Priority-based matching for nested structures
* Pattern inheritance system
CLI-First Experience:
* wp scw-headless scan – View cached files
* wp scw-headless analyze <file> – Detect patterns
* wp scw-headless convert --cms=sanity – Export to Sanity
* wp scw-headless patterns – List registered patterns
* wp scw-headless detectors – Show detector modules
* wp scw-headless targets – List available CMS platforms
* wp scw-headless info – Show plugin statistics
Future Roadmap Considerations
Generic Portable Text Output:
* CMS-agnostic JSON format
* Support for any Portable Text consumer
* Command: wp scw-headless normalize
* Non-WP analyzer and converter tooling
Advanced Pattern Detection:
* 40+ patterns including Kadence Blocks
* ACF field support (roadmap)
* Page builder compatibility (roadmap)
* Custom pattern registration
Multi-CMS Support:
* Sanity (today)
* Strapi (horizon)
* Contentful (horizon)
* Payload CMS (horizon)
* Any Portable Text consumer (roadmap)
Learn more about planned features
Funding Model
* This plugin is 100% free (true WordPress style)
* Want to make a donation? Consider purchasing a copy of the author’s book on command-line interfaces for yourself or as a gift.
Pattern Detection System
Built-in Detector Modules:
Gutenberg Core – 12 patterns:
* heading, paragraph, image, gallery, video
* list (ordered/unordered), quote, code
* button, buttons, separator, table
Kadence Blocks – 28 patterns:
* accordion, tabs, advanced_button, progress_bar
* icon_list, infobox, countdown, rowlayout
* column, advanced_heading, form, testimonials
* posts, table_of_contents, google_maps, lottie
* image, video_popup, advanced_gallery, navigation
* icon, spacer, show_more, search, identity
* table, vector, countup
Extensible via Filters:
    // Register custom patterns
    add_action('stcw_headless_patterns_loaded', function() {
        \STCW\Headless\Engine\Detector\PatternRegistry::register('custom_block', [
            'selectors'  => ['.my-custom-block'],              // CSS selectors to match
            'extractor'  => [MyExtractor::class, 'extract'],   // callable that parses matched nodes
            'priority'   => 8,                                 // 10 = highest
            'confidence' => 0.95,                              // 0.0-1.0
        ]);
    });
How It Works
- Cache your WordPress site with Static Cache Wrangler
- Scan cached files: wp scw-headless scan
- Analyze patterns: wp scw-headless analyze /page/
- Convert:
  - Free: wp scw-headless convert --cms=sanity
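The scan, analyze, and convert steps above can be chained in a small shell function. This is a minimal sketch rather than anything the plugin ships: the function name `stcw_headless_run` and the `WP` variable are illustrative assumptions, and it presumes the site has already been cached in the first step.

```shell
# WP is the WP-CLI binary; it is overridable so the function can be dry-run with a stub.
WP="${WP:-wp}"

# Run the documented steps in order for one page path (default: the homepage),
# stopping at the first command that exits non-zero.
stcw_headless_run() {
  page="${1:-/}"
  "$WP" scw-headless scan || return 1
  "$WP" scw-headless analyze "$page" || return 1
  "$WP" scw-headless convert --cms=sanity
}
```

Calling `stcw_headless_run /page/` runs the three commands in sequence, so a failed scan or analysis never reaches the conversion step.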
Supported CMS Targets
Included:
* Sanity CMS – Full Portable Text conversion with schema generation
Perfect For
- Migrating WordPress content to headless CMS platforms
- JAMstack architecture with WordPress as authoring tool
- SEO component analysis
- UI pattern analysis
Requirements
- WordPress 6.0 or higher
- PHP 7.4+ (PHP 8.x fully supported)
- Static Cache Wrangler 2.0.5+ (must be installed and active)
- WP-CLI recommended for best experience
- Pattern Library Pro for enterprise features (optional)
Pattern Analysis
File: index.html (104 KB)
Patterns Found: 71
paragraph 20 Confidence: 1.00
heading 15 Confidence: 1.00
separator 14 Confidence: 1.00
kadence_button 6 Confidence: 0.90
kadence_accordion 2 Confidence: 0.95
…
Confidence Distribution:
High (≥0.95): 60
Medium (0.85+): 11
Low (<0.85): 0
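The confidence distribution is a simple bucketing of per-pattern counts by score. As a rough sketch, the awk function below applies the same three thresholds to the rows visible in the excerpt. It assumes whitespace-separated rows in the form `name count Confidence: score`, which may not match the real CLI layout exactly, and the helper name `bucket_confidence` is made up. Because the excerpt elides some rows, the totals here are smaller than the 60 and 11 reported above.

```shell
# Sum pattern counts into High / Medium / Low buckets by confidence score.
# Reads the files given as arguments, or stdin when none are given.
bucket_confidence() {
  awk '
    $3 == "Confidence:" {
      if      ($4 >= 0.95) high   += $2
      else if ($4 >= 0.85) medium += $2
      else                 low    += $2
    }
    END {
      printf "High (>=0.95): %d\n", high
      printf "Medium (0.85+): %d\n", medium
      printf "Low (<0.85): %d\n", low
    }
  ' "$@"
}

# The five rows shown in the excerpt (elided rows are not reconstructed).
bucket_confidence <<'EOF'
paragraph 20 Confidence: 1.00
heading 15 Confidence: 1.00
separator 14 Confidence: 1.00
kadence_button 6 Confidence: 0.90
kadence_accordion 2 Confidence: 0.95
EOF
# prints:
# High (>=0.95): 51
# Medium (0.85+): 6
# Low (<0.85): 0
```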
How do I get support?
- Free users: GitHub Issues
- Documentation: wp2headless.com/docs
Why are some files showing as 140 B?
Static Cache Wrangler may create gzipped files or use compression. The plugin handles this automatically by reading the actual index.html files within cached directories.
Additional Information
Architecture
Engine Components:
* Scanner – Finds cached HTML files
* Normalizer – Cleans HTML while preserving structure
* Pattern Registry – Centralized pattern definitions with inheritance
* Pattern Detector – XPath-based pattern matching engine
* Extractors – DOM-to-data conversion functions
* Parser – Orchestrates normalization → detection → extraction
* Converter – Transforms to target CMS format
* Target Registry – Pluggable CMS target management
Data Flow:
Cached HTML → Normalizer → Pattern Detector → Extractors → Converter → Export
Support
- Documentation: https://wp2headless.com/documentation/
- GitHub: https://github.com/derickschaefer/stcw-headless-assistant
- Issues: https://github.com/derickschaefer/stcw-headless-assistant/issues
Roadmap
v2.2.0 (Q1 2026) – Quality Focus
* Improve semantic conversion to 85%+ (currently 74%)
* Wire up remaining Kadence extractors (button, icon list, maps)
* Enhance navigation menu handling
* Add pattern detection validators
* 80%+ test coverage
v2.3.0 (Q2 2026) – More Patterns
* Elementor widgets detection
* Beaver Builder modules
* ACF field mapping
* Custom post type support
v2.5.0 (Q3 2026) – Multi-CMS
* TBD based on input, feedback, and demand
Contributing
Contributions welcome!
* Additional CMS target implementations
* Page builder detector modules
* Pattern extraction improvements
* Documentation and examples
* Test coverage
Privacy Policy
This plugin does not collect, store, or transmit any user data. All conversion happens locally on your WordPress installation.
Data Storage:
* No external API calls
* No analytics or tracking
* No cookies used
* Export files stored locally in WordPress uploads directory
* License validation (Pattern Library Pro) stored in wp_options
License
GPL v2 or later – Copyright © 2024-2025 Derick Schaefer
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.
Third-Party Trademark Notices
Content Management Systems:
Sanity® is a registered trademark of Sanity.io
Contentful® is a registered trademark of Contentful GmbH
Strapi® is a registered trademark of Strapi Solutions SAS
Builder.io® is a registered trademark of Builder.io, Inc.
DatoCMS® is a registered trademark of Dato srl
Payload CMS® is a registered trademark of Payload CMS, Inc.
WordPress Ecosystem:
WordPress® and Gutenberg® are registered trademarks of the WordPress Foundation
WP-CLI® is a registered trademark of the WordPress Foundation
Kadence® and Kadence Blocks™ are trademarks of Kadence WP LLC
Elementor® is a registered trademark of Elementor Ltd.
Divi® is a registered trademark of Elegant Themes, Inc.
Advanced Custom Fields® (ACF) is a registered trademark of WP Engine, Inc.
WooCommerce® is a registered trademark of Automattic Inc.
Beaver Builder® is a registered trademark of Beaver Builder
Development Tools:
GitHub® is a registered trademark of GitHub, Inc.
JSON™ is a trademark of JSON.org
Content & Media:
YouTube® is a registered trademark of Google LLC
Lottie™ is a trademark of Airbnb, Inc.
Google Maps™ is a trademark of Google LLC
Project Attribution:
WP2Headless.com is owned by Derick Schaefer
Static Cache Wrangler is developed by Derick Schaefer
Disclaimer:
This plugin is not affiliated with, endorsed by, or sponsored by any of the trademark owners listed above. These names are referenced solely to describe compatibility, integration capabilities, or as examples of headless CMS platforms that may be used with exported content. The plugin has not been tested, approved, or certified by any of these companies or organizations.
Installation
Automatic Installation
- Install and activate Static Cache Wrangler
- Install (or confirm installation) WP-CLI
- Search “STCW Headless Assistant” in WordPress plugin directory
- Click “Install Now” and activate
- Generate cached pages by browsing your site
- Use WP-CLI: wp scw-headless info to verify installation
Manual Installation
- Install and activate Static Cache Wrangler first
- Upload the stcw-headless-assistant folder to /wp-content/plugins/
- Activate via WordPress admin
- Navigate to Static Cache > Headless Assistant
- Run: wp scw-headless scan to verify setup
Recommended Setup
- Install WP-CLI (for CLI support)
- Install Static Cache Wrangler (for HTML caching)
- Install STCW Headless Assistant (this plugin)
- Cache your site: wp scw enable
- Test conversion: wp scw-headless analyze /
FAQ
Does this work without Static Cache Wrangler?
No, this is a companion plugin that requires Static Cache Wrangler to be installed and active. It converts the HTML files that Static Cache Wrangler generates.
What WP-CLI commands are available?
Commands:
* wp scw-headless info – Show system status and statistics
* wp scw-headless scan – List all cached HTML files ready for conversion
* wp scw-headless analyze <file> – Detect patterns in a specific file
* wp scw-headless patterns – List all registered patterns with confidence scores
* wp scw-headless detectors – Show registered detector modules
* wp scw-headless convert --cms=sanity – Convert all files to Sanity format
* wp scw-headless targets – List available CMS targets
All commands support --format=json for automation.
What gets exported?
Current (Sanity conversion):
The plugin creates a complete export package containing:
* data.ndjson – Sanity import data in newline-delimited JSON format
* asset-manifest.json – Asset references with URLs and metadata
* schemas/ – Sanity Studio schema definitions
* README.md – Import instructions for Sanity
* .zip archive – Complete package for download
Export packages are saved in wp-content/cache/stcw-headless-exports/.
How does pattern detection work?
The plugin uses a sophisticated multi-phase pipeline:
- HTML Normalization – Strips WordPress-specific classes, IDs, and attributes while preserving semantic HTML
- Pattern Detection – Uses XPath queries to find registered patterns (CSS selectors converted to XPath)
- Priority Sorting – Processes patterns by priority (10=highest) to handle nested blocks correctly
- Confidence Scoring – Each match includes confidence score (0.0-1.0) based on selector specificity
- Content Extraction – Registered extractor functions parse matched DOM nodes into structured data
- Conversion – Transform to target CMS format
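Two of the steps above, selector-to-XPath conversion and priority ordering, can be illustrated with plain shell. This is a sketch of the general technique and not the plugin's actual converter: `class_to_xpath` handles only a single class selector using the widely used `contains(concat(...))` idiom, `by_priority` assumes hypothetical rows of the form `name priority confidence`, and both helper names are invented here. Apart from the `custom_block` row, the priority values in the sample are made up.

```shell
# Convert a single CSS class selector (".name") to an equivalent XPath expression.
# Padding @class with spaces makes the test match whole class names only.
class_to_xpath() {
  class="${1#.}"    # drop the leading dot
  printf "//*[contains(concat(' ', normalize-space(@class), ' '), ' %s ')]\n" "$class"
}

# Order "name priority confidence" rows so the highest priority comes first,
# breaking ties by the higher confidence score.
by_priority() {
  LC_ALL=C sort -k2,2nr -k3,3nr "$@"
}

class_to_xpath .my-custom-block
# prints: //*[contains(concat(' ', normalize-space(@class), ' '), ' my-custom-block ')]

by_priority <<'EOF'
paragraph 5 1.00
custom_block 8 0.95
kadence_accordion 9 0.95
EOF
# prints:
# kadence_accordion 9 0.95
# custom_block 8 0.95
# paragraph 5 1.00
```

Processing higher-priority patterns first is what lets a container block such as an accordion claim its nodes before a generic pattern matches the paragraphs inside it.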
How accurate is pattern detection?
Production Results (cachewrangler.com):
– Overall semantic conversion: 74%
– Core Gutenberg blocks: 100% accuracy (confidence: 1.00)
– Kadence Blocks: 90-95% accuracy (confidence: 0.90-0.98)
– 40 unique patterns registered
– 763 blocks processed across 15 pages
– 174 blocks fell back to raw HTML (navigation menus, advanced widgets)
Per-Page Results:
– Simple pages: 82-86% conversion
– Mixed content pages: 74-82% conversion
– Complex pages (23 block types): 52% conversion
What if a pattern isn’t detected?
The block falls back to the rawHtml block type with pattern metadata. You can:
– Add custom pattern definitions via filters
– Report missing patterns on GitHub
– Wire up existing extractors (many already exist)
Can I add support for more blocks?
Yes! Three ways:
1. Register patterns via an action hook:
    add_action('stcw_headless_patterns_loaded', function() {
        \STCW\Headless\Engine\Detector\PatternRegistry::register('my_block', [
            'selectors'  => ['.my-block-class'],
            'extractor'  => [MyExtractor::class, 'extract_my_block'],
            'priority'   => 8,
            'confidence' => 0.95,
        ]);
    });
2. Create detector modules (for larger block libraries)
3. Use pattern inheritance:
    // Extend existing patterns
    \STCW\Headless\Engine\Detector\PatternRegistry::register('custom_button', [
        'extends'   => 'button',               // Inherits base button selectors
        'selectors' => ['.my-custom-button'],  // Adds custom selectors
    ]);
Can I add support for other CMS platforms?
Yes! The plugin is designed with a pluggable architecture:
    // Register custom CMS target
    add_action('stcw_headless_register_targets', function() {
        $my_target = new My_CMS_Target();
        \STCW\Headless\Engine\Target\TargetRegistry::register($my_target);
    });
Your target class must implement TargetInterface with methods for convert(), generate_schemas(), and export().
What paths are excluded from scans?
By default, these paths are excluded:
* assets/ – Static assets (CSS, JS, images)
* author/ – Author archives
* category/, tag/ – Taxonomy archives
* index.php/ – WordPress quirks
* feed/, wp-json/ – API endpoints
* sitemap/, 404/ – Utility pages
* Blog index pages (Posts Page in Settings > Reading)
Filter via stcw_headless_excluded_paths to customize.
Why is the blog page not converting?
WordPress “Posts Page” archives (the page set as your blog index in Settings > Reading) are intentionally skipped because they contain dynamic post loops, not static content. Individual blog posts are converted successfully.
To recreate your blog index in Sanity:
1. Import individual posts (automatically converted)
2. Use this GROQ query to fetch posts:
    *[_type == "post"] | order(publishedAt desc) {
      title, slug, excerpt, publishedAt
    }
3. Build your blog index view in your frontend
Does this work with page builders?
Kadence Blocks is currently supported. Support is being considered for Elementor, Otter Blocks, Divi, and more.
What’s the performance?
v2.1.0 Benchmarks (cachewrangler.com test site):
– 15 pages converted in ~6 seconds
– Small page (10 KB): ~0.2 seconds
– Medium page (50 KB): ~0.5 seconds
– Large page (100 KB): ~1.0 seconds
– Batch (100 pages): ~45 seconds
– Pattern detection: XPath-based (efficient)
– Memory: ~20 MB per page
Can I preview before converting?
Yes! Use wp scw-headless analyze <file> to see:
– Patterns detected
– Confidence scores
– Asset references
– Potential issues
– Extraction preview
Example:
    wp scw-headless analyze /plugins/kadence-blocks/ --verbose
Changelog
2.1.0 – January 15, 2026
Major Update: Production-Ready after hours of testing
- Tested: Full production testing on cachewrangler.com (15 pages, 763 blocks)
- Confirmed: 74% semantic conversion rate across mixed content
- Confirmed: 100% link preservation (763 links maintained)
- Confirmed: Zero data loss – all content captured
- Added: Generic Portable Text converter for CMS-agnostic output
- Added: Enterprise feature gating via stcw_headless_is_enterprise filter
- Added: normalize command for generic Portable Text export
- Added: Support for Pattern Library Pro integration
- Added: Asset tracking in generic format with IDs and metadata
- Added: Block type statistics in verbose mode
- Added: --verbose flag support for detailed output
- Added: --output=<path> flag to save JSON to file
- Enhanced: CLI commands with better error messages
- Enhanced: Pattern detection tested with 40 unique patterns
- Enhanced: JSON output structure with version, format, generator
- Fixed: List item handling (array vs string support)
- Fixed: Image deduplication per page
- Fixed: Parser API consistency (parse_file())
- Improved: Export now generates Sanity-native NDJSON format
- Improved: Asset manifest with usage tracking and priority levels
- Improved: Accordion and table semantic conversion
2.0.9 – January 5, 2026
- Enhanced: Pattern detection confidence scoring
- Fixed: Slug deduplication prevents duplicate homepage exports
- Fixed: Parser file path resolution edge cases
- Improved: Normalizer statistics output formatting
2.0.8 – January 2, 2026
- Fixed: Improved file path resolution for all URL formats (/contact/, contact, /)
- Fixed: Scanner now properly excludes junk paths (index.php/, author/admin/)
- Enhanced: Better error messages showing attempted path resolutions
- Enhanced: Homepage / now resolves correctly to index.html
- Added: Filterable path exclusion list via stcw_headless_excluded_paths
2.0.7 – December 30, 2025
- Enhanced: CLI info command shows full parity with admin dashboard
- Added: Plugin version, directory paths, detector count, CMS targets to info output
- Added: Trademark symbol (®) for Sanity CMS throughout UI and CLI
- Enhanced: Better Scanner statistics with formatted file sizes
- Fixed: Admin dashboard cache size label clarity
2.0.6 – December 27, 2025
- Added: Complete Kadence Blocks support – 28 block patterns registered
- Added: Advanced Kadence extractors (accordion, tabs, progress_bar, icon_list, and 24 more)
- Enhanced: Pattern registry now supports pattern inheritance
- Enhanced: Confidence scoring system for better pattern matching
- Added: wp scw-headless detectors command to list detector modules
2.0.5 – December 26, 2025
- Added: Pattern detection system with XPath-based queries
- Added: HTML Normalizer with configurable cleanup strategies
- Added: wp scw-headless analyze command for pattern debugging
- Enhanced: CLI commands now support --verbose flag for detailed output
2.0.4 – December 22, 2025
- Added: Pluggable CMS target architecture with TargetRegistry
- Added: wp scw-headless targets command
- Enhanced: Convert command now uses --cms=<target> flag
- Refactored: Sanity-specific code moved to Target/Sanity/ namespace
2.0.3 – December 16, 2025
- Added: Admin dashboard with pattern statistics
- Added: Real-time cache file scanning
- Enhanced: WP-CLI output formatting with color codes
2.0.2 – December 1, 2025
- Enhanced: Pattern priority system for nested block handling
- Fixed: Pattern detection order respects priority values
- Added: Detection statistics with confidence distribution
2.0.1 – November 15, 2025
- Refactored: Complete namespace migration from STCWSC_* to STCW\Headless\*
- Added: PSR-4 autoloader for WordPress naming conventions
- Fixed: CLI namespace changed from wp scw-sanity to wp scw-headless
- Enhanced: Plugin renamed to “Static Cache Wrangler – Headless Assistant”
2.0.0 – November 1, 2025
- Major: Complete architectural refactor to pluggable system
- Breaking: CLI commands changed (backward compatibility via aliases)
- Breaking: Namespace changed to trademark-safe naming
- Added: Detector module system
- Added: Pattern registry with 12 Gutenberg patterns
- Added: HTML normalizer engine
- Enhanced: Sanity export generates complete schemas
1.0.0 – October 1, 2025
- Initial proof-of-concept release
- Basic Sanity conversion support
- Simple block detection
- CLI commands: info, scan, convert







