Extract content from closed web platforms into WordPress-compatible WXR files.
Closed platforms make it hard to leave. Wix has no HTML export and caps RSS at 20 posts. JavaScript-rendered content and limited APIs leave your site data locked inside.
This tool extracts all content from closed platforms — posts, pages, media, navigation, redirects, products — and produces a standard WordPress WXR file ready to import.
Where to host WordPress: If your current provider also offers WordPress, you can move to WordPress and stay with them. WordPress.com is another option: the $4/mo Personal plan now supports plugins and themes, and the WordPress.com MCP integration gives AI agents direct write access.
| Platform | Status | Prompt |
|---|---|---|
| GoDaddy Websites & Marketing (pages/blog) | Ready | prompts/godaddy-wm.md |
| Hostinger Website Builder (blog/pages/products) | Ready | — |
| HubSpot | Ready | — |
| Shopify (blog/pages/products) | Ready | prompts/shopify.md |
| Squarespace | Ready | prompts/squarespace.md |
| Webflow | Ready | prompts/webflow.md |
| Weebly (blog/pages/products) | Ready | — |
| Wix | Ready | prompts/wix.md |
| Any other website (generic fallback) | Best-effort | — |
All eight platforms have MCP adapters with full extraction support, including products exported as a WooCommerce-compatible CSV. The one exception is GoDaddy Websites & Marketing, which is pages + blog only in v1; GoDaddy Online Store (OLS) product support is planned for v1.1. Sites matching none of the eight fall back to a generic default adapter that renders each page in a headless browser and extracts the main content, media, and any JSON-LD products. This is best-effort, since the adapter can't key off platform-specific markup.
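To make the JSON-LD step of the generic fallback concrete, here is a minimal sketch of pulling schema.org `Product` nodes out of rendered HTML. It is not the project's adapter code: the function names, the `JsonLdNode` type, and the regex-based script matching are illustrative assumptions, and a real adapter would more likely read the script tags from the headless browser's DOM.

```typescript
// Illustrative sketch only; names and types are hypothetical, not from the project.
interface JsonLdNode {
  "@type"?: string | string[];
  "@graph"?: JsonLdNode[];
  [key: string]: unknown;
}

// Matches <script type="application/ld+json"> blocks and captures their body.
const JSON_LD_SCRIPT =
  /<script[^>]*type=["']application\/ld\+json["'][^>]*>([\s\S]*?)<\/script>/gi;

function isProductNode(node: JsonLdNode): boolean {
  const type = node["@type"];
  return Array.isArray(type) ? type.includes("Product") : type === "Product";
}

function extractJsonLdProducts(html: string): JsonLdNode[] {
  const products: JsonLdNode[] = [];
  for (const match of html.matchAll(JSON_LD_SCRIPT)) {
    let parsed: unknown;
    try {
      parsed = JSON.parse(match[1]);
    } catch {
      continue; // skip malformed blocks rather than failing the whole page
    }
    // A block may hold one node, an array of nodes, or an @graph wrapper.
    const roots = (Array.isArray(parsed) ? parsed : [parsed]) as JsonLdNode[];
    for (const root of roots) {
      if (root === null || typeof root !== "object") continue;
      const nodes = Array.isArray(root["@graph"]) ? root["@graph"] : [root];
      for (const node of nodes) {
        if (node !== null && typeof node === "object" && isProductNode(node)) {
          products.push(node);
        }
      }
    }
  }
  return products;
}
```

Malformed blocks are skipped rather than thrown on, which matches the best-effort nature of the fallback: one broken script tag should not cost the rest of the page.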
data-liberation-agent is built to be driven by an AI agent. The front door is the liberate skill: one command runs the full pipeline — detect the platform, inventory every page/post/product, extract content and media, capture screenshots and design tokens, then reconstruct the site as an editable WordPress block theme and import it into a local WordPress preview.
Studio required for preview/import. Install Automattic Studio before running /data-liberation:liberate. Extraction itself needs no WordPress.
Install from the marketplace:
claude plugin marketplace add Automattic/data-liberation-agent
claude plugin install data-liberation@data-liberation
Or from a local checkout (for development on the plugin itself):
cd data-liberation-agent
claude plugin marketplace add .
claude plugin install data-liberation@data-liberation
Then, in Claude Code:
/liberate https://your-site.com
What you'll see: the agent detects the platform, inventories all pages/posts/products, pauses to confirm scope and estimated time, then extracts content and media. It then drives the design phase — clustering page layouts, building a responsive block theme that mirrors your source site's structure and visual style, and importing everything into Automattic Studio. When it finishes you get a local preview URL and a run-report.json summarizing what was built, what's faithful, and any gaps.
The result is a responsive, editable WordPress block theme — not a static copy.
cd data-liberation-agent
codex
The .codex-plugin/plugin.json and .mcp.json register the MCP server and skills automatically. The liberate flow runs sequentially on Codex (the builder fan-out step degrades to a sequential loop).
Then in Codex:
$liberate https://your-site.com
cd data-liberation-agent
gemini extension link .
Run the MCP server over stdio:
npx tsx src/mcp-server.ts
# or
npm run mcp
It exposes 35 tools. The ones you'll call directly for a deterministic extract → QA → import flow:
liberate_detect, liberate_discover, liberate_inspect, liberate_extract, liberate_screenshot, liberate_status, liberate_qa, liberate_verify, liberate_setup, and liberate_import — plus liberate_paths (resolve the output directory) and liberate_probe / liberate_map_apis (browser-based diagnostics). The remaining tools drive the design/reconstruction phase and are orchestrated by the skills rather than called by hand. Full reference with parameters: docs/mcp.md.
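As a rough illustration of driving these tools by hand, the sketch below runs a minimal detect → extract → QA → import sequence and stops at the first failure. It is written against an injected `ToolCaller` function rather than a specific client library, so you would adapt it to whatever MCP client you use. The `url` argument name and the choice of these four steps are assumptions; `outputDir` is the parameter this README mentions, and `isError` is the error flag on MCP tool results. Check docs/mcp.md for the real parameters.

```typescript
// Hypothetical caller shape; wrap your MCP client's tool-call method to match it.
type ToolCaller = (
  name: string,
  args: Record<string, unknown>,
) => Promise<{ isError?: boolean }>;

// A minimal subset of the tools listed above, in pipeline order (assumed).
const FLOW_STEPS = [
  "liberate_detect",
  "liberate_extract",
  "liberate_qa",
  "liberate_import",
] as const;

async function runFlow(
  callTool: ToolCaller,
  url: string,
  outputDir?: string,
): Promise<string[]> {
  const completed: string[] = [];
  for (const name of FLOW_STEPS) {
    // `url` is an assumed argument name; confirm it against docs/mcp.md.
    const args: Record<string, unknown> = outputDir ? { url, outputDir } : { url };
    const result = await callTool(name, args);
    if (result.isError) {
      throw new Error(`${name} failed after: ${completed.join(", ") || "nothing"}`);
    }
    completed.push(name);
  }
  return completed;
}
```

Injecting the caller keeps the sequencing logic independent of any particular client, and makes it easy to dry-run the flow with a stub before pointing it at a real site.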
A successful run produces, in ~/Studio/_liberations/<host>/ (the default for the liberate flow; set the DLA_OUTPUT_DIR environment variable to change it, or pass outputDir when calling the MCP tools — liberate_paths reports the resolved path):
~/Studio/_liberations/<host>/
- output.wxr: WordPress eXtended RSS file, ready to import via WordPress Admin > Tools > Import
- media/: downloaded images and attachments with local paths rewritten in the WXR
- redirect-map.json: old platform paths mapped to new WordPress slugs
- extraction-log.jsonl: per-URL extraction log (atomic dedupe for resume)
- session.json: pipeline stage, captured opts, per-entity progress counters, and adapter pagination cursors
- media-stubs.json: per-asset download status so permanently-broken URLs stop retrying across resume runs
- products.csv: WooCommerce-compatible product CSV (if the site has e-commerce)
- products.jsonl: raw product data streamed during extraction
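If you want to inspect extraction-log.jsonl yourself, a small JSONL reader is enough. The sketch below parses the file contents line by line and collects the URLs already logged, which is roughly what a resume-time dedupe check needs. The record schema is not documented here, so the `url` field name is an assumption, and both function names are hypothetical.

```typescript
// Assumed record shape: only `url` is read, and that field name is a guess.
interface LogEntry {
  url?: string;
  [key: string]: unknown;
}

function parseJsonl(text: string): LogEntry[] {
  const entries: LogEntry[] = [];
  for (const line of text.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "") continue;
    try {
      const value: unknown = JSON.parse(trimmed);
      // Keep only JSON objects; scalars and null are not log records.
      if (value !== null && typeof value === "object" && !Array.isArray(value)) {
        entries.push(value as LogEntry);
      }
    } catch {
      // An interrupted run can leave a partial last line; skip it.
    }
  }
  return entries;
}

function loggedUrls(entries: LogEntry[]): Set<string> {
  const seen = new Set<string>();
  for (const entry of entries) {
    if (typeof entry.url === "string") seen.add(entry.url);
  }
  return seen;
}
```

Pass it the file contents as a string, for example from Node's `readFileSync(path, "utf8")`. The same parser works for products.jsonl, since JSONL is simply one JSON value per line.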
The liberate flow captures, for every URL, full-page + scrolled-state screenshots (desktop 1440×900 and mobile 390×844), the rendered HTML, and site-wide design tokens — used by the reconstruction phase and handy for feeding AI design-system tools. Via raw MCP this is the liberate_screenshot tool (or screenshots: true on liberate_extract).
Artifacts land under the output directory:
- screenshots/{desktop,mobile}/<slug>.png (plus .scrolled.png post-scroll variants)
- html/<slug>.html: rendered HTML per URL
- screenshots/manifest.json: the URL → files join table
- palette.json, typography.json, breakpoints.json: aggregated per-site design tokens
The join back to output.wxr and products.jsonl happens on the filesystem via manifest.json, keyed by URL — nothing is written into WordPress postmeta.
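A URL-keyed join of this kind could look like the sketch below, which attaches captured files to product records. Both record shapes are assumptions made for illustration: the real manifest.json and products.jsonl schemas are not described in this README, so treat `ManifestEntry`, `ProductRecord`, and their field names as hypothetical.

```typescript
// Both shapes are illustrative assumptions, not the tool's documented schema.
interface ManifestEntry {
  url: string;
  files: string[];
}

interface ProductRecord {
  url: string;
  [key: string]: unknown;
}

function joinByUrl(
  manifest: ManifestEntry[],
  products: ProductRecord[],
): Array<ProductRecord & { files: string[] }> {
  // Index the manifest once so each product lookup is O(1).
  const filesByUrl = new Map<string, string[]>();
  for (const entry of manifest) filesByUrl.set(entry.url, entry.files);
  // Products with no captured files get an empty list instead of being dropped.
  return products.map((product) => ({
    ...product,
    files: filesByUrl.get(product.url) ?? [],
  }));
}
```

Because the join is a lookup on exact URL strings, any normalization (trailing slashes, query strings) would need to be applied to both sides before matching.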
- How it works
- AI agent commands
- AI skills
- MCP server tools
- Wix authenticated content endpoints — reference of the ten load-bearing content endpoints behind Wix's editor / dashboard auth
- WordPress Data Liberation project — the official effort
- WordPress.com MCP — AI agent write access to WordPress.com
Preview and import require Automattic Studio — install the app first (the studio CLI ships with it). Studio sites are persistent and named after the output directory's domain slug (example-com, example-com-2 on collision).
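The naming rule can be sketched as follows. This is a guess at the behavior based only on the two examples above (example-com, then example-com-2): the exact slug rules, and what happens after a second collision, are assumptions, and both function names are hypothetical.

```typescript
// Assumed rule: lowercase, runs of non-alphanumerics become single hyphens.
function domainSlug(host: string): string {
  return host
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

// Assumed collision rule: append -2, -3, ... until the name is free.
function uniqueSiteName(host: string, existing: ReadonlySet<string>): string {
  const base = domainSlug(host);
  if (!existing.has(base)) return base;
  for (let n = 2; ; n++) {
    const candidate = `${base}-${n}`;
    if (!existing.has(candidate)) return candidate;
  }
}
```

For example, `uniqueSiteName("example.com", new Set(["example-com"]))` yields `example-com-2` under these assumptions.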
"Studio not found" — the studio CLI is not on PATH. Install Studio from https://developer.wordpress.com/studio/ and relaunch the terminal so the PATH update takes effect.
"Studio create-site fails" — out of disk, port conflict, or Studio config corruption. The error message includes the underlying CLI output. If it's a port conflict, retry. If the Studio config is corrupt, reinstalling Studio fixes it.
Preview is not a secure environment. Studio sites auto-log in as admin/password and bind to localhost. Do not paste secrets into them.