Add Substack migration support #1
Open
joeboydston wants to merge 1 commit into
New platform extractor for Substack publications. Uses Substack's undocumented public API for content discovery and extraction, so no browser is needed. Supports dual-source extraction: the API for metadata and free content, and the CSV export for full paid post content. Images are unwrapped from Substack's CDN proxy URLs to download the originals.

Also adds a `--self-hosted` flag to `import.js` for non-WordPress.com sites, fixes media upload `Content-Type` handling, and improves image URL replacement to handle CDN wrapper URLs.

Tested end-to-end against coloradomedia.substack.com → Atomic staging site.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
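The CDN unwrapping mentioned in the description could look roughly like the sketch below. It is not the code from this PR: the helper name `unwrapSubstackCdnUrl` is hypothetical, and it assumes wrapper URLs have the shape `https://substackcdn.com/image/fetch/<transforms>/<percent-encoded original URL>`, which is an observed pattern rather than a documented contract.

```javascript
// Hypothetical helper, not the PR's actual implementation.
// Assumed wrapper shape (undocumented, may change):
//   https://substackcdn.com/image/fetch/<transforms>/<percent-encoded original URL>
function unwrapSubstackCdnUrl(url) {
  const marker = '/image/fetch/';
  const start = url.indexOf(marker);
  if (!url.includes('substackcdn.com') || start === -1) {
    return url; // not a wrapper URL; leave it untouched
  }
  const rest = url.slice(start + marker.length);
  // The original URL starts at the first "http(s)://" (plain or percent-encoded)
  // that follows the transform segment.
  const match = rest.match(/https?(:|%3A)(\/\/|%2F%2F).*/i);
  if (!match) return url;
  try {
    return decodeURIComponent(match[0]);
  } catch {
    return url; // malformed percent-encoding; fall back to the wrapper URL
  }
}
```

URLs that do not match the assumed shape are returned unchanged, so the helper should be safe to apply to every image URL in a post before download or replacement.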
simison (Member) reviewed on Apr 3, 2026 and left a comment:
Really cool idea to add Substack!
| URL structure (`/p/slug`) | Generate redirect map; set WordPress permalinks to match if possible |
| Subtitles not in WordPress | Add as styled first paragraph or use a subtitle plugin |
| Subscriber migration | Separate CSV export; requires email service setup |
| Paid subscriber migration | Stripe account transfer needed; no automated path |
We have APIs for doing both paid and free subscriber imports to WP.com and Jetpack sites, including the Stripe account switch, so those should first be exposed via MCP and then connected here.
Cc @Automattic/loop team, who are looking after the subscriber importer.
### Substack
| Problem | Solution |
Some potential problems to solve, not necessarily in this PR:
- Videos and podcasts are not included in Substack exports and need to be scraped separately. Paid videos would need authentication, or maybe access to the site's media management?
- There are lots of "subscribe!" nudges in post content; these could be removed or replaced, e.g. with MailPoet forms or Jetpack Subscription blocks. I'd imagine AI would intuitively use button or link blocks otherwise, but there's no point if they aren't actually functional.
- Paywall markers in paid posts likely need guidance on which plugin/block to use; Jetpack Newsletters has a paywall block!
Nice work. We've done a lot of Substack migrations on the WP.com side -- a few things that always come up:
borkweb added a commit that referenced this pull request on Jun 5, 2026
# Conflicts: # src/mcp-server/handlers/reconstruct-pages.ts
Summary
- `scripts/substack/discover.js`: inventories a Substack publication via its undocumented public API (`/api/v1/archive`). No browser dependency.
- `scripts/substack/extract.js`: extracts all content via API, with a `--csv-export` flag for full paid post content from Substack's official export. Unwraps `substackcdn.com` CDN wrapper URLs to download original images.
- `prompts/substack.md`: user-facing migration prompt covering content, subscribers, paid tiers, and redirects.
- `scripts/import.js`: adds a `--self-hosted` flag for non-WordPress.com sites, fixes media upload `Content-Type` handling, improves image URL replacement to handle Substack CDN wrapper URLs. Wix import behavior unchanged.
- `AGENTS.md`, `README.md`, `DISCOVERIES.md`: updated with Substack documentation.

How I found it
Built and tested as a new platform contribution. Substack's public API endpoints (`/api/v1/archive`, `/api/v1/posts/<slug>`) return rich JSON without authentication, making extraction fast and browser-free.

Tested against
Discovery log entry added to DISCOVERIES.md
🤖 Generated with Claude Code
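
For readers unfamiliar with the browser-free discovery approach described in the summary, here is a minimal sketch of paging through `/api/v1/archive`. It is not the contents of `discover.js`: the function name `listPostSlugs`, the `offset` and `limit` query parameters, and the response shape (a JSON array of post objects with a `slug` field) are all assumptions about an undocumented API that may change without notice.

```javascript
// Sketch only. Assumptions (not confirmed by the PR or by Substack docs):
//   - the archive endpoint accepts `offset` and `limit` query parameters
//   - it responds with a JSON array of post objects, each with a `slug`
// `fetchImpl` is injectable so the function can be exercised without network access.
async function listPostSlugs(baseUrl, { limit = 25, fetchImpl = fetch } = {}) {
  const slugs = [];
  for (let offset = 0; ; offset += limit) {
    const url = `${baseUrl}/api/v1/archive?offset=${offset}&limit=${limit}`;
    const res = await fetchImpl(url);
    if (!res.ok) throw new Error(`Archive request failed: ${res.status}`);
    const page = await res.json();
    if (!Array.isArray(page) || page.length === 0) break;
    for (const post of page) slugs.push(post.slug);
    if (page.length < limit) break; // a short page means the archive is exhausted
  }
  return slugs;
}
```

Each returned slug could then be passed to `/api/v1/posts/<slug>` to fetch the full post JSON. Requires a runtime with a global `fetch` (for example Node 18 or later) unless `fetchImpl` is supplied.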