Capture books from Kindle, Apple Books, or PDF — then OCR and generate structured Markdown.
A Claude Code plugin that captures book pages as screenshots, extracts text via OCR, and generates thematically organized Markdown documents. Built for Obsidian knowledge vaults but works with any Markdown-based system.
- Captures every page of a book as screenshots (Mac Kindle, Apple Books, Kindle Cloud Reader, or PDF)
- Extracts text via macOS Vision OCR + Claude Code agents for low-confidence pages
- Generates 8-14 thematically organized Markdown files with rich formatting (tables, blockquotes, cross-references)
- Creates a hub file with frontmatter and wikilinks to all topic files
The entire pipeline runs locally with no external API keys — OCR and content generation use Claude Code's built-in capabilities.
| Platform | How It Works | Best For |
|---|---|---|
| Mac Kindle | CGWindowList + screencapture + Page Down | Kindle purchases |
| Apple Books | CGWindowList + screencapture + arrow keys | Apple Books purchases |
| Kindle Cloud Reader | Playwright browser automation | When desktop app unavailable |
| Poppler pdftoppm conversion | Scanned/image-based PDFs |
In Claude Code, run:
/plugins
- Navigate to the Marketplaces tab
- Select Add marketplace and enter:
masterleopold/book-capture - Navigate to the Discover tab
- Find book-capture and install it
After installation, the commands are available in every Claude Code session.
| Scope | Effect |
|---|---|
| user (default) | Available in all your projects |
| project | Shared with your team via .claude/settings.json |
git clone https://github.com/masterleopold/book-capture.git
claude --plugin-dir ./book-captureAfter installing, run the setup script to install Node.js dependencies and compile the Vision OCR binary:
bash ~/.claude/plugins/cache/book-capture-marketplace/book-capture/*/scripts/setup.shOr the plugin will auto-detect and prompt you on first use.
- macOS (required for screencapture and Vision OCR)
- Claude Code CLI installed and authenticated
- Node.js 20+
- Xcode Command Line Tools (
xcode-select --install) - Accessibility permission for Terminal/Claude Code (System Settings > Privacy & Security > Accessibility)
- Poppler (PDF only):
brew install poppler
| Command | Platform | Description |
|---|---|---|
/book-capture:kindle |
Mac Kindle | Capture from Amazon Kindle desktop app |
/book-capture:books |
Apple Books | Capture from Apple Books app |
/book-capture:cloud |
Kindle Cloud Reader | Capture via browser (Playwright) |
/book-capture:pdf |
PDF file | Capture from scanned/image-based PDF |
| Command | Description |
|---|---|
/book-capture:capture |
Full pipeline with platform selection prompt |
/book-capture:ocr |
OCR only on existing page captures |
/book-capture:generate |
Markdown generation from existing OCR text |
/book-capture:kindle B0883TQ3ZN
Claude Code will:
- Ask for book title, author, category, and location
- Remind you to open the book in Kindle to the first page
- Capture all pages (auto-stops at end of book)
- Run Vision OCR + agent re-reading for low-confidence pages
- Analyze content and create 8-14 thematic topic files
- Generate a hub file with wikilinks
Each platform uses macOS-native tools:
- CGWindowList (via inline Swift) to find the app window ID
- screencapture to capture individual window frames
- AppleScript to control page navigation (Page Down for Kindle, arrow keys for Books)
- Duplicate detection to auto-stop at end of book (3 consecutive identical pages)
- macOS Vision OCR (fast, local) processes all pages with confidence scoring
- Pages below the confidence threshold are re-read by Claude Code agents using multimodal image reading
- Results are merged into
raw_text.json
Note: macOS Vision cannot read vertical Japanese text (tategaki). For vertical text books, all pages are re-read by agents.
- Claude Code analyzes the full text and identifies 8-14 thematic categories (organized by information type, not original chapter order)
- Parallel agents generate detailed topic files (300-600 lines each) with:
- Genre-specific structure (business, technical, humanities, science, narrative)
- Tables, blockquotes, bold key terms, cross-references
[[wikilinks]]between sibling topics
- A hub file is created with frontmatter and links to all topics
Books/entries/BookTitle.md # Hub file with frontmatter
Knowledge/Category/BookTitle/
01_Theme_Name.md # Topic file (300-600 lines)
02_Theme_Name.md
...
10_Theme_Name.md
---
tags:
- source/book
- type/framework
- theme/fundraising
Category: "Startup"
Rating: ""
author: "Author Name"
Location: "Knowledge/Startup/BookTitle"
Chapters: 10
Language: "EN"
URL: "https://www.amazon.co.jp/dp/B0883TQ3ZN"
---
# Book Title
Book summary (2-3 sentences).
## Topics
- [[01_Theme Name]] - Description
- [[02_Theme Name]] - Description
...Per-project settings via .claude/book-capture.local.md:
---
vault_root: /path/to/obsidian/vault
captures_dir: Books/files/book-captures
entries_dir: Books/entries
default_source: kindle
default_language: JP
default_dpi: 200
---If no settings file exists, the plugin auto-detects Obsidian vaults by looking for .obsidian/ in the current directory or parents.
| Issue | Solution |
|---|---|
| Kindle window not found | Ensure book is open and visible in Amazon Kindle app |
| Accessibility denied | System Settings > Privacy & Security > Accessibility > add Terminal |
| Vision OCR compilation fails | xcode-select --install |
| Vision OCR misses vertical text | Expected for tategaki — agents handle these pages |
| Pages don't advance | Kindle uses Page Down key; restart the app if stuck |
| pdftoppm not found | brew install poppler |
| PDF is encrypted | qpdf --decrypt input.pdf output.pdf |
| npm packages not installed | Run scripts/setup.sh |
book-capture/
.claude-plugin/
plugin.json # Plugin manifest
marketplace.json # Marketplace metadata
commands/ # 7 slash commands
kindle.md # /book-capture:kindle
books.md # /book-capture:books
cloud.md # /book-capture:cloud
pdf.md # /book-capture:pdf
capture.md # /book-capture:capture (generic)
ocr.md # /book-capture:ocr
generate.md # /book-capture:generate
agents/
ocr-reader.md # Multimodal OCR re-reader
content-writer.md # Thematic content generator
skills/
book-capture/
SKILL.md # Auto-activating skill
scripts/
capture-kindle-mac.mjs
capture-books-app.mjs
kindle-capture.mjs # Playwright Cloud Reader
capture-pdf.mjs
extract-text.mjs # Vision OCR
generate-markdown.mjs # Direct API fallback
book-capture-utils.mjs
vision-ocr.swift # macOS Vision CLI
setup.sh # Dependency installer
package.json
templates/
settings-template.md