sambt/ankigen

AnkiGen

Generate high-quality Anki flashcards from PDFs and web content using Claude.

Built for studying statistics, math, and machine learning — with full LaTeX/MathJax support.

Setup

# Install dependencies
uv sync

# Install AnkiConnect in Anki desktop:
# Tools > Add-ons > Get Add-ons > code 2055492159 > Restart Anki

Anki desktop must be running for direct card import. Without it, you can still export .apkg files.
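Whether direct import is possible can be checked before a run. The sketch below is not part of AnkiGen; it assumes AnkiConnect is listening on its default address (http://127.0.0.1:8765) and sends the add-on's `version` action. The function name `anki_connect_available` is made up for illustration.

```python
import http.client
import json
import urllib.request

# Default AnkiConnect address; adjust if the add-on's config was changed.
ANKI_CONNECT_URL = "http://127.0.0.1:8765"


def anki_connect_available(url: str = ANKI_CONNECT_URL, timeout: float = 2.0) -> bool:
    """Return True if an AnkiConnect server answers a `version` request."""
    payload = json.dumps({"action": "version", "version": 6}).encode("utf-8")
    request = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            reply = json.load(response)
    except (OSError, ValueError, http.client.HTTPException):
        # Connection refused, timeout, or a non-JSON reply: treat as unavailable.
        return False
    return isinstance(reply, dict) and reply.get("error") is None
```

A wrapper script could use the result to choose between direct import and `--export output.apkg`.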

Environment variables

# For Mistral OCR extraction (get key at console.mistral.ai)
export MISTRAL_API_KEY="your-key-here"

# For Gemini extraction or OpenRouter LLM generation (get key at openrouter.ai)
export OPENROUTER_KEY="your-key-here"

# For Claude API generation (get key at console.anthropic.com)
export ANTHROPIC_API_KEY="your-key-here"

Extraction backends

AnkiGen supports three PDF extraction backends:

| Backend | How it works | Env var | Cost |
| --- | --- | --- | --- |
| marker (default) | Runs locally via marker-pdf ML models | None | Free (uses your CPU/GPU) |
| mistral | Mistral OCR API, best math extraction | MISTRAL_API_KEY | ~$1-2 / 1K pages |
| gemini | Gemini 2.5 Flash via OpenRouter, cheapest | OPENROUTER_KEY | ~$0.13 / 1K pages |

Generation backends

AnkiGen supports three LLM backends for generating flashcards:

| Backend | How it works | Env var | Notes |
| --- | --- | --- | --- |
| claude-code (default) | Invokes claude -p CLI | None (uses Claude Code auth) | No extra setup needed |
| claude-api | Anthropic Messages API directly | ANTHROPIC_API_KEY | More reliable structured output |
| openrouter | Any model via OpenRouter | OPENROUTER_KEY | Use any model (Claude, Gemini, Llama, etc.) |

Default models:

  • claude-api: claude-sonnet-4-20250514
  • openrouter: anthropic/claude-sonnet-4

Override with --model (e.g., --model google/gemini-2.5-pro-preview).
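The key requirements for the extraction and generation backends can be checked up front. This is a minimal sketch, not AnkiGen's own code: the backend names and environment variables come from the backend descriptions in this README, while `missing_keys` and the two lookup dictionaries are hypothetical helpers.

```python
from typing import Mapping

# Key requirements per backend; None means no API key is needed.
EXTRACTOR_KEYS = {"marker": None, "mistral": "MISTRAL_API_KEY", "gemini": "OPENROUTER_KEY"}
GENERATOR_KEYS = {
    "claude-code": None,
    "claude-api": "ANTHROPIC_API_KEY",
    "openrouter": "OPENROUTER_KEY",
}


def missing_keys(extractor: str, generator: str, env: Mapping[str, str]) -> list[str]:
    """List the environment variables a backend pair needs but `env` lacks."""
    needed = {EXTRACTOR_KEYS[extractor], GENERATOR_KEYS[generator]} - {None}
    return sorted(name for name in needed if not env.get(name))
```

For example, `missing_keys("mistral", "openrouter", os.environ)` (after `import os`) returns the names that still need to be exported before running `ankigen -x mistral -g openrouter`.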

Generation modes

Control what types of cards are generated:

| Mode | Description |
| --- | --- |
| comprehensive (default) | Thorough coverage: definitions, formulas, intuition, applications |
| core | Key concepts only: fewer, higher-impact cards |
| formulas | Formulas and equations only: heavy on cloze math cards |
| conceptual | Deep understanding: intuition, reasoning, connections (minimal formulas) |
| exam | Exam prep: definitions, formulas, gotchas, proof sketches |

Output types

AnkiGen can generate flashcards, a markdown study guide, or both from the same extraction pass.

| Output type | Description |
| --- | --- |
| flashcards (default) | Existing flashcard workflow with optional review/import/export |
| study-guide | Generate a markdown study guide only |
| both | Generate flashcards and a markdown study guide |

Web GUI

Run ankigen with no arguments to launch a browser-based interface for the full pipeline:

uv run ankigen
# or explicitly:
uv run ankigen --gui

The GUI lets you:

  • Upload a PDF, enter a file path, or paste a URL
  • Choose what to generate — flashcards, study guide, or both
  • Configure all settings — card generation mode, study guide mode, deck name, custom instructions, duplicate avoidance
  • Pick models for each stage — extraction backend (marker / Mistral OCR / Gemini Flash) and generation backend (Claude Code / Claude API / OpenRouter) with optional model override
  • Set a page range for large textbooks
  • Review flashcards inline — navigate, edit, approve/reject, then submit to Anki or export .apkg
  • View and download the study guide as markdown

All the same options available in the CLI are exposed in the GUI.

CLI Usage

# Generate cards from a PDF — opens review UI in your browser
uv run ankigen paper.pdf

# From a URL
uv run ankigen https://some-blog.com/post

# Target a specific deck
uv run ankigen paper.pdf -d "Stats::Bayesian"

# Extract only specific pages (1-indexed, matching your PDF viewer)
uv run ankigen textbook.pdf --pages 180-210 -d "Stats::Ch8"

# Later, do the next chapter — only those pages are extracted
uv run ankigen textbook.pdf --pages 211-240 -d "Stats::Ch9"

# Combine page ranges
uv run ankigen textbook.pdf --pages 1-5,20-30

# Give Claude custom instructions
uv run ankigen paper.pdf -i "focus only on chapter 8"
uv run ankigen paper.pdf -i "emphasize the key theorems and their proof intuitions"

# Avoid duplicates with cards already in the deck
uv run ankigen paper.pdf -d "ML" --no-duplicates

# Skip the review UI — add all cards directly
uv run ankigen paper.pdf -d "ML" --no-review

# Export to .apkg file (no running Anki needed)
uv run ankigen paper.pdf --export output.apkg

# Save the generated JSON for later review
uv run ankigen paper.pdf -o cards.json

# Use Mistral OCR for extraction (fast, great math quality)
uv run ankigen paper.pdf --extractor mistral -d "Stats"

# Use Gemini Flash via OpenRouter (cheapest option)
uv run ankigen paper.pdf --extractor gemini --pages 1-30

# Short form
uv run ankigen paper.pdf -x mistral -p 180-210 -d "Stats::Ch8"

# Use Claude API directly instead of Claude Code CLI
uv run ankigen paper.pdf -g claude-api

# Use any model via OpenRouter
uv run ankigen paper.pdf -g openrouter
uv run ankigen paper.pdf -g openrouter -m "google/gemini-2.5-pro-preview"
uv run ankigen paper.pdf -g openrouter -m "meta-llama/llama-4-maverick"

# Combine: Mistral OCR extraction + Gemini generation via OpenRouter
uv run ankigen paper.pdf -x mistral -g openrouter -m "google/gemini-2.5-pro-preview" -p 1-30

# Generation modes
uv run ankigen paper.pdf --mode core            # key concepts only
uv run ankigen paper.pdf --mode formulas        # equations and math only
uv run ankigen paper.pdf --mode conceptual      # deep understanding, minimal formulas
uv run ankigen paper.pdf --mode exam            # exam prep with gotchas and proof sketches

# Combine mode with other options
uv run ankigen textbook.pdf -p 180-210 --mode formulas -d "Stats::Formulas"
uv run ankigen paper.pdf --mode exam -i "focus on hypothesis testing"

# Generate a markdown study guide instead of flashcards
uv run ankigen paper.pdf --output-type study-guide --output-markdown guide.md

# Generate both flashcards and a study guide from one extraction pass
uv run ankigen paper.pdf --output-type both --study-guide-mode cheatsheet

# Study guide modes
uv run ankigen paper.pdf --output-type study-guide --study-guide-mode overview
uv run ankigen paper.pdf --output-type study-guide --study-guide-mode core-concepts
uv run ankigen paper.pdf --output-type study-guide --study-guide-mode exam-summary

Options

| Flag | Description |
| --- | --- |
| source | PDF file path or URL (optional; omit to launch web GUI) |
| --gui | Launch the web GUI explicitly |
| -x, --extractor | PDF extraction backend: marker (default), mistral, or gemini |
| -p, --pages RANGE | Page range to extract, 1-indexed (e.g., 180-210, 1-5,20-30) |
| -g, --generator | LLM backend: claude-code (default), claude-api, or openrouter |
| -m, --model | Model override for claude-api / openrouter (e.g., google/gemini-2.5-pro-preview) |
| --mode | Generation mode: comprehensive (default), core, formulas, conceptual, exam |
| --output-type | Output type: flashcards (default), study-guide, or both |
| --study-guide-mode | Study guide mode: overview (default), cheatsheet, core-concepts, exam-summary |
| --output-markdown PATH | Save generated study guide markdown to a specific path |
| -i, --instructions | Custom instructions for the LLM (e.g., "focus on chapter 8") |
| -d, --deck | Target Anki deck name (default: "Default") |
| --no-review | Skip the Streamlit review UI, add cards directly |
| -e, --export PATH | Export to .apkg file instead of AnkiConnect |
| --no-duplicates | Fetch existing deck cards and tell Claude to avoid duplicates |
| --no-cache | Force re-extraction even if a cached version exists |
| -o, --output-json PATH | Save generated cards JSON to a specific path |
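The `--pages` syntax (comma-separated, 1-indexed, inclusive ranges) can be expanded as sketched below. This is an illustration only, not AnkiGen's parser: `parse_pages` and `to_zero_indexed` are hypothetical names, and accepting a single page such as `7` is an assumption the README does not confirm.

```python
def parse_pages(spec: str) -> list[int]:
    """Expand a spec such as "1-5,20-30" into sorted, unique 1-indexed page numbers."""
    pages: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            raise ValueError(f"empty range in {spec!r}")
        start_text, sep, end_text = part.partition("-")
        start = int(start_text)
        # A part without "-" is treated as a single page (assumption).
        end = int(end_text) if sep else start
        if start < 1 or end < start:
            raise ValueError(f"invalid page range {part!r}")
        pages.update(range(start, end + 1))
    return sorted(pages)


def to_zero_indexed(pages: list[int]) -> list[int]:
    """Convert 1-indexed page numbers to the 0-indexed form many PDF libraries use."""
    return [page - 1 for page in pages]
```

With this sketch, `parse_pages("1-3,7")` gives `[1, 2, 3, 7]`, and `parse_pages("180-210")` covers 31 pages because both ends are inclusive.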

Review UI

The Streamlit review interface lets you:

  • Navigate cards one at a time with prev/next buttons or a slider
  • Preview card front/back with rendered LaTeX math
  • Edit card text, including math notation
  • Approve/reject individual cards
  • Bulk approve or reject all cards
  • Edit tags per card
  • Select a deck from your Anki collection
  • Submit approved cards to Anki or export as .apkg

Caching

Extracted content is cached in ~/.ankigen/cache/ so repeated runs against the same source skip the slow PDF extraction step. Each unique combination of file + page range gets its own cache entry.

  • Re-running with different --instructions reuses the cached extraction; a new --pages range is extracted once and then cached as its own entry
  • The cache auto-invalidates if the source file is modified
  • Use --no-cache to force re-extraction
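One way to get this behavior is to derive the cache key from the source file's identity, its modification time, and the page range. The sketch below shows that idea under stated assumptions; the actual key format and file layout inside ~/.ankigen/cache/ are not documented here, so `cache_key`, `cache_path`, and the `.md` suffix are hypothetical.

```python
import hashlib
from pathlib import Path
from typing import Optional


def cache_key(source: Path, pages: Optional[str] = None) -> str:
    """Hash the file's path, modification time, size, and page range into one key."""
    stat = source.stat()
    parts = [str(source.resolve()), str(stat.st_mtime_ns), str(stat.st_size), pages or "all"]
    return hashlib.sha256("\n".join(parts).encode("utf-8")).hexdigest()


def cache_path(source: Path, pages: Optional[str] = None) -> Path:
    """Hypothetical location of a cached extraction under ~/.ankigen/cache/."""
    return Path.home() / ".ankigen" / "cache" / f"{cache_key(source, pages)}.md"
```

Because the modification time and size feed into the hash, editing the source file yields a new key and the stale entry is simply never looked up again. Custom instructions are left out of the key, so changing them does not trigger a re-extraction.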

Pipeline

Source (PDF/URL)
  → Content extraction (marker-pdf / Mistral OCR / Gemini Flash / trafilatura) — cached
  → Flashcard generation and/or study guide generation (Claude Code CLI / Claude API / OpenRouter)
  → Review UI (Streamlit, flashcards only)
  → Anki import (AnkiConnect / .apkg export, flashcards only)
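The stages that run depend on the output type, since review and import apply to flashcards only. As a rough sketch (the stage names and the `stages_for` helper are invented for illustration and do not mirror AnkiGen's internals):

```python
def stages_for(output_type: str, review: bool = True) -> list[str]:
    """Return the pipeline stages that would run for a given --output-type value."""
    if output_type not in {"flashcards", "study-guide", "both"}:
        raise ValueError(f"unknown output type: {output_type!r}")
    stages = ["extract"]
    if output_type in {"flashcards", "both"}:
        stages.append("generate-flashcards")
    if output_type in {"study-guide", "both"}:
        stages.append("generate-study-guide")
    if output_type in {"flashcards", "both"}:
        # Review and import apply to flashcards only; --no-review skips the review step.
        if review:
            stages.append("review")
        stages.append("import-or-export")
    return stages
```

For `study-guide`, only extraction and study guide generation run; for `both`, a single extraction feeds both generators.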

Card Types

AnkiGen produces a mix of:

  • Basic (front/back) — definitions, intuition, comparisons, application
  • Cloze (fill-in-the-blank) — formulas, key relationships, theorem statements

Math is formatted with Anki MathJax delimiters: \(...\) for inline, \[...\] for display.
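To make the delimiters concrete, here is a sketch of an AnkiConnect `addNote` request for a Basic card that uses both forms. It assumes Anki's stock English note type ("Basic" with "Front" and "Back" fields); the `basic_note` helper, the deck name, and the card text are illustrative, and this README does not show how AnkiGen itself builds notes.

```python
import json


def basic_note(deck: str, front: str, back: str, tags: list[str]) -> dict:
    """Build an AnkiConnect `addNote` request for a Basic (front/back) card."""
    return {
        "action": "addNote",
        "version": 6,
        "params": {
            "note": {
                "deckName": deck,
                "modelName": "Basic",
                "fields": {"Front": front, "Back": back},
                "tags": tags,
            }
        },
    }


# Raw strings keep the backslashes in the MathJax delimiters intact.
request = basic_note(
    deck="Stats::Bayesian",
    front=r"State Bayes' theorem for \(P(A \mid B)\).",
    back=r"\[P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}\]",
    tags=["bayes"],
)
body = json.dumps(request)  # JSON body to POST to AnkiConnect
```

Note that `json.dumps` escapes each backslash, so the delimiters appear as `\\(` in the serialized body and are restored when AnkiConnect parses it.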

Supported Sources

  • .pdf: academic papers, textbooks (extracted via marker-pdf by default, with LaTeX preservation)
  • .txt, .md: plain text and markdown files
  • URLs: blog posts, articles (extracted via trafilatura)

Study Guides

Study guides are generated as markdown files. The first implementation is text-only and reuses the extracted source content from the flashcard pipeline. It is structured so that richer document assets can be added later without changing the default flashcard behavior.

About

i got tired of making flash cards by hand
