Releases: JordanCoin/openfoia
OpenFOIA v3.2.2
Updated pdf-extract binaries for all platforms.
OpenFOIA v3.2.1
Updated pdf-extract binaries for all platforms.
OpenFOIA v3.2.0 — DocumentCloud + Document Reader
What's New
DocumentCloud Integration
- Search 10M+ public documents via `openfoia records search --source documentcloud`
- Fetch full text into the local database: DocumentCloud already extracted it, so no OCR is needed
- Cross-reference entities against DocumentCloud alongside MuckRock, SEC, OpenCorporates, OpenSanctions, and ICIJ
- `records fetch` command: pull any document's text locally for entity extraction
- Uses the httpx REST API directly for speed (not the python-documentcloud library)
- Search highlights show where your terms appear in the document
- Idempotent fetch — re-running won't duplicate records
- API errors surfaced clearly (not masked as "no results")
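The idempotent fetch described above can be illustrated with a uniqueness constraint in SQLite. This is a minimal sketch, not OpenFOIA's actual schema: the `documents` table, its columns, and the `save_document` helper are hypothetical names chosen for the example.

```python
import sqlite3


def save_document(conn, source, source_id, text):
    """Insert a fetched document once; return True only if a new row was written."""
    cur = conn.execute(
        # INSERT OR IGNORE skips the row when (source, source_id) already exists.
        "INSERT OR IGNORE INTO documents (source, source_id, text) VALUES (?, ?, ?)",
        (source, source_id, text),
    )
    conn.commit()
    return cur.rowcount == 1


# Hypothetical schema: one row per (source, source_id) pair.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE documents (
        id INTEGER PRIMARY KEY,
        source TEXT NOT NULL,
        source_id TEXT NOT NULL,
        text TEXT,
        UNIQUE (source, source_id)
    )
    """
)

first = save_document(conn, "documentcloud", "12345", "full text")
second = save_document(conn, "documentcloud", "12345", "full text")
row_count = conn.execute("SELECT COUNT(*) FROM documents").fetchone()[0]
```

Re-running the same fetch leaves a single row, because the database itself rejects the duplicate rather than relying on a prior lookup.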
Interactive Document Reader
- Click any entity in the graph to see its source documents
- Open the document reader — full text with every entity highlighted inline
- Sidebar lists all entities found, sorted by frequency, color-coded by type
- Click an entity in the sidebar or text to jump to its occurrences
- "View original source" button links directly to DocumentCloud
- Source URLs on document cards for immediate verification
- Documents without text gracefully shown as unavailable
Graph Visualization Improvements
- Double-click to zoom smoothly into any node
- Selection highlighting — dims unconnected nodes, brightens direct connections
- Curved edges to reduce visual overlap
- Connected entities shown as clickable tags in the info panel
- Adaptive spacing — repulsion scales with node count
- Pointer cursor on hoverable nodes
- Escape key to close reader/deselect
Other Improvements
- Multi-layer MuckRock search (tags → agency ID → user filter)
- MSG email file support for ingest
- File type display in search results
- 13 security fixes from codex adversarial review
- Entity links FK constraint fix
- Graph HTML template extracted to `graph_template.py` (640 lines out of cli.py)
- New README screenshots showing the graph and document reader
- ruff format passing on all files
Data Sources (6)
| Source | Documents |
|---|---|
| MuckRock | 46k+ FOIA requests |
| DocumentCloud | 10M+ public documents |
| OpenCorporates | Global company records |
| SEC EDGAR | US corporate filings |
| OpenSanctions | Sanctions & PEP lists |
| ICIJ Offshore Leaks | Panama/Pandora Papers |
Install
curl -fsSL https://raw.githubusercontent.com/JordanCoin/openfoia/main/install.sh | bash
OpenFOIA v3.1.1
Updated pdf-extract binaries for all platforms.
OpenFOIA v3.1.0 — Security Hardening
Security Hardening (codex adversarial review)
All 13 findings from an independent adversarial code review have been addressed:
- Duress mode redesigned: no stored password hash, encrypted decoy profile, opaque filenames, SQLCipher as verifier
- Encrypted storage: errors instead of silently falling back to plaintext; .bak file securely deleted after encryption
- Cross-reference warning: shown before entity names are sent to external APIs
- Cloud AI warning: shown when document text would be sent to Anthropic/OpenAI
- LLM extraction: validates entities against source text (prompt injection mitigation)
- Install script: SHA256 checksum verification for pdf-extract binary
- Custom entity regex: now actually used in the regex fallback extraction path
- Ingest pipeline fixed: CLI now persists Document rows to database
- Agent fixed: DocumentIngester gets proper storage_path
- Portable mode fixed: config.py respects OPENFOIA_DATA_DIR
- CDN dependency: removed unused htmx, documented Tailwind tradeoff
- Purge messaging: honest about limitations (swap, journals, caches)
- Threat model: new docs/THREAT_MODEL.md with honest capabilities and limitations
Also includes pdf-extract v0.2.0 with PNG predictor fix for linearized government PDFs.
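The "validates entities against source text" item in the list above can be sketched as a simple containment check: any entity the model returns that does not literally appear in the document is dropped. This is an illustration under assumptions, not the project's implementation; `validate_entities`, the dict shape, and the whitespace/case normalization are all hypothetical.

```python
import re


def _normalize(s: str) -> str:
    """Collapse whitespace and ignore case so line breaks do not cause false rejects."""
    return re.sub(r"\s+", " ", s).strip().casefold()


def validate_entities(entities, source_text):
    """Keep only entities whose text actually occurs in the source document."""
    haystack = _normalize(source_text)
    kept = []
    for entity in entities:
        needle = _normalize(entity.get("text", ""))
        if needle and needle in haystack:
            kept.append(entity)
    return kept


source = "The contract was awarded to Acme Corp by the\nDepartment of Defense."
llm_output = [
    {"text": "Acme Corp", "type": "ORG"},
    {"text": "department of   defense", "type": "ORG"},
    # Not present in the document, e.g. text produced by an injected instruction.
    {"text": "Ignore previous instructions", "type": "PERSON"},
]
validated = validate_entities(llm_output, source)
```

A check like this cannot stop a model from mislabeling text that really is in the document, so it is a mitigation rather than a complete defense.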
OpenFOIA v0.2.0
Includes pdf-extract v0.2.0 binaries for all platforms.
OpenFOIA v3.0.1
Patch: lean core deps, portable install, helpful error messages for missing extras.
OpenFOIA v3.0.0 — Cross-Reference Engine
The free, local Maltego alternative.
New
- `openfoia crossref`: check every entity against MuckRock, OpenCorporates, SEC EDGAR, OpenSanctions, and ICIJ Offshore Leaks in one command
- MuckRock adapter: search 46k+ completed FOIA requests, download response documents, auto-ingest into pipeline
- FollowTheMoney export: `openfoia analyze export` produces .ftm.json compatible with Aleph, OpenAleph, OpenSanctions
- FollowTheMoney import: `openfoia analyze import` brings data from Aleph or colleagues into your local database
- Named investigation graphs: `openfoia analyze graph --name defense-contracts --view`
- Portable mode: `openfoia portable` keeps all data on the USB, nothing on the host machine
- Smart CSV import: AI-assisted column mapping and regex generation from plain English
- Entity type CLI: add, remove, list, import, export, test custom entity types
- Quickstart guide: `openfoia guide` walks new users through the entire workflow
- Journalist guide: comprehensive docs/GUIDE.md
Install
curl -fsSL https://raw.githubusercontent.com/JordanCoin/openfoia/main/install.sh | bash
openfoia guide
Uninstall
curl -fsSL https://raw.githubusercontent.com/JordanCoin/openfoia/main/uninstall.sh | bash
Your data never leaves your machine.
OpenFOIA v2.0.0 — The Safety Layer
Everything from v1.0.0 plus a complete journalist safety toolkit.
New in v2.0.0
Entity Extraction (rewritten)
- 4-tier extraction: LLM → GLiNER → spaCy → Regex
- GLiNER zero-shot NER: `pip install openfoia[ner]` (90% entity recall, runs on CPU, no API key)
- Local LLM (ollama llama3.2:3b): 100% recall on FOIA documents in 26 seconds
- Relationship extraction at every tier (co-occurrence, syntactic parsing, LLM)
- E2E test suite with realistic FOIA documents and ground truth
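The 4-tier order listed above (LLM → GLiNER → spaCy → Regex) can be read as a fallback chain. The sketch below assumes the tiers are tried in order and that the first one returning any entities wins; the release notes do not say whether that is how OpenFOIA combines them. The tier functions are stand-ins, and the regex is a toy pattern for company names.

```python
import re
from typing import Callable, List, Optional, Sequence, Tuple

Extractor = Callable[[str], Optional[List[str]]]


def llm_tier(text: str) -> Optional[List[str]]:
    raise RuntimeError("no local model available")  # stand-in for a missing backend


def gliner_tier(text: str) -> Optional[List[str]]:
    return None  # stand-in: optional extra not installed


def spacy_tier(text: str) -> Optional[List[str]]:
    return []  # stand-in: ran, but found nothing


def regex_tier(text: str) -> Optional[List[str]]:
    # Toy pattern: capitalized words followed by a company suffix.
    return re.findall(r"\b(?:[A-Z][a-z]+ )+(?:Corp|Inc|LLC)\b", text)


def extract_entities(
    text: str, tiers: Sequence[Tuple[str, Extractor]]
) -> Tuple[str, List[str]]:
    """Try each tier in order; fall through on errors or empty results."""
    for name, tier in tiers:
        try:
            result = tier(text)
        except Exception:
            continue  # a failing tier should not abort extraction
        if result:
            return name, result
    return "none", []


TIERS = [
    ("llm", llm_tier),
    ("gliner", gliner_tier),
    ("spacy", spacy_tier),
    ("regex", regex_tier),
]
tier_used, found = extract_entities("Payment went to Acme Corp in 2021.", TIERS)
```

Returning the tier name alongside the entities makes it visible in logs or tests which backend actually produced a result.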
Security
- Encrypted storage at rest (SQLCipher AES-256): `openfoia init --password`
- Forensic purge: `openfoia purge --secure` (3-pass overwrite, history scrub, free space fill)
- Automatic metadata stripping on ingest (EXIF, PDF author, DOCX revision history)
- Duress mode — second password opens a decoy database with harmless data
OSINT
- Web archive ingestion via Tor: `openfoia ingest --url <url> --tor`
- Public records adapters (OpenCorporates + SEC EDGAR): `openfoia records search "Acme Corp"`
- Agentic Tor browsing via Playwright: `openfoia browse <url> --tor`
Deployment
- Air-gapped deployment docs (Tails OS, encrypted USB, Qubes)
- `OPENFOIA_DATA_DIR` env var for portable installs
Install
curl -fsSL https://raw.githubusercontent.com/JordanCoin/openfoia/main/install.sh | bash
Uninstall
curl -fsSL https://raw.githubusercontent.com/JordanCoin/openfoia/main/uninstall.sh | bash
Your data never leaves your machine.
OpenFOIA v1.0.0
Local-first FOIA automation for journalists, researchers, and citizens.
Install
curl -fsSL https://raw.githubusercontent.com/JordanCoin/openfoia/main/install.sh | bash
What's in v1.0.0
- 53 federal agencies pre-loaded with FOIA contacts
- File requests via email, fax (Twilio), or physical mail (Lob)
- Request templates with proven legal language
- PDF text extraction (~3ms/page, lossless) with OCR fallback
- Entity extraction via local LLM (Ollama), Anthropic, OpenAI, or regex
- Deadline tracking with auto-calculated due dates (20 business days)
- Campaign coordination for crowdsourced FOIA
- Entity relationship graph with interactive visualization
- Web UI with agency search, form submission, document upload
- SQLite database with Alembic migrations
- Works offline. Works on Linux, macOS, Windows.
- One command to destroy all data: `openfoia purge --yes`
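The 20-business-day deadline mentioned in the list above comes down to date arithmetic that skips weekends. The sketch below is a simplified illustration, not OpenFOIA's calculation: `add_business_days` is a hypothetical helper, it ignores federal holidays (which the FOIA statute also excludes), and it assumes counting starts the day after the given date.

```python
from datetime import date, timedelta


def add_business_days(start: date, days: int) -> date:
    """Return the date `days` weekdays after `start` (holidays are NOT handled)."""
    current = start
    remaining = days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # Monday=0 ... Friday=4
            remaining -= 1
    return current


# A request received on Monday 2024-03-04 is due four full weeks later.
due = add_business_days(date(2024, 3, 4), 20)
```

A real tracker would subtract a holiday calendar as well, since a single federal holiday in the window pushes the due date back by one day.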
Uninstall
curl -fsSL https://raw.githubusercontent.com/JordanCoin/openfoia/main/uninstall.sh | bash
Your data never leaves your machine.