More data sources: government portals, spending, corporate, and investigative databases #55

@JordanCoin

Find and integrate more public data sources beyond our current 6. Focus on free, API-accessible sources that need no authentication (or at most a free API key) and that help journalists and researchers follow money, power, and connections.

Tier 1 — High value, free API, easy to integrate

Federal Government Documents

  • GovInfo API — Congressional reports, court opinions, Federal Register, bills, hearings. Free API key via api.data.gov. Covers all 3 branches of government.
  • Federal Register API — Every federal rule, proposed rule, notice, and executive order. No API key required. JSON + CSV.
  • State Department FOIA Reading Room — Searchable released documents including declassified cables and diplomatic correspondence.
  • USAspending API — Every federal contract, grant, and loan since 2001. Who got the money, how much, from which agency. No auth needed. This is how you follow the money.
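As a starting point for the USAspending integration, here is a rough sketch of a keyword search over contract awards. It assumes the v2 `spending_by_award` search endpoint and the filter/field names shown; these should be checked against the current USAspending API docs before we rely on them. The function names are made up for illustration and are not part of OpenFOIA.

```python
import json
import urllib.request

# Assumed endpoint; verify against the current USAspending API documentation.
USASPENDING_SEARCH_URL = "https://api.usaspending.gov/api/v2/search/spending_by_award/"


def build_award_search(keyword, start_date, end_date, limit=10):
    """Build the JSON body for a keyword search over contract awards."""
    return {
        "filters": {
            "keywords": [keyword],
            # Assumed codes for contract award types.
            "award_type_codes": ["A", "B", "C", "D"],
            "time_period": [{"start_date": start_date, "end_date": end_date}],
        },
        # Assumed column names; the API rejects unknown fields.
        "fields": ["Award ID", "Recipient Name", "Award Amount", "Awarding Agency"],
        "limit": limit,
        "page": 1,
    }


def search_awards(payload, timeout=30):
    """POST the payload and return the list under "results" (empty if absent)."""
    request = urllib.request.Request(
        USASPENDING_SEARCH_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response).get("results", [])
```

Calling `search_awards(build_award_search("cybersecurity", "2023-01-01", "2023-12-31"))` would hit the live API. Keeping payload construction separate from the network call makes the adapter easy to unit test offline.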

Corporate & Financial

Investigative & Cross-Border

  • OCCRP Aleph — 400M+ documents from 200+ datasets, 139 countries. Corporate registries, financial records, leaks, legal filings. Public access available, API for bulk search. Already uses FollowTheMoney format (we support FtM import/export!).
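Since Aleph speaks FollowTheMoney, records from the other sources could be mapped into the same shape before export. Below is a minimal sketch that builds an FtM-style Company entity as a plain dict (an ID, a schema name, and multi-valued properties). It does not use the followthemoney library, and the ID hashing scheme and function names are assumptions for illustration only.

```python
import hashlib


def make_entity_id(*parts):
    """Derive a stable ID from identifying strings (hypothetical scheme)."""
    joined = "|".join(p.strip().lower() for p in parts)
    return hashlib.sha1(joined.encode("utf-8")).hexdigest()


def recipient_to_ftm(name, jurisdiction=None):
    """Return an FtM-style Company entity as a plain dict."""
    # FtM property values are lists, even when there is only one value.
    properties = {"name": [name]}
    if jurisdiction:
        properties["jurisdiction"] = [jurisdiction]
    return {
        "id": make_entity_id("company", name, jurisdiction or ""),
        "schema": "Company",
        "properties": properties,
    }
```

Normalizing case and whitespace before hashing means the same recipient seen in two sources gets the same ID, which is a crude form of deduplication. Real entity matching would need more than this.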

Tier 2 — High value, may need scraping or special access

Federal Agency Reading Rooms

Most federal agencies have electronic FOIA reading rooms with previously released documents. These don't have APIs but could be scraped:

State & Local Data Portals

Major state open data portals (many use Socrata/CKAN with APIs):

Specialized

  • Regulations.gov API — Public comments and documents for every federal rulemaking. Follow industry lobbying on specific regulations.
  • NINA (Latin America) — Collaborative database of public documents across Latin American countries. Cross-searches OpenSanctions, Aleph, ICIJ, and OpenCorporates.
  • FOIA.gov API — Submit FOIA requests to any federal agency programmatically. Could wire into our request filing system.
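For Regulations.gov, a first step would be listing the documents in a docket. The sketch below only builds the request URL. It assumes the v4 API, the `filter[docketId]` and `page[size]` query parameters, and an api.data.gov key passed as `api_key`; all of that should be confirmed against the official docs. The docket ID in the usage note is a placeholder, not a real docket.

```python
from urllib.parse import urlencode

# Assumed base URL for the v4 API.
REGULATIONS_BASE = "https://api.regulations.gov/v4"


def documents_url(docket_id, api_key, page_size=25):
    """Build a URL listing the documents filed under one docket."""
    params = {
        "filter[docketId]": docket_id,
        "page[size]": page_size,
        "api_key": api_key,
    }
    # urlencode percent-encodes the brackets; the server decodes them.
    return f"{REGULATIONS_BASE}/documents?{urlencode(params)}"
```

For example, `documents_url("AGENCY-2024-0001", "DEMO_KEY")` produces a URL that can be fetched with any HTTP client.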

Tier 3 — Nice to have

Priority

Start with USAspending (follow the money), GovInfo (follow the law), and OCCRP Aleph (follow the corruption). Together with our existing 6 sources, these three would let OpenFOIA cover federal spending, federal documents, and cross-border corporate records in one free toolkit.

Architecture note

All new sources should follow the RecordAdapter pattern in openfoia/records/. Register each adapter in __init__.py, and lazy-import any optional dependencies so the core package installs without them.
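To make the shape concrete, here is a hypothetical sketch of an adapter with registration and a lazy import. The actual RecordAdapter interface in openfoia/records/ is not shown in this issue, so the base class, the `register` decorator, `get_adapter`, and `lazy_import` below are all assumptions meant to illustrate the idea, not the existing code.

```python
import importlib
from abc import ABC, abstractmethod

_REGISTRY = {}  # hypothetical: maps source name -> adapter class


def register(adapter_cls):
    """Class decorator: add an adapter to the registry under its `name`."""
    _REGISTRY[adapter_cls.name] = adapter_cls
    return adapter_cls


def get_adapter(name):
    """Instantiate a registered adapter by name."""
    return _REGISTRY[name]()


def lazy_import(module_name, extra):
    """Import an optional dependency only when an adapter actually needs it."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise RuntimeError(
            f"{module_name} is required for this source; install the '{extra}' extra"
        ) from exc


class RecordAdapter(ABC):
    """Hypothetical base class; the real one may differ."""

    name = ""

    @abstractmethod
    def search(self, query, limit=10):
        """Return a list of record dicts matching the query."""


@register
class ExampleAdapter(RecordAdapter):
    name = "example"

    def search(self, query, limit=10):
        # `json` stands in for an optional third-party dependency here.
        json = lazy_import("json", extra="example")
        record = json.loads('{"title": "placeholder"}')
        record["query"] = query
        return [record][:limit]
```

With this layout, importing the package never fails because of a missing optional dependency. The error only surfaces when someone actually calls the adapter that needs it.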
