Skip to content

Database export and federated query support #70

@vmvarela

Description

@vmvarela

Description

Add the ability to export query results directly to external databases and to query across multiple input sources (federated queries). This transforms sql-pipe from a data processor into a lightweight data pipeline tool.

Export: write results to PostgreSQL, MySQL, or SQLite files instead of (or in addition to) stdout.
Federation: load multiple CSV files as named tables and JOIN across them.

Example usage

# Export CSV query results to PostgreSQL
cat orders.csv | sql-pipe \
  --export-db "postgres://user:pass@localhost/warehouse" \
  --export-table orders_staging \
  "SELECT * FROM stdin WHERE status = 'completed'"

# Export to a SQLite file
cat data.csv | sql-pipe \
  --export-db "sqlite:///tmp/analytics.db" \
  --export-table metrics \
  "SELECT date, COUNT(*) as events FROM stdin GROUP BY date"

# Federated query across multiple CSV files
sql-pipe \
  --load users=users.csv \
  --load orders=orders.csv \
  "SELECT u.name, COUNT(o.id) as order_count
   FROM users u JOIN orders o ON u.id = o.user_id
   GROUP BY u.name ORDER BY order_count DESC"

# Combine federation with export
sql-pipe \
  --load products=products.csv \
  --load sales=sales.csv \
  --export-db "postgres://localhost/reporting" \
  --export-table product_summary \
  "SELECT p.name, SUM(s.quantity) as total_sold
   FROM products p JOIN sales s ON p.id = s.product_id
   GROUP BY p.name"

Acceptance Criteria

Federation

  • --load <name>=<file> flag loads a CSV file as a named table
  • Multiple --load flags can be used to load multiple tables
  • JOINs across loaded tables work correctly
  • stdin is still available as stdin table alongside loaded files
  • Clear error messages for missing files or name collisions

Export

  • --export-db <url> flag specifies the target database
  • --export-table <name> specifies the destination table name
  • Supported URL schemes: sqlite://, postgres://, mysql://
  • Table is created automatically if it doesn't exist (CREATE TABLE from query schema)
  • Option to append (--export-mode append) or replace (--export-mode replace)
  • Atomic writes — partial data is not committed on failure
  • Credentials can be provided via URL or environment variables

General

  • stdout output still works alongside --export-db (dual output)
  • Documentation updated with federation and export examples

Recommended split plan

When scheduled for a sprint, split into independent sub-issues:

  1. Federation: --load flag for multiple CSV tables (size:m) — load multiple CSV files as named SQLite tables. This is the highest-value, lowest-effort part — SQLite already supports multiple tables in the same in-memory database. Delivers --load independently of export.
  2. Export to SQLite files (size:s) — use SQLite's ATTACH + INSERT INTO to write results to a .db file. No external dependencies needed.
  3. Export to PostgreSQL (size:m) — requires libpq or a pure-Zig PostgreSQL client. Consider feasibility before committing.
  4. Export to MySQL (size:m) — requires libmysqlclient or pure-Zig MySQL client. Similar to Postgres.

Sub-issues 1 and 2 are independent and high-value. Sub-issues 3 and 4 can be deferred or even spun off as separate backlog items.

Dependencies

Notes

  • Federation is the easier part — SQLite already supports multiple tables; sql-pipe can load multiple CSVs into the same in-memory SQLite instance as different tables
  • Export to SQLite files is also straightforward — use ATTACH + INSERT INTO
  • Export to Postgres/MySQL is more complex — requires client libraries (libpq, libmysqlclient) or a pure-Zig implementation
  • Consider implementing in phases: (1) federation, (2) SQLite export, (3) Postgres/MySQL export
  • This is size:l for the federation + SQLite export subset; Postgres/MySQL would push it to xl

Metadata

Metadata

Assignees

No one assigned

    Labels

    priority:mediumShould be done soonsize:lLarge — 1 to 2 daysstaleInactive issue — review in next refinementstatus:readyRefined and ready for sprint selectiontype:featureNew functionality

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions