
# pfc-export-clickhouse


One-shot CLI to export any ClickHouse table to a `.pfc` cold-storage archive, with a block-level timestamp index that enables time-range queries without full decompression.

No daemon. No scheduling. Run once, get your archive.


## When to use this vs pfc-archiver-clickhouse

| | pfc-export-clickhouse | pfc-archiver-clickhouse |
| --- | --- | --- |
| Mode | One-shot CLI | Daemon |
| Use case | Manual export, migration, one-time backup | Continuous automated archiving |
| Scheduling | You handle it (cron, CI, manual) | Built-in loop |
| Config | CLI flags | TOML file |

## Install

```bash
pip install pfc-export-clickhouse

# Or from source
git clone https://github.com/ImpossibleForge/pfc-export-clickhouse
cd pfc-export-clickhouse
pip install -r requirements.txt
```

Also required: the `pfc_jsonl` binary on your PATH. → Download: github.com/ImpossibleForge/pfc-jsonl/releases


## Usage

```bash
# Export full table
pfc-export-clickhouse --host ch.example.com --table application_logs --output logs.pfc

# Export a specific time range
pfc-export-clickhouse --host ch.example.com --database mydb --table events \
  --ts-column timestamp \
  --from-ts "2024-01-01T00:00:00" \
  --to-ts   "2024-02-01T00:00:00" \
  --output  events_jan2024.pfc

# ClickHouse Cloud (HTTPS)
pfc-export-clickhouse \
  --host my-cluster.clickhouse.cloud --secure \
  --user myuser --password mypass \
  --table logs --output logs.pfc

# Verbose output
pfc-export-clickhouse --host localhost --table logs --output logs.pfc --verbose
```
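Because `--from-ts` is inclusive and `--to-ts` is exclusive, consecutive exports chain together without gaps or double-counted rows. A minimal standard-library Python sketch (not part of this tool) that computes one calendar month's bounds in the ISO 8601 format the CLI expects:

```python
from datetime import datetime

def month_bounds(year: int, month: int) -> tuple[str, str]:
    """Return (from_ts, to_ts) covering one calendar month.

    from_ts is inclusive and to_ts exclusive, matching the CLI's
    --from-ts/--to-ts semantics, so monthly archives neither
    overlap nor leave a gap.
    """
    start = datetime(year, month, 1)
    end = datetime(year + 1, 1, 1) if month == 12 else datetime(year, month + 1, 1)
    fmt = "%Y-%m-%dT%H:%M:%S"
    return start.strftime(fmt), end.strftime(fmt)

# January 2024, as in the usage example above:
print(month_bounds(2024, 1))
# → ('2024-01-01T00:00:00', '2024-02-01T00:00:00')
```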

## Options

| Flag | Default | Description |
| --- | --- | --- |
| `--host` | (required) | ClickHouse hostname or IP |
| `--port` | `8123` | HTTP port (use `8443` with `--secure`) |
| `--user` | `default` | Username |
| `--password` | (empty) | Password |
| `--database` | `default` | Database name |
| `--secure` | off | Use HTTPS/TLS (for ClickHouse Cloud) |
| `--table` | (required) | Table to export |
| `--ts-column` | (none) | Timestamp column for filtering and ORDER BY |
| `--from-ts` | (none) | Start of time range (ISO 8601, inclusive) |
| `--to-ts` | (none) | End of time range (ISO 8601, exclusive) |
| `--output` | auto | Output `.pfc` file (auto-named if omitted) |
| `--batch-size` | `100000` | Rows per fetch block |
| `--verbose` | off | Print progress |
| `--pfc-binary` | auto-detect | Path to `pfc_jsonl` binary |

Environment variable: `PFC_JSONL_BINARY` — an alternative to `--pfc-binary`.


## Output

Each export produces two files:

```
logs.pfc        ← compressed JSONL (~8-10% of original size)
logs.pfc.bidx   ← block index for time-range queries
```
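At the quoted ~8-10% ratio, archive size is easy to estimate up front, which helps when sizing cold-storage buckets. A back-of-the-envelope sketch (the ratio is the figure above, not a guarantee; actual compression depends on the data):

```python
def estimated_archive_gb(raw_gb: float,
                         ratio_low: float = 0.08,
                         ratio_high: float = 0.10) -> tuple[float, float]:
    """Rough .pfc size range for raw_gb of uncompressed JSONL,
    using the README's ~8-10% compression figure."""
    return round(raw_gb * ratio_low, 1), round(raw_gb * ratio_high, 1)

# A 500 GB table should land somewhere around:
print(estimated_archive_gb(500))
# → (40.0, 50.0)
```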

Query archives directly with DuckDB — only the relevant blocks are decompressed:

```sql
INSTALL pfc FROM community;
SELECT * FROM pfc_scan('logs.pfc')
WHERE timestamp BETWEEN '2024-01-15' AND '2024-01-16';
```

## Part of the PFC Ecosystem

→ View all PFC tools & integrations

| Direct integration | Why |
| --- | --- |
| pfc-archiver-clickhouse | Daemon version — continuous automated archiving instead of one-shot |
| pfc-duckdb | Query the archives this tool creates — time-range queries without full decompression |
| pfc-gateway | HTTP REST query layer over `.pfc` archives — no DuckDB required |

## Disclaimer

pfc-export-clickhouse is an independent open-source project and is not affiliated with, endorsed by, or associated with ClickHouse, Inc. or the ClickHouse project.


## License

pfc-export-clickhouse (this repository) is released under the MIT License — see LICENSE.

The PFC-JSONL binary (pfc_jsonl) is proprietary software — free for personal and open-source use. Commercial use requires a license: info@impossibleforge.com
