diff --git a/README.md b/README.md
index 8211794..6e06339 100644
--- a/README.md
+++ b/README.md
@@ -17,7 +17,7 @@ Forks boot in seconds against a CoW snapshot of `/var/lib/postgresql`. Postgres
 ## What it does
 
 - **Forks with the substrate.** Every byte under `/var/lib/postgresql` snapshots atomically; the new box boots on a CoW copy. No `pg_dump`, no replication topology, no wait.
-- **PgBouncer on the front.** Transaction pooling on `:5432`; Postgres itself listens on the loopback only.
+- **PgBouncer on the front, sized to load.** Transaction pooling and TLS termination on `:5432`; Postgres listens on the loopback only. The pooler runs one worker when idle and adds `so_reuseport` workers across cores as connection-handshake load rises, then reaps them when it falls. The worker count survives restarts.
 - **Logical decoding from day one.** `wal_level = logical`, `max_wal_senders = 10`, `max_replication_slots = 10`. No primary restart to enable CDC later.
 - **Standard extensions, pinned.** pgvector, pgvectorscale, PostGIS, pg_cron, pg_partman, pg_jsonschema, hypopg, pg_repack, pg_search, pg_stat_statements, pg_trgm, auto_explain.
 - **Beyond extensions on the same volume.** `beyond-auth` and `beyond-queue` ship in the image and live under their own schemas in your database. Forking your DB forks their state automatically.
diff --git a/bench/glidefs-pg/README.md b/bench/glidefs-pg/README.md
new file mode 100644
index 0000000..b053c39
--- /dev/null
+++ b/bench/glidefs-pg/README.md
@@ -0,0 +1,79 @@
+# GlideFS × Postgres substrate-tuning harness
+
+Measure-first rig for `plans/wal-recycle-off-*.md`. It puts a real Postgres data
+dir on a real GlideFS block device backed by an in-memory / local-file / MinIO
+object store (never real S3), runs a fixed pgbench workload, and scrapes
+GlideFS's own per-export metrics before/after so each tuning knob is scored on
+**S3 write cost and coalescing**, not intuition.
+
+## Why this exists
+
+The S3 cost of running Postgres on GlideFS is _distinct 128 KiB blocks flushed
+per cycle_. Overwrite-before-flush coalesces for free. Every knob in the plan
+(checkpoints, `*_flush_after`, bgwriter, `wal_compression`, `wal_recycle`,
+autovacuum-on-CoW, `compaction_cooldown`) changes how many distinct blocks reach
+the object store. This rig measures that directly via
+`glidefs_s3_batches_written_total`, `glidefs_coalesce_ratio`,
+`glidefs_write_amplification`.
+
+## Prereqs (already present on the homelab box)
+
+- `glidefs` binary, `nbd` + `ublk_drv` kernel modules loaded
+- passwordless `sudo` (glidefs needs CAP_SYS_ADMIN for `/dev/nbdN`; mkfs/mount)
+- Postgres 18 client tools (`initdb`, `pg_ctl`, `pgbench`, `psql`)
+
+## Run
+
+```sh
+# one run (baseline)
+mise run bench:substrate
+
+# a single candidate knob
+bench/glidefs-pg/run.sh --conf bench/glidefs-pg/conf/c1-checkpoint60.conf --label c1-60
+
+# full A/B sweep (baseline 3x for the noise floor, then each overlay)
+bench/glidefs-pg/sweep.sh
+```
+
+Knobs: `--backend file|memory|minio` (default `file` — same byte accounting as
+`memory`, no RAM blowup on long runs), `--scale`, `--duration`, `--clients`,
+`--cooldown N` (GlideFS compaction_cooldown, the G experiment), `--transport
+nbd|ublk`, `--keep`.
+
+## Output
+
+`out/-