feat(exporter): unified Arrow exporter with GUC + end-to-end TAP test#116
feat(exporter): unified Arrow exporter with GUC + end-to-end TAP test#116JoshDreamland wants to merge 6 commits into
Conversation
edfb438 to
0e48f7f
Compare
…to-end Introduce a new exporter that builds Arrow IPC RecordBatches through the typed StatsExporter column-factory interface (StatLC/StatHC/StatTimestamp) instead of the open-coded ArrowBatchBuilder used by arrow_batch.cc. Composition over inheritance: the new exporter holds an OTelExporter for gRPC transport (SendArrowBatch) but doesn't extend it, so the per-row LogRecord state machine in OTelExporter — which is unused on this path post-PR-#72 — stays out of scope. Wire shape targets events_raw (the unified schema authored in PR #99), not the legacy query_logs_arrow: * query_id, parent_query_id: Int64 (no sprintf decimal-string encoding) * pid: Int32 * err_elevel: UInt8 * buffer counters (shared/local/temp_blks_*, *_blk_*_time_us, wal_*, cpu_*_time_us): Int64 * parallel_workers_planned/launched: Int16 * jit_*: Int32 * LC strings (db_*, err_sqlstate, app, server_role, region, cell, service_version, read_replica_type) -> DictionaryUtf8 * HC strings (query_text, err_message, client_addr, instance_ubid, server_ubid, host_id, pod_name) -> plain utf8 * ts: arrow::timestamp(MICRO, "UTC") matching DateTime64(6, 'UTC') Column<T> wrappers are nested private types inside OTelArrowExporter (not at namespace scope) so they can inherit from the protected Column<T> base — same convention OTelExporter and ClickHouseExporter use for their own column types. Columns the caller doesn't explicitly populate are synthesized in BeginRow by the exporter itself, so stats_exporter.cc's column-emission loop stays unchanged: * parent_query_id (hardcoded 0 until PR #95 lands and PschEvent carries the field — events_raw requires the column on every insert, no DEFAULT) * 8 envelope columns from pg_stat_ch.extra_attributes (instance_ubid, server_ubid, server_role, region, cell, host_id, pod_name) plus read_replica_type (default 'none' if extra_attributes didn't supply) * service_version pinned to the compile-time PG_STAT_CH_VERSION macro This commit only adds the exporter file (no dispatcher wiring yet) — the next commit adds the GUC and routes batches through it when on. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
New bool GUC pg_stat_ch.use_unified_arrow_exporter (default off, PGC_SIGHUP)
opts a producer instance into the OTelArrowExporter added in the previous
commit. When on (together with use_otel and otel_arrow_passthrough), the
bgworker constructs an OTelArrowExporter at init time and PschExportBatch
goes through ExportEventStats (the typed column interface) instead of
calling ExportEventsAsArrow (the legacy bypass that uses arrow_batch.cc
directly).
When off — the default — behavior is preserved bit-for-bit:
* arrow_batch.cc / ExportEventsAsArrow remain reachable on the
arrow_passthrough path
* The OTelExporter column-emission path (off-arrow OTel logs) remains
reachable when arrow_passthrough is off
* The CH-native ClickHouseExporter remains reachable when use_otel is off
The sprintf-decimal-string ID encoding in arrow_batch.cc lives on for
the legacy path; the new exporter neither calls into it nor perpetuates
it. After the GUC has been on in prod long enough to retire the legacy
query_logs_arrow table, the arrow_batch.cc + ExportEventsAsArrow* + the
otel_arrow_passthrough GUC itself can be deleted in a single follow-on.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…_raw
Closes the test gap that's existed since the Arrow path went live: no
existing test proves that pg_stat_ch's Arrow IPC output can actually be
ingested by ClickHouse against the unified events_raw schema. t/026
asserts on the IPC schema shape via pyarrow but never pushes the bytes
into CH; t/010 etc. exercise the CH-native Block path, not Arrow.
The new test wires the full producer-to-CH chain locally, bypassing
the OTel collector + receiver service entirely:
1. Spin up a node with use_unified_arrow_exporter=on +
debug_arrow_dump_dir set, an OTel endpoint that doesn't resolve so
gRPC send fails — MaybeDumpArrowBatch fires BEFORE send so IPC
files land on disk regardless.
2. Run a deliberately-shaped workload (SELECT, CREATE, INSERT,
SELECT count, DROP — five distinct statements).
3. Force pg_stat_ch_flush(), wait for IPC files in $dump_dir.
4. TRUNCATE pg_stat_ch.events_raw, then for each IPC file:
curl -X POST --data-binary @$f \
'http://localhost:18123/?query=INSERT INTO pg_stat_ch.events_raw FORMAT ArrowStream'
A type mismatch on the wire (e.g. if the producer regressed to
writing query_id as String) would surface here as a 4xx with a
clear error rather than silently corrupting data.
5. SELECT count() FROM events_raw, assert >= 5 rows.
6. Pull system.columns and assert each id/counter column has the
declared type from PR #99's schema (no silent string-typed regressions).
7. Pinpoint the marker SELECT row and assert db_name/db_operation/
query_text values match what we sent.
8. Assert envelope columns (instance_ubid, server_role, region, cell,
read_replica_type) carry the values from pg_stat_ch.extra_attributes.
9. Assert parent_query_id is 0 across all rows (synthesized by the
exporter until PR #95 lands).
Skips cleanly when Docker / the test CH container / the events_raw
schema aren't available — same patterns as t/010, t/013, t/021.
The "no OTel collector required" property makes this test purely a
producer⇄CH wire-format check. The clickgres-platform Go receiver is
not exercised here, since for verifying that the bytes match the
schema, a curl invocation is the simplest possible expression of "POST
this Arrow IPC body to CH" — the receiver's only added value over
curl in prod is OTel-collector pipeline integration, which we don't
care about for wire-format correctness.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
0e48f7f to
021e75f
Compare
There was a problem hiding this comment.
Pull request overview
Adds a new unified Arrow IPC exporter (behind a new GUC) that emits the events_raw-shaped ArrowStream payload via the existing OTel gRPC transport, plus an end-to-end TAP test that validates ClickHouse can ingest the dumped IPC bytes.
Changes:
- Introduces
OTelArrowExporterimplementingStatsExporterend-to-end with typed Arrow builders targetingevents_raw. - Adds
pg_stat_ch.use_unified_arrow_exporter(PGC_SIGHUP, default off) and dispatch logic to select the unified path vs the legacyarrow_batch.ccbypass. - Adds
t/036_unified_arrow_e2e.plto dump Arrow IPC and insert it into ClickHouse usingFORMAT ArrowStream, asserting basic correctness.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| test/regression/expected/guc.out | Updates expected GUC listing to include pg_stat_ch.use_unified_arrow_exporter. |
| t/036_unified_arrow_e2e.pl | New TAP test for Arrow IPC → ClickHouse events_raw ingestion. |
| src/export/stats_exporter.cc | Wires unified exporter construction and routes unified batches through ExportEventStats. |
| src/export/otel_arrow_exporter.h | Declares MakeUnifiedArrowExporter() factory. |
| src/export/otel_arrow_exporter.cc | Implements unified Arrow IPC builder/exporter and envelope column synthesis. |
| src/config/guc.c | Adds new boolean GUC definition and backing variable. |
| include/config/guc.h | Exposes the new GUC variable to the exporter code. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Three real findings worth addressing, plus a TODO for one that's worth
flagging but bigger than this PR:
src/export/otel_arrow_exporter.cc:
* NumExported() now returns row_count_ instead of delegating to the
inner OTelExporter. The inner exporter's exported_count_ accumulates
across batches because we never call inner_->BeginBatch() (its
per-record LogRecord state machine is unused on this path), and
PschExportBatch fetch_add's NumExported() into shared stats once per
successful commit — so the cumulative inner count would cause
quadratic over-reporting. (Cursor Bugbot, medium.)
* ExtraAttrs class comment said "last write wins on duplicate keys"
but Get() does a linear scan from the front and returns the first
match. Comment now reflects what the code actually does. Duplicate
keys in pg_stat_ch.extra_attributes are a user error in practice;
no behavior change. (Copilot.)
* Added a TODO(memory-budget) at CommitBatch documenting the missing
mid-batch flush against psch_otel_max_block_bytes. The legacy
ExportEventsAsArrowInternal flushes when the builder exceeds 3 MiB;
this exporter ships the whole batch in one IPC. Acceptable while
the GUC defaults off, but the budget check has to land before the
default flips. Plumbing it through requires either invalidating
caller-held column shared_ptrs at the flush boundary or threading
a per-row size hook through the StatsExporter interface. (Cursor
Bugbot, high — disagreeing on severity for the shadow-rollout
window but acknowledging the underlying gap.)
* Comment note on MaybeDumpArrowBatch acknowledging the intentional
duplication with stats_exporter.cc — both copies die when the
legacy path retires, so extracting a shared helper now would be a
header just to delete it later. (Copilot, push-back.)
t/036_unified_arrow_e2e.pl:
* curl now uses --fail-with-body so HTTP 4xx/5xx from CH surfaces as
a non-zero exit at the assertion site. The downstream SELECT count()
assertion would catch a real ingestion failure too, but the sharper
error at the source is a strict improvement. (Copilot.)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The bgworker constructs the exporter implementation once at init time
based on this GUC, but PschExportBatch was re-reading it every cycle to
choose between the legacy ExportEventsAsArrow bypass and the new
unified ExportEventStats path. Under PGC_SIGHUP, the two reads can
disagree:
- Init=on, runtime=off: dispatcher takes the legacy bypass and calls
SendArrowBatch on an OTelArrowExporter, which doesn't override it,
so every batch silently drops.
- Init=off, runtime=on: dispatcher runs the unified path through a
plain OTelExporter, which ships OTLP log records instead of Arrow
IPC. events_raw never sees the data.
Flag is a producer-shape feature toggle, not an operational tunable;
flipping it should be a deliberate restart.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes using default effort and found 1 potential issue.
There are 2 total unresolved issues (including 1 from previous review).
❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, have a team admin enable autofix in the Cursor dashboard.
Reviewed by Cursor Bugbot for commit ae0334c. Configure here.
| if (psch_use_otel && psch_otel_arrow_passthrough && psch_use_unified_arrow_exporter) { | ||
| // New unified path: typed Arrow column wrappers driving the | ||
| // StatsExporter interface end-to-end, writing the events_raw schema. | ||
| g_exporter.exporter = MakeUnifiedArrowExporter(); |
There was a problem hiding this comment.
Passthrough reload mismatches exporter
Medium Severity
PschExporterInit only builds MakeUnifiedArrowExporter when otel_arrow_passthrough is already on at postmaster start, but that GUC is PGC_SIGHUP while use_unified_arrow_exporter is PGC_POSTMASTER. After reload turns passthrough on with unified already enabled, PschExportBatch skips the legacy Arrow bypass yet still runs ExportEventStats on a plain OTelExporter, so batches go out as per-record OTLP instead of Arrow IPC.
Additional Locations (2)
Reviewed by Cursor Bugbot for commit ae0334c. Configure here.
| explicit ExtraAttrs(const char* raw) { | ||
| if (raw == nullptr) { | ||
| return; | ||
| } | ||
| std::string_view input(raw); | ||
| while (!input.empty()) { | ||
| const size_t delim = input.find(';'); | ||
| const std::string_view token = | ||
| (delim == std::string_view::npos) ? input : input.substr(0, delim); | ||
| const size_t sep = token.find(':'); | ||
| if (sep != std::string_view::npos) { | ||
| attrs_.emplace_back(std::string(token.substr(0, sep)), std::string(token.substr(sep + 1))); | ||
| } | ||
| if (delim == std::string_view::npos) { | ||
| break; | ||
| } | ||
| input.remove_prefix(delim + 1); | ||
| } | ||
| } |
| "PGC_POSTMASTER: the bgworker picks the exporter implementation at init time, " | ||
| "so a runtime change would mismatch the per-cycle dispatcher (Arrow batches " | ||
| "would silently drop, since OTelArrowExporter does not implement SendArrowBatch).", |
| # 6. Assert: row count matches what we sent, column types are what | ||
| # events_raw declares (no silent string-to-Int coercion), known queries | ||
| # land with the right db_name/db_operation/query_text values. |
| # Verify the events_raw schema is present (applied by the goose migration | ||
| # step in CI). If we hit an empty CH or a different schema, fail loudly | ||
| # rather than silently inserting into nothing. |
…_block_bytes OTelArrowExporter accumulated rows into a single Arrow IPC up to psch_batch_max (default 200000) with no byte ceiling, so a queue backlog could produce a payload exceeding gRPC's 4 MiB wire cap or otelcol's HTTP body cap. Closes the gap relative to ExportEventsAsArrowInternal which flushes mid-batch when the builder's estimated bytes cross psch_otel_max_block_bytes. Column wrappers each take a pointer to bytes_estimate_ and bump it per Append (sizeof for fixed-width, value bytes + 4 for var-length offsets). BeginRow samples bytes_estimate_ at the row boundary; if it has crossed max_block_bytes_ and at least one row is already accumulated, Flush() runs, ships the chunk, and resets per-flush state. Arrow ArrayBuilder::Finish leaves builders empty and reusable, so subsequent chunks share the same slot vector + column shared_ptrs without any re-registration. NumExported now reports exported_in_batch_ (cumulative across all chunks in the active batch) instead of the per-chunk row_count_, otherwise mid-batch flushes would erase the dispatcher's view of work done. A sticky batch_failed_ flag poisons CommitBatch after any mid-batch Flush failure so partial batches never count as a success. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>


Summary
Land the unified Arrow exporter behind a GUC, with an end-to-end TAP test that proves Arrow IPC bytes from pg_stat_ch can be ingested by ClickHouse against the
events_rawschema. Draft until CI confirms — local build is Linux-only.Three reviewable commits:
feat(exporter): add OTelArrowExporter implementing StatsExporter end-to-end— newsrc/export/otel_arrow_exporter.{cc,h}. Implements theStatLC*/StatHC*/StatTimestamp/Db*Columninterface with typed Arrow column wrappers. Composition: holds an innerOTelExporterfor gRPC transport (SendArrowBatch), doesn't extend it. Wire shape targetsevents_raw:query_id→Int64(no sprintf decimal-string)pid→Int32err_elevel→UInt8Int64parallel_workers_*→Int16jit_*→Int32DictionaryUtf8; HC strings → plainutf8ts→arrow::timestamp(MICRO, "UTC")matchingDateTime64(6, 'UTC')Synthesizes envelope columns (the 8
pg_stat_ch.extra_attributes-derived columns,read_replica_typedefaulting to'none',service_versionpinned toPG_STAT_CH_VERSION) sostats_exporter.cc's column-emission loop stays unchanged. Does not emitparent_query_id— that column isn't in the events_raw schema yet; PR feat: emit parent_query_id to link nested SPI queries #95 (still open) will add both the schema migration and the producer-side wiring together.feat(exporter): add use_unified_arrow_exporter GUC + dispatcher wiring— new bool GUCpg_stat_ch.use_unified_arrow_exporter(default off,PGC_SIGHUP). When on withuse_otel+otel_arrow_passthrough,PschExporterInitconstructs the new exporter andPschExportBatchroutes throughExportEventStats(the typed interface) instead of the legacyExportEventsAsArrowbypass. When off, behavior is preserved bit-for-bit — legacyarrow_batch.ccpath, CH-native path, and OTel column-emission path all still reachable on their existing GUC combinations.test: end-to-end round-trip of the unified Arrow exporter into events_raw—t/036_unified_arrow_e2e.pl. Spins up a node withuse_unified_arrow_exporter=on, runs a deliberately-shaped workload (SELECT/CREATE/INSERT/SELECT count/DROP), waits for IPC files indebug_arrow_dump_dir, curls each file into CH asINSERT INTO pg_stat_ch.events_raw FORMAT ArrowStream, asserts:db_name/db_operation/query_textextra_attributesThe "no OTel collector required" property of the test — bypassing the clickgres-platform Go receiver entirely via the
debug_arrow_dump_dir+ curl chain — makes this a pure producer⇄CH wire-format check. That receiver service's only added value in prod is OTel-collector-pipeline integration, which doesn't affect wire-format correctness.Release intent
This is the first GUC-toggleable producer-side change of the unification effort. Worth tagging as a release: the GUC default is
off, so existing producers are unaffected, but it gives prod operators a knob to opt instances into the new exporter for the cutover fromquery_logs_arrow(the legacy receiver table) toevents_raw(PR #99's unified shape).Out of scope (future commits)
arrow_batch.cc,ExportEventsAsArrow, thepsch_otel_arrow_passthroughGUC, and theOTelExportercolumn-emission machinery (IntColumn/SvColumn/StrColumn/DateTimeCol/DurationCol+BeginBatch/BeginRow/FlushChunk/the chunk-state machinery) — all becomes dead code once the GUC default flips and the legacyquery_logs_arrowtable is retired. ~200 LOC saving.parent_query_idthrough to events_raw, both the schema-migration and producer side. PR feat: emit parent_query_id to link nested SPI queries #95 owns this end-to-end.events_raw.Test plan
t/036_unified_arrow_e2e.pl(the new test)t/010(CH-native),t/013(TLS),t/024(OTel column-emission),t/026(Arrow dump shape) still pass — the GUC defaults off so none of those paths change behavior🤖 Generated with Claude Code
Note
Medium Risk
Changes the telemetry export pipeline and wire format when the new GUC is enabled; default off limits blast radius, but misconfiguration or partial enablement could drop or mis-route Arrow batches, and export accounting logic is subtle.
Overview
Adds an opt-in unified Arrow export path for the
events_rawClickHouse schema, behindpg_stat_ch.use_unified_arrow_exporter(default off,PGC_POSTMASTER).OTelArrowExporterimplementsStatsExporterend-to-end: typed Arrow builders (dictionary LC strings, plain HC types, microsecond UTC timestamps), ZSTD Arrow IPC, optionaldebug_arrow_dump_dirdumps, synthesized envelope columns fromextra_attributes, and gRPC transport via a composed innerOTelExporter. Mid-batch flush honorsotel_max_block_bytes;NumExportedcounts only the current batch to avoid double-counting shared stats.Dispatcher wiring: when
use_otel+otel_arrow_passthrough+ the new GUC are on, init selectsMakeUnifiedArrowExporter()and batches go throughExportEventStatsinstead of the legacyExportEventsAsArrow/arrow_batch.ccbypass. With the GUC off, behavior is unchanged.t/036_unified_arrow_e2e.pldumps IPC from the new exporter andINSERT … FORMAT ArrowStreamintopg_stat_ch.events_raw, asserting row counts, key column types, workload rows, and envelope fields. GUC regression output updated.Reviewed by Cursor Bugbot for commit 34d6b49. Bugbot is set up for automated code reviews on this repo. Configure here.