schema: unify pg_stat_ch events_raw with prod Arrow path, move into Goose layout #99

Merged: JoshDreamland merged 10 commits into main from unified_schema on Jun 10, 2026

JoshDreamland (Contributor) commented May 20, 2026

Summary

Pre-GA unification of pg_stat_ch's ClickHouse schema. The docker quickstart schema (docker/init/00-schema.sql) and the production Arrow receiver schema (datagres_otel.query_logs_arrow in clickgres-platform) had drifted apart on both column naming and column types. This PR makes pg_stat_ch the source of truth for the unified shape and moves the canonical schema into a Goose migrations layout under schema/migrations/ matching the runner clickgres-platform already uses (pressly/goose v3, DialectClickHouse, embed.FS).

Two commits, structured so git rename detection can follow the file's evolution cleanly:

Commit 1 — 9c39ecd: in-place unification + tests pass

In-place edits to docker/init/00-schema.sql so the docker quickstart schema matches what prod actually writes to, plus the CH-native exporter and TAP tests updated to use the new column names.

Column renames (prod-side wins — closer to OTel semantic conventions, minimizes downstream churn):

  • ts_start → ts
  • db → db_name
  • username → db_user
  • cmd_type → db_operation
  • query → query_text
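
For anyone updating dashboards or ad-hoc queries written against the old docker schema, the rename list above can be captured as a small lookup table. This is only a sketch: `kColumnRenames` and `UnifiedColumnName` are hypothetical names for illustration and do not appear in the PR.

```cpp
#include <array>
#include <string_view>
#include <utility>

// Old docker-quickstart column name -> unified (prod-side) column name,
// copied from the rename list in the PR description.
constexpr std::array<std::pair<std::string_view, std::string_view>, 5>
    kColumnRenames{{
        {"ts_start", "ts"},
        {"db", "db_name"},
        {"username", "db_user"},
        {"cmd_type", "db_operation"},
        {"query", "query_text"},
    }};

// Hypothetical helper: returns the unified name for a legacy column, or the
// input unchanged when the column was not renamed.
inline std::string_view UnifiedColumnName(std::string_view legacy) {
  for (const auto& [from, to] : kColumnRenames) {
    if (from == legacy) return to;
  }
  return legacy;
}
```

Columns that kept their names (for example duration_us) pass through untouched, so the helper can be applied to every identifier in an old query.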

Type fix: err_sqlstate FixedString(5) → LowCardinality(String). FixedString doesn't round-trip through Arrow IPC cleanly, and ~270 SQLSTATE codes are dictionary-friendly. CH-native exporter switches from MetricFixedString(5, …) to TagString(…) (clickhouse-cpp ColumnString → CH LowCardinality(String) is fine on the wire).

Envelope columns added with DEFAULT '' so the CH-native exporter (which doesn't yet emit these) still inserts successfully: instance_ubid, server_ubid, server_role, region, cell, service_version, host_id, pod_name.

Engine/partitioning aligned with prod:

  • ORDER BY ts → ORDER BY (instance_ubid, ts) (tenant locality)
  • TTL toDate(ts) + INTERVAL 180 DAY
  • SETTINGS index_granularity = 8192, ttl_only_drop_parts = 1

All four MVs (events_recent_1h, query_stats_5m, db_app_user_1m, errors_recent) updated to reference the new column names and to include instance_ubid in their ORDER BY / GROUP BY / SELECT projections.

parent_query_id is intentionally not included here — it belongs to PR #95 (parent-query-id-surgical) and will land as its own follow-up migration in schema/migrations/ after this PR.

Commit 2 — 299115b: rename + goose annotations

git mv docker/init/00-schema.sql schema/migrations/20260519000001_create_initial_schema.sql (96% similarity per git's rename detection), with:

  • Header banner rewritten from "CANONICAL SCHEMA REFERENCE / single source of truth" framing to "initial migration" framing.
  • -- +goose Up / -- +goose Down section markers added.
  • Each CREATE wrapped in -- +goose StatementBegin / StatementEnd.
  • Pre-CREATE DROP TABLE IF EXISTS X idioms removed — those existed for docker init idempotency on restart; goose tracks state via goose_db_version. Drops live exclusively in the -- +goose Down section in reverse dependency order.

Also adds schema/migrations/00000000000001_bootstrap.sql, a no-op SELECT 1 migration required by goose to seed its version table (copied verbatim from clickgres-platform's bootstrap).

What's been validated locally

  • docker/init/00-schema.sql (commit 1) applies cleanly on clickhouse/clickhouse-server:26.1 (the version pinned in docker/docker-compose.test.yml); all 51 columns and 4 MVs land with the expected types.
  • Envelope columns' DEFAULT '' lets INSERTs that omit them succeed.
  • schema/migrations/ (commit 2) round-trips clean via goose v3.27.1 up and goose reset against CH 26.1.
  • C++ source edits compile-clean syntactically (no diagnostics from the column-name changes); the local build needs a Linux env (presets are all Linux-targeted), so CI's the real check.

Out of scope (follow-on PRs)

  • Wire docker/init/ and docker-compose.test.yml to run goose-up at container start. Currently docker/init/00-schema.sql is gone, so the docker quickstart and the test compose need a small shim to apply migrations from schema/migrations/ (clickhouse-server's docker entrypoint can't parse -- +goose Up/Down directly).
  • Coordinated cutover from clickgres-platform's query_logs_arrow to the unified events_raw, including historical backfill via INSERT INTO events_raw SELECT … FROM query_logs_arrow with explicit casts for the renamed/retyped columns.
  • After PR #95 (feat: emit parent_query_id to link nested SPI queries) merges, add a follow-on migration schema/migrations/<ts>_add_parent_query_id.sql for the column it introduces.

Test plan

  • TAP suite passes on PG 16/17/18 × amd64/arm64 (CI)
  • PGXN / code style / Clang checks (CI)
  • Spot-check the rename diff on commit 2 — git should report it as a 96% similarity rename

🤖 Generated with Claude Code


Note

High Risk
Breaking ClickHouse schema and INSERT column contract for any deployment still on the old docker/init SQL; existing data and dashboards need coordinated migration. Core export path and all CH integration tests depend on the new shape landing correctly.

Overview
Moves the canonical ClickHouse schema into Goose migrations under schema/migrations/ (bootstrap + initial migration) and changes CI TAP setup to install goose and apply migrations instead of piping docker/init/00-schema.sql. docker/init/ is kept as an empty bind-mount placeholder via .gitkeep.

The initial migration unifies events_raw with the production Arrow shape: renames core columns (ts_start → ts, db → db_name, username → db_user, cmd_type → db_operation, query → query_text), changes err_sqlstate to LowCardinality(String), adds OTel envelope columns with DEFAULT '', and aligns ORDER BY to (instance_ubid, ts) plus TTL on raw data. All four materialized views are updated for the new names and tenant locality.

The ClickHouse native exporter and TAP tests are updated to insert/query the new column names; err_sqlstate is exported via TagString instead of fixed-width metrics.

Reviewed by Cursor Bugbot for commit 5934589. Bugbot is set up for automated code reviews on this repo.

serprex requested a review from amogiska May 26, 2026 02:09

serprex (Member) commented May 26, 2026

does this need corresponding change in clickgres-platform? eg for query_id being Int64 vs String

can clickgres-platform submodule pg_stat_ch for schema?

description talks of cutover, but seems that's not too necessary since CH exporter not being used in prod

can we add test that does both CH & arrow? potentially one-after-the-other. to test schema compatibility

JoshDreamland and others added 3 commits June 8, 2026 18:15
In-place patch of docker/init/00-schema.sql, the CH-native exporter, and
the TAP tests so the docker quickstart schema aligns with what prod
actually writes to (datagres_otel.query_logs_arrow in clickgres-platform).
This is the pre-cutover unification: pg_stat_ch's CH-native path was
previously isolated from prod, and the two schemas had drifted apart on
both column naming and types.

Column renames (prod-side naming wins; closer to OTel semantic
conventions and minimizes downstream churn):
  ts_start    -> ts
  db          -> db_name
  username    -> db_user
  cmd_type    -> db_operation
  query       -> query_text

Type fix:
  err_sqlstate FixedString(5) -> LowCardinality(String)
    FixedString does not round-trip through Arrow IPC cleanly, and ~270
    SQLSTATE codes are dictionary-friendly. The CH-native exporter is
    updated to write the column via TagString (clickhouse-cpp's
    ColumnString -> CH LowCardinality(String) is fine on the wire).

Envelope columns added (with DEFAULT '' so the CH-native exporter, which
does not yet emit these, continues to insert successfully):
  instance_ubid, server_ubid, server_role, region, cell,
  service_version, host_id, pod_name

Engine/partitioning aligned with prod:
  ORDER BY ts -> ORDER BY (instance_ubid, ts)   (tenant locality)
  TTL added: toDate(ts) + INTERVAL 180 DAY
  SETTINGS index_granularity = 8192, ttl_only_drop_parts = 1

Materialized views (events_recent_1h, query_stats_5m, db_app_user_1m,
errors_recent) updated to reference the new column names and to include
instance_ubid in their ORDER BY / GROUP BY / SELECT projections so they
remain consistent with the events_raw partitioning strategy.

Test fixtures updated to query the new column names:
  t/010_clickhouse_export.pl, t/012_timing_accuracy.pl,
  t/021_cmd_type_counts.pl, t/027_query_normalization.pl,
  t/031_normalize_cache.pl

parent_query_id is intentionally NOT included here — it's the subject of
PR #95 (parent-query-id-surgical) and lands as its own follow-up
migration after this PR.

Validated end-to-end: docker/init/00-schema.sql applies cleanly on
clickhouse/clickhouse-server:26.1 (the version pinned in
docker/docker-compose.test.yml); INSERTs that omit the envelope columns
fill them via DEFAULT ''; all 4 MVs build. CI will run the TAP suite.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Mechanical move + goose annotations. The pg_stat_ch ClickHouse schema
that was previously the docker quickstart init script becomes the first
real Goose migration under schema/migrations/, matching
clickgres-platform's runner layout (pressly/goose v3,
DialectClickHouse, embed.FS).

Changes to the content of the moved file:
  * Header banner rewritten from "CANONICAL SCHEMA REFERENCE / single
    source of truth / dual role as docker init" to "initial migration"
    framing.
  * Added -- +goose Up / -- +goose Down section markers.
  * Each CREATE DATABASE / CREATE TABLE / CREATE MATERIALIZED VIEW
    wrapped in -- +goose StatementBegin / StatementEnd so goose's
    parser handles the multi-statement bodies correctly.
  * Removed the pre-CREATE "DROP TABLE IF EXISTS X" idioms — those
    existed to make the docker init script idempotent on container
    restart, but goose tracks state via goose_db_version. Drops now
    live exclusively in the -- +goose Down section in reverse
    dependency order.

The schema content itself (column names, types, MV definitions,
ORDER BY / TTL / SETTINGS) is unchanged from the previous commit.
Git rename detection should follow docker/init/00-schema.sql ->
schema/migrations/20260519000001_create_initial_schema.sql.

Also adds schema/migrations/00000000000001_bootstrap.sql, a no-op
SELECT 1 migration required by goose to seed the goose_db_version
table (copied verbatim from clickgres-platform's bootstrap).

Validated end-to-end against clickhouse/clickhouse-server:26.1:
pressly goose v3.27.1 `up` and `reset` round-trip cleanly. All 51
columns and 4 MVs land with the expected types.

Note: this leaves docker/init/ empty. The docker-compose mounts will
need updating in a follow-on PR to point at schema/migrations/ (which
requires a small shim to invoke goose-up at container start, since
clickhouse-server's docker entrypoint cannot parse goose's
StatementBegin/End directives directly).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…client

The previous "Initialize ClickHouse schema" step ran
`clickhouse-client --multiquery < docker/init/00-schema.sql`. That file
moved in the previous commit; pointing the step at the new location
without further changes would not work because clickhouse-client cannot
parse goose -- +goose Up/Down/StatementBegin/End directives, and would
execute the Down section's DROP statements right after the Up section's
CREATEs.

Switch the step to install pressly/goose v3.27.1 (~5 sec on Ubuntu CI
runners which have Go preinstalled) and apply the migrations from
schema/migrations/ via `goose ... up` against the running CH container.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
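
The commit above explains why the annotated migration cannot simply be piped into clickhouse-client: the `-- +goose` directives are plain SQL comments to it, so the Down section's DROPs would run right after the Up section's CREATEs. The sketch below illustrates the section handling a goose-aware runner needs. It is a simplified illustration, not goose's actual parser, and `ExtractUpStatements` is a hypothetical name.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Collects only the statements that sit inside the "-- +goose Up" section
// and are wrapped in StatementBegin / StatementEnd. Everything in the
// "-- +goose Down" section is skipped.
std::vector<std::string> ExtractUpStatements(const std::string& migration) {
  std::vector<std::string> statements;
  std::istringstream in(migration);
  std::string line;
  std::string current;
  bool in_up = false;
  bool in_statement = false;
  while (std::getline(in, line)) {
    if (line.rfind("-- +goose Up", 0) == 0) { in_up = true; continue; }
    if (line.rfind("-- +goose Down", 0) == 0) { in_up = false; continue; }
    if (line.rfind("-- +goose StatementBegin", 0) == 0) {
      in_statement = in_up;  // only record statements in the Up section
      current.clear();
      continue;
    }
    if (line.rfind("-- +goose StatementEnd", 0) == 0) {
      if (in_statement) statements.push_back(current);
      in_statement = false;
      continue;
    }
    if (in_statement) {
      current += line;
      current += '\n';
    }
  }
  return statements;
}
```

A tool that ignores the section markers sees both halves of the file as one script, which is the failure mode the CI change avoids by running goose itself.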
JoshDreamland and others added 2 commits June 8, 2026 18:22
Pre-add the column being introduced on both sides of the Arrow pipeline
to keep the unified events_raw schema wire-compatible once the cutover
happens:

  * pg_stat_ch PR #107 — adds read_replica_type to the Arrow IPC output
    in arrow_batch.cc (dictionary-encoded, populated from the
    pg_stat_ch.extra_attributes GUC).
  * clickgres-platform PR #448 — adds the matching ALTER TABLE on
    query_logs_arrow (LowCardinality(String) DEFAULT 'none' AFTER
    server_role) and promotes read-replica traffic into query_logs via
    a widened MV filter.

Mirroring the same type, default, and position here means the eventual
cutover from query_logs_arrow to events_raw needs zero further schema
changes — and lets PR #107 rebase onto unified_schema without having
to amend the schema migration.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
JoshDreamland marked this pull request as ready for review June 9, 2026 21:42
Copilot AI review requested due to automatic review settings June 9, 2026 21:42
Comment thread .github/workflows/ci-tap.yml

Copilot AI left a comment

Pull request overview

This PR unifies pg_stat_ch’s ClickHouse events_raw schema with the production Arrow receiver shape (column names/types, ordering/TTL, and MV definitions), and makes schema/migrations/ (pressly/goose layout) the canonical, CI-applied schema source. It also updates the ClickHouse exporter + TAP suite queries to use the unified column names.

Changes:

  • Moved the canonical ClickHouse schema into Goose migrations (schema/migrations/), including a bootstrap migration and an initial schema migration (events_raw + 4 MVs).
  • Updated exporters/tests to the unified column names (ts, db_name, db_user, db_operation, query_text) and changed err_sqlstate to LowCardinality(String) (exporter uses TagString).
  • Updated TAP CI to install and run goose migrations against the test ClickHouse container.

Reviewed changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| t/034_query_intern_oom_export.pl | Updates ClickHouse queries to query_text and related fields for OOM export assertions. |
| t/031_normalize_cache.pl | Updates ClickHouse query_text usage and ordering by ts. |
| t/027_query_normalization.pl | Updates ClickHouse queries to query_text and ordering by ts. |
| t/021_cmd_type_counts.pl | Switches aggregation queries to group by db_operation. |
| t/013_clickhouse_tls.pl | Updates ClickHouse query checks to query_text. |
| t/012_timing_accuracy.pl | Updates ClickHouse filters/order-by to query_text/ts. |
| t/010_clickhouse_export.pl | Updates ClickHouse validation queries to query_text, db_name, db_operation. |
| src/export/stats_exporter.cc | Emits ts and uses TagString("err_sqlstate") with string conversion. |
| src/export/exporter_interface.h | Updates semantic column comments/examples to unified names. |
| src/export/clickhouse_exporter.cc | Maps semantic columns to db_name, db_user, db_operation, query_text. |
| schema/migrations/20260519000001_create_initial_schema.sql | Adds Goose “initial schema” migration with unified table/MVs and prod-aligned ordering/TTL/settings. |
| schema/migrations/00000000000001_bootstrap.sql | Adds Goose bootstrap migration (SELECT 1). |
| docker/init/.gitkeep | Leaves docker/init/ as an empty placeholder for legacy compose bind mounts. |
| .github/workflows/ci-tap.yml | Installs goose and applies migrations to initialize ClickHouse schema in CI. |

Comment on lines +104 to +111
+      - name: Install goose
+        run: go install github.com/pressly/goose/v3/cmd/goose@v3.27.1
+
       - name: Initialize ClickHouse schema
         run: |
-          docker exec psch-clickhouse clickhouse-client --multiquery < docker/init/00-schema.sql
+          docker exec psch-clickhouse clickhouse-client -q "CREATE DATABASE IF NOT EXISTS pg_stat_ch"
+          "$HOME/go/bin/goose" -dir schema/migrations \
+            clickhouse "tcp://localhost:19000?database=pg_stat_ch" up
Comment thread t/034_query_intern_oom_export.pl
Comment thread docker/init/.gitkeep
Comment on lines +1 to +5
# Reserved for legacy docker/quickstart bind mounts. The canonical CH schema
# now lives in schema/migrations/ and is applied via goose (see CI workflow).
# This directory is intentionally empty; the .gitkeep keeps the bind mount
# in docker/docker-compose.test.yml and docker/quickstart/docker-compose.yml
# from failing on a missing host path.
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings June 9, 2026 22:51
… $HOME/go/bin

Per Copilot review on #99: $HOME/go/bin assumes GOPATH=$HOME/go, which is the
Go default but not guaranteed (CI runners can configure GOPATH elsewhere, and
GOBIN can override the install location entirely). Resolve the actual path via
`go env GOPATH` and add it to $GITHUB_PATH so subsequent steps can just call
`goose` without a path prefix.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Copilot AI left a comment

Pull request overview

Copilot reviewed 14 out of 14 changed files in this pull request and generated 2 comments.

Comments suppressed due to low confidence (1)

schema/migrations/20260519000001_create_initial_schema.sql:196

  • The comment suggests DEFAULT '' is required for ClickHouse inserts to succeed, but the ClickHouse exporter builds INSERT INTO events_raw (<col list>) (see src/export/clickhouse_exporter.cc), so omitted envelope columns will already use their DEFAULT/type defaults. Updating the comment avoids implying a stricter requirement than actually exists.

Comment thread src/export/stats_exporter.cc Outdated
Comment thread t/021_cmd_type_counts.pl
Comment on lines +31 to 35
+# Helper: parse db_operation counts from ClickHouse
 sub get_cmd_type_counts {
     my $result = psch_query_clickhouse(
-        "SELECT cmd_type, count() FROM pg_stat_ch.events_raw GROUP BY cmd_type FORMAT TabSeparated"
+        "SELECT db_operation, count() FROM pg_stat_ch.events_raw GROUP BY db_operation FORMAT TabSeparated"
     );
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings June 9, 2026 22:59
Per Copilot review on #99: identifiers carrying the old column names
through Perl helpers and C++ locals were misleading after the column
renames in commit 1. Updating them so failure messages and stack traces
reflect what the code now actually does.

  t/021_cmd_type_counts.pl:
    sub get_cmd_type_counts -> sub get_db_operation_counts
                               (+ 4 call sites)
    node name 'cmd_type_counts' -> 'db_operation_counts'

  src/export/stats_exporter.cc:
    col_db        -> col_db_name
    col_username  -> col_db_user
    col_cmd_type  -> col_db_operation
    col_query     -> col_query_text

File name t/021_cmd_type_counts.pl is left as-is (per earlier decision -
file rename is more churn than it's worth pre-GA; the TAP summary report
will read fine).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes using default effort and found 1 potential issue.

Reviewed by Cursor Bugbot for commit b8282dc.

Comment thread .github/workflows/ci-tap.yml

Copilot AI left a comment

Pull request overview

Copilot reviewed 14 out of 14 changed files in this pull request and generated 3 comments.

Comments suppressed due to low confidence (1)

schema/migrations/20260519000001_create_initial_schema.sql:196

  • The envelope header comment says these resource-attribute columns default to '', but read_replica_type actually defaults to 'none'. Tweaking the comment avoids documenting a default that the schema doesn’t implement.

Comment thread t/021_cmd_type_counts.pl
 );

-# Helper: parse cmd_type counts from ClickHouse
+# Helper: parse db_operation counts from ClickHouse
Comment on lines +147 to +149
# - every intern_failed row still carries duration_us, db_name, and
# db_operation — the numeric/identity telemetry the customer relies on
# for slow-query analysis even when SQL text is unavailable.
Copilot's autofix in 12ab775 shifted the line break from before
std::string( to after the opening paren while changing
strnlen(..., sizeof(...) - 1). The new wrap position trips
clang-format on the column-limit rule. Move the break back to where it
was originally (before std::string), keep the sizeof - 1 fix.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
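
The `strnlen(..., sizeof(...) - 1)` pattern mentioned in the commit above can be sketched as follows. The `Event` struct and its 6-byte `err_sqlstate` field (5 SQLSTATE characters plus a terminator) are assumptions for illustration; the actual layout in stats_exporter.cc is not shown in this thread.

```cpp
#include <string.h>  // strnlen (POSIX)
#include <string>

// Hypothetical event layout: 5 SQLSTATE characters plus a NUL terminator.
struct Event {
  char err_sqlstate[6];
};

// Converts the fixed-width buffer to a std::string for a String-typed column.
// Capping the scan at sizeof - 1 means at most 5 characters are copied, even
// if the terminator is missing from the buffer.
inline std::string SqlstateToString(const Event& ev) {
  return std::string(ev.err_sqlstate,
                     strnlen(ev.err_sqlstate, sizeof(ev.err_sqlstate) - 1));
}
```

An all-zero buffer yields an empty string, which is one plausible way a "no error" row could be represented in a LowCardinality(String) column.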
Copilot AI review requested due to automatic review settings June 9, 2026 23:05

Copilot AI left a comment

Pull request overview

Copilot reviewed 14 out of 14 changed files in this pull request and generated 1 comment.

Comments suppressed due to low confidence (1)

schema/migrations/20260519000001_create_initial_schema.sql:196

  • The envelope section comment says attributes default to '' so the CH-native exporter can omit them, but read_replica_type defaults to 'none'. This makes the comment inaccurate/misleading for anyone editing defaults later.

Comment on lines +104 to +108
+      - name: Install goose
+        run: |
+          go install github.com/pressly/goose/v3/cmd/goose@v3.27.1
+          echo "$(go env GOPATH)/bin" >> "$GITHUB_PATH"

JoshDreamland merged commit c5aaa87 into main Jun 10, 2026
13 checks passed
JoshDreamland deleted the unified_schema branch June 10, 2026 15:32
JoshDreamland added a commit that referenced this pull request Jun 18, 2026
…FixedStringCol class

`MetricFixedString` was added in Feb 2026 (commit 9f89f4) when err_sqlstate
was the only column in events_raw of type FixedString(5). It had exactly
one caller, exactly one width, and was always going to.

PR #99 (merged 2026-06-10) changed err_sqlstate from FixedString(5) to
LowCardinality(String) and updated the call site from
`exporter->MetricFixedString(5, "err_sqlstate")` to
`exporter->TagString("err_sqlstate")`. The interface method became
unreferenced.

PR #104 (merged 2026-06-04 — before #99) had to reproduce
MetricFixedString in clickhouse-c by hand-rolling a `FixedStringCol`
class, since clickhouse-c has no built-in FixedString column type
unlike clickhouse-cpp. That work was correct when authored — the
column was still live — but became orphaned a week later when #99
landed. Nobody noticed.

Deletes the now-dead virtual + both impls. The CH-native side loses
the 27-line FixedStringCol class entirely; the OTel side loses a
3-line forward to MakeSvCol; the interface loses one virtual
declaration. stats_exporter.cc already doesn't reference it.

No behavior change — this was unreachable code. -35 LOC.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
JoshDreamland added a commit that referenced this pull request Jun 18, 2026
…LC/StatHC

Consolidate the StatsExporter column-factory methods around cardinality intent.
The previous Tag/Metric/Record split was a vestige of pre-OTel naming and didn't
actually map to anything semantically meaningful — every backend treated
MetricInt64 and RecordInt64 identically. Replace with StatLC* (low-cardinality,
dimension-eligible) and StatHC* (high-cardinality, value-only) variants per
type, plus a domain-specific StatTimestamp for the event timestamp.

Cardinality intent matters per-backend in ways the old naming concealed:

  ClickHouse: LC -> may be stored as LowCardinality(<Type>); HC -> plain.
              Schema-declared encoding wins on write; the LC hint helps
              the producer pick the cheapest column representation.
  Arrow IPC:  LC -> DictBuilder (dictionary-encoded array); HC -> plain
              typed builder. Required for batch-rate efficiency on
              low-cardinality dimensions. (Honored by the upcoming
              unified Arrow exporter; not yet exercised here.)
  OTel:       LC -> eligible as histogram dimension or metric label;
              HC -> log attribute only, *never* a metric dimension
              (cardinality explosion).

Interface shrinks from 14 column factories (TagString + 5 Metric* + 5 Record* +
RecordDateTime + RecordString + MetricFixedString) to 8 (4 LC + 3 HC + Timestamp),
keeping only the wire types stats_exporter.cc actually instantiates. Future
column types (Int8, UInt16, UInt32) can be added when their first caller appears.

Removed:
  * MetricFixedString — dead since PR #99 retired FixedString(5) for err_sqlstate
    in favor of LowCardinality(String).
  * FixedStringCol class in clickhouse_exporter.cc — only used by MetricFixedString.

Rename map applied at the only call site (ExportEventStatsInternal in
stats_exporter.cc), with cardinality chosen per column based on observed data
shape rather than just type width: err_elevel (UInt8) is LC; query_id (Int64)
is HC; parallel_workers_* (Int16) is LC; duration_us (UInt64) is HC. The Db*
semantic shortcuts remain.

Net diff: -30 LOC, no behavior change. Two doc-comments updated in t/024 and
t/psch.pm. The OTel column-emission machinery itself stays alive for now — it
gets retired in a later commit alongside the new unified Arrow exporter that
will satisfy ExportEventStats for the OTel path going forward.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
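
The per-backend cardinality mapping described in the commit above can be sketched as a hint that each backend interprets in its own way. The enum and function names here are hypothetical; the real StatLC*/StatHC* signatures are not shown in this thread.

```cpp
#include <string>

// Hypothetical cardinality hint carried by each column factory.
enum class Cardinality { kLow, kHigh };

// ClickHouse side: a low-cardinality hint may map to LowCardinality(<Type>),
// while a high-cardinality column stays plain. Per the commit message, the
// schema-declared encoding still wins on write, so this is only a hint.
inline std::string ClickHouseColumnType(const std::string& base_type,
                                        Cardinality c) {
  return c == Cardinality::kLow ? "LowCardinality(" + base_type + ")"
                                : base_type;
}

// OTel side: only low-cardinality columns are eligible as metric dimensions;
// high-cardinality columns are emitted as log attributes only.
inline bool EligibleAsMetricDimension(Cardinality c) {
  return c == Cardinality::kLow;
}
```

Under this sketch, err_sqlstate would carry the low-cardinality hint and query_id the high-cardinality one, matching the choices listed in the commit message.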
JoshDreamland added a commit that referenced this pull request Jun 18, 2026
…LC/StatHC (#115)

Consolidate the StatsExporter column-factory methods around cardinality intent.
The previous Tag/Metric/Record split was a vestige of pre-OTel naming and didn't
actually map to anything semantically meaningful — every backend treated
MetricInt64 and RecordInt64 identically. Replace with StatLC* (low-cardinality,
dimension-eligible) and StatHC* (high-cardinality, value-only) variants per
type, plus a domain-specific StatTimestamp for the event timestamp.

Cardinality intent matters per-backend in ways the old naming concealed:

  ClickHouse: LC -> may be stored as LowCardinality(<Type>); HC -> plain.
              Schema-declared encoding wins on write; the LC hint helps
              the producer pick the cheapest column representation.
  Arrow IPC:  LC -> DictBuilder (dictionary-encoded array); HC -> plain
              typed builder. Required for batch-rate efficiency on
              low-cardinality dimensions. (Honored by the upcoming
              unified Arrow exporter; not yet exercised here.)
  OTel:       LC -> eligible as histogram dimension or metric label;
              HC -> log attribute only, *never* a metric dimension
              (cardinality explosion).
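
As a rough illustration of why the LC/HC hint matters for the Arrow and ClickHouse rows above, here is a minimal pure-Python sketch of dictionary encoding. It is not the Arrow DictBuilder or the ClickHouse LowCardinality implementation; the function name `dictionary_encode` and the sample values are hypothetical.

```python
from typing import Hashable, List, Sequence, Tuple


def dictionary_encode(values: Sequence[Hashable]) -> Tuple[List[Hashable], List[int]]:
    """Return (dictionary, indices) such that dictionary[indices[i]] == values[i]."""
    dictionary: List[Hashable] = []
    positions = {}  # value -> its index in `dictionary`
    indices: List[int] = []
    for value in values:
        if value not in positions:
            positions[value] = len(dictionary)
            dictionary.append(value)
        indices.append(positions[value])
    return dictionary, indices


# Low-cardinality column (hypothetical sample): few distinct values, so the
# dictionary stays tiny and each row costs only a small integer index.
db_operation = ["SELECT", "INSERT", "SELECT", "SELECT", "UPDATE", "INSERT"]
lc_dict, lc_idx = dictionary_encode(db_operation)

# High-cardinality column (hypothetical sample): every value is distinct, so
# the dictionary is as large as the column and the indices are pure overhead.
query_id = [101, 202, 303, 404, 505, 606]
hc_dict, hc_idx = dictionary_encode(query_id)
```

With these inputs the LC column collapses to a three-entry dictionary, while the HC column's dictionary is as long as the column itself, which is the case the HC variants are meant to avoid.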

Interface shrinks from 14 column factories (TagString + 5 Metric* + 5 Record* +
RecordDateTime + RecordString + MetricFixedString) to 8 (4 LC + 3 HC + Timestamp),
keeping only the wire types stats_exporter.cc actually instantiates. Future
column types (Int8, UInt16, UInt32) can be added when their first caller appears.

Removed:
  * MetricFixedString — dead since PR #99 retired FixedString(5) for err_sqlstate
    in favor of LowCardinality(String).
  * FixedStringCol class in clickhouse_exporter.cc — only used by MetricFixedString.

Rename map applied at the only call site (ExportEventStatsInternal in
stats_exporter.cc), with cardinality chosen per column based on observed data
shape rather than just type width: err_elevel (UInt8) is LC; query_id (Int64)
is HC; parallel_workers_* (Int16) is LC; duration_us (UInt64) is HC. The Db*
semantic shortcuts remain.

Net diff: -30 LOC, no behavior change. Two doc-comments updated in t/024 and
t/psch.pm. The OTel column-emission machinery itself stays alive for now — it
gets retired in a later commit alongside the new unified Arrow exporter that
will satisfy ExportEventStats for the OTel path going forward.

Note that the `Tag...` methods were also renamed in this commit to match
the other column factories, because they have silently behaved like
everything else since #72.

---

Two unrelated doc fixes flagged by Copilot's review on #115:

  exporter_interface.h: StatTimestamp's doc said "Postgres-epoch
  microsecond timestamp" but all current callers convert to Unix-epoch
  by adding kPostgresEpochOffsetUs before append. CH DateTime64(6) and
  OTel time_unix_nano both interpret the wire value as Unix-epoch.
  Clarify the contract to describe what the column wants on the wire,
  not what shape the input data happens to be in — defends against a
  future caller passing raw PG-epoch values and getting wrong-by-30-
  years timestamps.

  t/024_otel_export.pl: comment on subtest 'metric labels populated'
  described producer-side metric promotion that hasn't existed since
  PR #72 ripped the OTel SDK out. The producer-side OTel exporter
  emits OTLP log attributes only; the test's Prometheus assertions
  succeed because the downstream OTel collector's log-to-metric
  processor promotes specific log attributes to histogram labels.
  Reword to describe the actual flow.
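
To make the StatTimestamp contract above concrete, here is a small sketch of the epoch conversion. It assumes the offset is the number of microseconds between 1970-01-01 and 2000-01-01 UTC; the Python names are hypothetical stand-ins for kPostgresEpochOffsetUs and its callers, and the sample event time is made up.

```python
from datetime import datetime, timedelta, timezone

UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
POSTGRES_EPOCH = datetime(2000, 1, 1, tzinfo=timezone.utc)

# Microseconds between the two epochs (assumed to correspond to the
# kPostgresEpochOffsetUs constant named in the commit message).
POSTGRES_EPOCH_OFFSET_US = (POSTGRES_EPOCH - UNIX_EPOCH) // timedelta(microseconds=1)


def pg_to_unix_us(pg_epoch_us: int) -> int:
    """Convert Postgres-epoch microseconds to Unix-epoch microseconds."""
    return pg_epoch_us + POSTGRES_EPOCH_OFFSET_US


def unix_us_to_datetime(unix_us: int) -> datetime:
    """Interpret a value the way a Unix-epoch consumer would."""
    return UNIX_EPOCH + timedelta(microseconds=unix_us)


# Hypothetical event time, expressed as Postgres-epoch microseconds.
event_time = datetime(2026, 6, 18, tzinfo=timezone.utc)
pg_epoch_us = (event_time - POSTGRES_EPOCH) // timedelta(microseconds=1)

correct = unix_us_to_datetime(pg_to_unix_us(pg_epoch_us))  # offset applied
wrong = unix_us_to_datetime(pg_epoch_us)                   # raw PG-epoch value
```

Here `correct` round-trips to the original event time, while `wrong` lands 30 years earlier, which is the failure mode the reworded doc comment guards against.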

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
JoshDreamland added a commit that referenced this pull request Jun 18, 2026
…_raw

Closes the test gap that's existed since the Arrow path went live: no
existing test proves that pg_stat_ch's Arrow IPC output can actually be
ingested by ClickHouse against the unified events_raw schema. t/026
asserts on the IPC schema shape via pyarrow but never pushes the bytes
into CH; t/010 etc. exercise the CH-native Block path, not Arrow.

The new test wires the full producer-to-CH chain locally, bypassing
the OTel collector + receiver service entirely:

  1. Spin up a node with use_unified_arrow_exporter=on +
     debug_arrow_dump_dir set, an OTel endpoint that doesn't resolve so
     gRPC send fails — MaybeDumpArrowBatch fires BEFORE send so IPC
     files land on disk regardless.
  2. Run a deliberately-shaped workload (SELECT, CREATE, INSERT,
     SELECT count, DROP — five distinct statements).
  3. Force pg_stat_ch_flush(), wait for IPC files in $dump_dir.
  4. TRUNCATE pg_stat_ch.events_raw, then for each IPC file:
       curl -X POST --data-binary @$f \
         'http://localhost:18123/?query=INSERT INTO pg_stat_ch.events_raw FORMAT ArrowStream'
     A type mismatch on the wire (e.g. if the producer regressed to
     writing query_id as String) would surface here as a 4xx with a
     clear error rather than silently corrupting data.
  5. SELECT count() FROM events_raw, assert >= 5 rows.
  6. Pull system.columns and assert each id/counter column has the
     declared type from PR #99's schema (no silent string-typed regressions).
  7. Pinpoint the marker SELECT row and assert db_name/db_operation/
     query_text values match what we sent.
  8. Assert envelope columns (instance_ubid, server_role, region, cell,
     read_replica_type) carry the values from pg_stat_ch.extra_attributes.
  9. Assert parent_query_id is 0 across all rows (synthesized by the
     exporter until PR #95 lands).
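
Step 4 above can also be sketched without curl. The following is a minimal stdlib-only illustration of building the same POST request; the helper names are hypothetical, and the default URL and table are simply the ones quoted in the curl command.

```python
import urllib.parse
import urllib.request


def build_arrow_insert_request(
    ipc_bytes: bytes,
    base_url: str = "http://localhost:18123/",
    table: str = "pg_stat_ch.events_raw",
) -> urllib.request.Request:
    """Build a POST that asks ClickHouse to ingest an Arrow IPC stream body."""
    query = f"INSERT INTO {table} FORMAT ArrowStream"
    # Percent-encode spaces as %20 rather than '+'.
    params = urllib.parse.urlencode({"query": query}, quote_via=urllib.parse.quote)
    return urllib.request.Request(f"{base_url}?{params}", data=ipc_bytes, method="POST")


def post_ipc_file(path: str) -> int:
    """Send one dumped IPC file and return the HTTP status code.

    urlopen raises urllib.error.HTTPError on 4xx/5xx responses, so a wire
    type mismatch would surface as an exception rather than a return value.
    """
    with open(path, "rb") as f:
        request = build_arrow_insert_request(f.read())
    with urllib.request.urlopen(request) as response:
        return response.status
```

Keeping request construction separate from sending makes the URL and body easy to inspect without a running ClickHouse container.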

Skips cleanly when Docker / the test CH container / the events_raw
schema aren't available — same patterns as t/010, t/013, t/021.

The "no OTel collector required" property makes this test purely a
producer⇄CH wire-format check. The clickgres-platform Go receiver is
not exercised here: to verify that the bytes match the schema, a curl
invocation is the simplest way to POST an Arrow IPC body to CH. The
receiver's only added value over curl in prod is OTel-collector
pipeline integration, which doesn't matter for wire-format correctness.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
JoshDreamland added a commit that referenced this pull request Jun 18, 2026
…to-end

Introduce a new exporter that builds Arrow IPC RecordBatches through the
typed StatsExporter column-factory interface (StatLC/StatHC/StatTimestamp)
instead of the open-coded ArrowBatchBuilder used by arrow_batch.cc.
Composition over inheritance: the new exporter holds an OTelExporter for
gRPC transport (SendArrowBatch) but doesn't extend it, so the per-row
LogRecord state machine in OTelExporter — which is unused on this path
post-PR-#72 — stays out of scope.

Wire shape targets events_raw (the unified schema authored in PR #99),
not the legacy query_logs_arrow:
  * query_id, parent_query_id: Int64 (no sprintf decimal-string encoding)
  * pid: Int32
  * err_elevel: UInt8
  * buffer counters (shared/local/temp_blks_*, *_blk_*_time_us,
    wal_*, cpu_*_time_us): Int64
  * parallel_workers_planned/launched: Int16
  * jit_*: Int32
  * LC strings (db_*, err_sqlstate, app, server_role, region, cell,
    service_version, read_replica_type) -> DictionaryUtf8
  * HC strings (query_text, err_message, client_addr, instance_ubid,
    server_ubid, host_id, pod_name) -> plain utf8
  * ts: arrow::timestamp(MICRO, "UTC") matching DateTime64(6, 'UTC')
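
One way to guard this wire shape against regressions is to compare the types ClickHouse reports for the table with the types listed above. The sketch below assumes the ClickHouse column types mirror the Arrow wire types named in this message; the expected-type table covers only a few columns, and the function name is hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# Expected ClickHouse types for a subset of columns, taken from the list
# above (assumption: the declared CH type matches the Arrow wire type).
EXPECTED_TYPES: Dict[str, str] = {
    "query_id": "Int64",
    "parent_query_id": "Int64",
    "pid": "Int32",
    "err_elevel": "UInt8",
    "parallel_workers_planned": "Int16",
    "parallel_workers_launched": "Int16",
    "err_sqlstate": "LowCardinality(String)",
    "ts": "DateTime64(6, 'UTC')",
}


def diff_column_types(
    actual: Dict[str, str],
    expected: Dict[str, str] = EXPECTED_TYPES,
) -> List[Tuple[str, str, Optional[str]]]:
    """Return (column, expected_type, actual_type) for every mismatch.

    `actual` maps column name to type string, e.g. as read from
    system.columns; a missing column is reported with actual_type None.
    """
    mismatches = []
    for name, want in expected.items():
        got = actual.get(name)
        if got != want:
            mismatches.append((name, want, got))
    return mismatches
```

An empty result means every checked column has the expected type; a producer that regressed query_id to a string would show up as a single mismatch tuple.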

Column<T> wrappers are nested private types inside OTelArrowExporter
(not at namespace scope) so they can inherit from the protected
Column<T> base — same convention OTelExporter and ClickHouseExporter use
for their own column types.

Columns the caller doesn't explicitly populate are synthesized in
BeginRow by the exporter itself, so stats_exporter.cc's column-emission
loop stays unchanged:
  * parent_query_id (hardcoded 0 until PR #95 lands and PschEvent carries
    the field — events_raw requires the column on every insert, no DEFAULT)
  * 8 envelope columns from pg_stat_ch.extra_attributes: instance_ubid,
    server_ubid, server_role, region, cell, host_id, and pod_name, plus
    read_replica_type (default 'none' if extra_attributes didn't supply it)
  * service_version pinned to the compile-time PG_STAT_CH_VERSION macro
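
The synthesized columns listed above can be sketched as a single defaulting step. This is an illustration only: it assumes extra_attributes is already parsed into a dict and that a missing envelope key falls back to an empty string; the function name and sample values are hypothetical, and the real BeginRow logic may differ.

```python
from typing import Dict, Union

# Envelope keys read from extra_attributes, per the list above.
ENVELOPE_KEYS = (
    "instance_ubid",
    "server_ubid",
    "server_role",
    "region",
    "cell",
    "host_id",
    "pod_name",
)


def synthesize_columns(
    extra_attributes: Dict[str, str],
    service_version: str,
) -> Dict[str, Union[int, str]]:
    """Fill the columns the caller never populates explicitly."""
    row: Dict[str, Union[int, str]] = {"parent_query_id": 0}  # placeholder value
    for key in ENVELOPE_KEYS:
        # Assumption: an absent key becomes an empty string.
        row[key] = extra_attributes.get(key, "")
    row["read_replica_type"] = extra_attributes.get("read_replica_type", "none")
    row["service_version"] = service_version
    return row


# Hypothetical attribute values, for illustration only.
example = synthesize_columns({"region": "us-east-1", "server_role": "primary"}, "1.2.3")
```

Centralizing the defaults this way is what lets the column-emission loop in the caller stay unchanged.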

This commit only adds the exporter file (no dispatcher wiring yet) —
the next commit adds the GUC and routes batches through it when on.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>