benchmark: revisit NOTICE-vs-pgss instrumentation choice (observer effect on DO-wrapped queries)

## Summary

`benchmark/` documentation (PR #66) explains:

> "Why NOTICE rather than pgss: DO-wrappers hide per-statement rows from `pg_stat_statements`..."

The choice is functional but has a real **observer effect** on the metric being measured: every `RAISE NOTICE` produces a server→client protocol message and, when `log_min_messages` is set to `notice` or more verbose (the default is `warning`), a server-side log write. Both add latency to the very queries being instrumented.

## Why this matters

Two compounding costs:

1. **Server→client protocol overhead.** `RAISE NOTICE` flushes a NoticeResponse on each invocation; the client (psql/pgbench/etc.) must read it. At high TPS this is non-trivial.
2. **Log subsystem write.** If `log_min_messages` is `notice` or more verbose (the default, `warning`, does not log notices), the notice also goes to whatever log destination is configured (related to issue #123). Even when log volume is "low," it's not zero.

Combined, `RAISE NOTICE`-based instrumentation has a higher overhead floor than `pg_stat_statements`-based measurement, which updates shared-memory counters and sends nothing extra to the client.
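
Which of the two costs applies on a given bench host depends on the message-level settings. A quick check, offered as a sketch (the defaults noted in the comments are stock PostgreSQL defaults, not values confirmed for the bench environment):

```sql
-- NOTICE is sent to the client when client_min_messages is notice or more verbose.
SHOW client_min_messages;   -- stock default: notice

-- NOTICE is written to the server log only when log_min_messages is
-- notice or more verbose.
SHOW log_min_messages;      -- stock default: warning

-- Where log writes go, if they happen at all.
SHOW log_destination;
```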

## Why DO blocks were used

DO blocks let arbitrary PL/pgSQL run without persisting a function definition. But with the default `pg_stat_statements.track = top`, `pg_stat_statements` records only the outer `DO` statement (or sometimes nothing, depending on version), not the statements executed inside it, so per-statement timings are lost.
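
For illustration, a minimal sketch of the pattern under discussion. The table name `bench_queue` and the message format are hypothetical, not taken from the PR #66 scripts; `total_exec_time` assumes PG 13+.

```sql
-- Hypothetical instrumented statement: time one inner query and report via NOTICE.
DO $$
DECLARE
    t0 timestamptz := clock_timestamp();
    n  bigint;
BEGIN
    SELECT count(*) INTO n FROM bench_queue;   -- the statement being measured
    RAISE NOTICE 'count=% elapsed_ms=%', n,
        round((extract(epoch FROM clock_timestamp() - t0) * 1000)::numeric, 3);
END
$$;

-- With pg_stat_statements.track = top, look for entries mentioning the table:
-- the inner SELECT does not get its own row.
SELECT query, calls, total_exec_time
FROM pg_stat_statements
WHERE query ILIKE '%bench_queue%'
ORDER BY calls DESC;
```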

## Alternatives for next bench cycle

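- **Replace DO blocks with named functions** for the hot-path measurement points. `pg_stat_statements` then aggregates calls and execution time for each `SELECT fn(...)` statement, with negligible observer effect at typical TPS.
- **Set `pg_stat_statements.track = all`** to record nested statements, including those executed inside DO blocks and functions; on PG 14+ the `toplevel` column of the view distinguishes nested entries from top-level ones. `pg_stat_kcache` can add OS-level CPU and I/O counters on top of `pg_stat_statements`.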
- **Accept DO + use `pg_ash` 1 Hz sampling** (already in the pgque bench stack), which has lower overhead than NOTICE for high-frequency events; reserve NOTICE for boundary events only.
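- **Switch to `auto_explain` with `auto_explain.log_min_duration` and `auto_explain.sample_rate`**, or to core statement sampling via `log_min_duration_sample` and `log_statement_sample_rate` (PG 13+): less per-query log volume but still useful for outlier diagnosis.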
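
A sketch of the first two alternatives combined, assuming `pg_stat_statements` is already loaded and the session runs as a superuser (needed to change `pg_stat_statements.track`). The function name `bench_count_queue` and table `bench_queue` are hypothetical; the `toplevel` column requires PG 14+.

```sql
-- Hypothetical named function replacing a DO-wrapped measurement point.
CREATE OR REPLACE FUNCTION bench_count_queue() RETURNS bigint
LANGUAGE plpgsql AS $$
DECLARE
    n bigint;
BEGIN
    SELECT count(*) INTO n FROM bench_queue;
    RETURN n;
END
$$;

-- Track nested statements as well as top-level ones for this session.
SET pg_stat_statements.track = 'all';

SELECT bench_count_queue();

-- The outer call appears with toplevel = true, the inner SELECT with toplevel = false.
SELECT query, calls, mean_exec_time, toplevel
FROM pg_stat_statements
WHERE query ILIKE '%bench_queue%'
ORDER BY toplevel DESC, calls DESC;
```

No `RAISE NOTICE` is needed here: timings come from the view after the run instead of from messages emitted during it.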

## Action

- **Do not block** PR #66: the bench numbers remain valid as measured, with the NOTICE overhead included.
- For the next bench cycle:
  - quantify the NOTICE overhead with a comparison run (NOTICE vs no-NOTICE for the same workload), then either move to functions+pgss or accept the overhead with a measured floor.
  - update the methodology doc with the comparison.
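
Before a full workload comparison, a rough server-side floor for the per-NOTICE cost can be estimated in a single session. This is a sketch only: `bench_notice_loop` is a hypothetical helper, the iteration count is arbitrary, and the timing excludes the client's cost of reading the messages.

```sql
-- Hypothetical helper: run a trivial statement `iters` times, raising one
-- NOTICE per iteration, and return the elapsed wall-clock time in ms.
CREATE OR REPLACE FUNCTION bench_notice_loop(iters int) RETURNS numeric
LANGUAGE plpgsql AS $$
DECLARE
    t0 timestamptz := clock_timestamp();
BEGIN
    FOR i IN 1..iters LOOP
        PERFORM 1;
        RAISE NOTICE 'iter %', i;
    END LOOP;
    RETURN round((extract(epoch FROM clock_timestamp() - t0) * 1000)::numeric, 3);
END
$$;

-- Run 1: notices are delivered to the client.
SET client_min_messages = notice;
SELECT bench_notice_loop(10000) AS with_notice_ms;

-- Run 2: same code path, but notices are not sent to this session's client.
SET client_min_messages = warning;
SELECT bench_notice_loop(10000) AS without_notice_ms;

RESET client_min_messages;
```

The difference between the two results, divided by the iteration count, gives an approximate per-NOTICE cost for the protocol path. If `log_min_messages` is `notice` or more verbose on the host, both runs also pay the log write, so that component would need a separate run with logging adjusted.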

## Severity

LOW — methodology refinement, not a numerical correction. Joins the cluster with #123 (log I/O) and #124 (planner-cost framing) as bench-doc revisits for the next round.

## Related

- #123 (log I/O claim revisit)
- #124 (pgmq-partitioned planner-cost framing)
- PR #66
