Feature Description
Extend the EventStats Lambda with a compiled queries endpoint — a curated list of predefined, server-executed queries (e.g. "failed jobs in the last 7 days", "job count aggregated by catalog") that consumers can invoke by name. Rather than pushing aggregation logic into the client or BI tool, the heavy lifting runs inside PostgreSQL and only the final result set is returned over the wire.
Problem / Opportunity
The current POST /stats/{topic_name} endpoint returns raw paginated rows. Consumers (dashboards, reports, downstream tools) re-derive the same aggregations repeatedly on the client side — wasting bandwidth, duplicating logic, and making results inconsistent across consumers. Predefined server-side queries centralise that logic, reduce payload sizes significantly, and allow PostgreSQL query planning to optimise repeated patterns.
Acceptance Criteria
- A new route (e.g.
POST /stats/{topic_name}/query/{query_name}) accepts a query identifier and an optional parameter bag (e.g. time window, tenant filter).
- A
SUPPORTED_QUERIES registry maps each query_name to its SQL template and accepted parameters — unknown names return 400.
- Queries are executed server-side; only the aggregated result is returned (no raw row streaming).
- The endpoint is protected by the same JWT auth and per-topic ACL as the rest of EventStats.
- Unknown or unsupported
query_name values produce a clear 400 error, not a 500.
- Unit tests cover routing, registry lookup, and each query's result shaping.
- Integration tests validate at least one aggregation query end-to-end against a seeded testcontainer database.
- All quality gates pass (
./ci_local.sh).
Note for implementer: the concrete set of queries and their SQL definitions must be identified and agreed upon as part of this issue's implementation. The examples below are starting points only — validate with stakeholders which aggregations are actually needed before writing SQL.
Proposed Solution
Introduce a CompiledQueryRegistry (or extend ReaderPostgres) that maps query names to parameterised SQL templates (using psycopg2 %s / sql.SQL composition — never string interpolation). Each entry declares its accepted input parameters and output schema.
Example candidate queries to evaluate with stakeholders:
failed_jobs_last_7d — count and list of jobs with a failure status in the last N days, grouped by pipeline/tenant.
aggregation_by_catalog — job count, success rate, and average elapsed time grouped by catalog/source.
run_status_summary — distribution of run statuses (running, completed, failed) over a configurable time window.
Route dispatch follows the existing ROUTE_HANDLERS pattern in event_stats_lambda.py. A new HandlerCompiledQuery (or an extension of HandlerStats) handles validation and delegates to the registry.
Dependencies / Related
Feature Description
Extend the EventStats Lambda with a compiled queries endpoint — a curated list of predefined, server-executed queries (e.g. "failed jobs in the last 7 days", "job count aggregated by catalog") that consumers can invoke by name. Rather than pushing aggregation logic into the client or BI tool, the heavy lifting runs inside PostgreSQL and only the final result set is returned over the wire.
Problem / Opportunity
The current
POST /stats/{topic_name}endpoint returns raw paginated rows. Consumers (dashboards, reports, downstream tools) re-derive the same aggregations repeatedly on the client side — wasting bandwidth, duplicating logic, and making results inconsistent across consumers. Predefined server-side queries centralise that logic, reduce payload sizes significantly, and allow PostgreSQL query planning to optimise repeated patterns.Acceptance Criteria
POST /stats/{topic_name}/query/{query_name}) accepts a query identifier and an optional parameter bag (e.g. time window, tenant filter).SUPPORTED_QUERIESregistry maps eachquery_nameto its SQL template and accepted parameters — unknown names return400.query_namevalues produce a clear400error, not a500../ci_local.sh).Proposed Solution
Introduce a
CompiledQueryRegistry(or extendReaderPostgres) that maps query names to parameterised SQL templates (usingpsycopg2%s/sql.SQLcomposition — never string interpolation). Each entry declares its accepted input parameters and output schema.Example candidate queries to evaluate with stakeholders:
failed_jobs_last_7d— count and list of jobs with a failure status in the last N days, grouped by pipeline/tenant.aggregation_by_catalog— job count, success rate, and average elapsed time grouped by catalog/source.run_status_summary— distribution of run statuses (running, completed, failed) over a configurable time window.Route dispatch follows the existing
ROUTE_HANDLERSpattern inevent_stats_lambda.py. A newHandlerCompiledQuery(or an extension ofHandlerStats) handles validation and delegates to the registry.Dependencies / Related