1. Problem Statement
1.1 Today: validation is post-execution
In OpenSearch SQL/PPL, every form of feedback the user gets about query quality lands after the query has already been dispatched (or, in the editor, after the parser fails outright). There is no advisory layer that runs before execution to warn the user about queries that are syntactically valid but semantically suboptimal or wrong.
The current state, by stage:
| Stage | Where it runs | What it catches | What it misses |
|---|---|---|---|
| ANTLR parse | OSD Monaco worker + backend | Syntax errors only — rendered as `MarkerSeverity.Error` red squigglies | Anything past tokens/grammar |
| `SemanticCheckException` | Backend Analyzer, at execution time | Hard semantic failures (e.g., "can't resolve field X") | Soft mistakes that don't fail the resolver |
| Cluster execution | Backend executor | Engine-level failures, OOMs, timeouts | Wrong-but-completing results |
| Pre-execution advisory | — | — (does not exist today) | Everything semantically suboptimal |
Everything between "ANTLR parsed it" and "the cluster ran it" is invisible until after the user clicks Run. The Monaco editor's `MarkerSeverity.Warning` channel is currently unused.
1.2 What that means in practice
The current pipeline is forgiving on input but unforgiving on output. The query is dispatched exactly as written, possibly scans a year of data, and returns either an error, an empty result, or a wrong result. Examples we see in the wild:
- `... | where service = "checkout" | stats count() by user_id` on a 5-billion-doc index — runs to completion, OOMs the coordinator at the `stats` stage.
- `... | where message like "%5xx%"` against a `keyword`-mapped field — silently returns zero rows. The user assumes "no errors == no matches".
- `source = logs-* | head 100` against a wildcard pattern matching zero indices — empty result, no warning, indistinguishable from "no data in the time range".
- `... | sort @timestamp` after `stats` — sorts the wrong post-aggregation columns, producing non-deterministic output.
- A typo in a field name (`stauts` vs `status`) — `SemanticCheckException` thrown, but only after the user submits and waits.
None of these are syntax errors. None necessarily throw `SemanticCheckException` early enough to be useful. The user discovers the problem only after the query has either run wrong or failed at the cluster.
1.3 Where peers are
| Tool | Pre-execution advisory | Auto-fix | Cost preview |
|---|---|---|---|
| Azure Data Explorer (Kusto) | Yes — "Best Practices" analyzer | Yes — Quick-fix | Yes |
| BigQuery Studio | Yes — partition-filter warnings | No | Yes — "will process N GB" |
| DataGrip | Yes — named SQL inspections | Yes | No |
| OpenSearch SQL/PPL today | No | No | No |
We are the outlier. The capability exists in every comparable analytics product.
1.4 The gap
Summarized: the project has post-execution validation (parse errors, runtime exceptions, engine failures), but no pre-execution layer that gives the user advisory feedback while they are still composing the query. We need to add one.
2. Proposed Solution
2.1 The shift: pre-execution lint layer
Add a pre-execution lint layer that runs against the parsed query without executing it and returns structured diagnostics the editor can render inline. The layer is purely advisory: it never blocks, never executes data-plane queries, and never mutates state. It complements — not replaces — the existing post-execution validation.
```
               ┌───────────────────────────────┐
               │     PRE-EXECUTION (NEW)       │
user types     │                               │
query ───────► │  ANTLR parse → Lint layer     │ ───► diagnostics in editor
               │  (errors)      (warnings)     │
               └───────────────────────────────┘
                              │
                              │ user clicks Run
                              ▼
               ┌───────────────────────────────┐
               │    POST-EXECUTION (TODAY)     │
               │                               │
               │  Analyzer → Planner →         │ ───► results / errors
               │  Executor                     │
               └───────────────────────────────┘
```
2.2 What ships
- Lint REST endpoints — `POST /_plugins/_ppl/_lint` and `POST /_plugins/_sql/_lint` accepting a query plus target index/data source, returning a structured diagnostics array without executing the query.
- Pluggable rule framework — adding a new rule should not require changes to the REST handler or the response schema. Rules are independent units that walk the parsed query and emit diagnostics.
- Initial rule catalog — at least 8 rules covering correctness, performance, and determinism categories, discovered from the existing PPL/SQL docs and exception sites (see §2.5).
- Frontend Monaco integration — debounced calls to the lint endpoint, results rendered as `MarkerSeverity.Warning` markers alongside the existing `Error` syntax markers, with hover tooltips and a one-click Apply fix action.
- Auto-fix mechanism — diagnostics that are auto-correctable carry a structured fix the editor applies on click.
- Cluster settings — operators can disable individual rules, adjust thresholds, or kill-switch the entire feature without a restart.
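The auto-fix mechanism above is just a list of offset edits applied to the query string. A minimal sketch of the application step — the `Edit`/`Fix` shapes mirror the schema in §2.4.1, but the names and function are illustrative, not the shipped API:

```typescript
interface Edit {
  startOffset: number; // 0-based, inclusive
  endOffset: number;   // 0-based, exclusive
  replacement: string;
}

interface Fix {
  title: string;
  edits: Edit[];
}

// Apply edits back-to-front so earlier offsets stay valid as the string changes.
function applyFix(query: string, fix: Fix): string {
  const sorted = [...fix.edits].sort((a, b) => b.startOffset - a.startOffset);
  let result = query;
  for (const e of sorted) {
    result = result.slice(0, e.startOffset) + e.replacement + result.slice(e.endOffset);
  }
  return result;
}
```

Sorting descending by offset is the one non-obvious detail: it lets a fix carry multiple edits without any offset bookkeeping in the editor.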
2.3 Goals & non-goals
Goals
- G1. Stable REST contract for linting PPL and SQL queries against a target index/data source without executing them.
- G2. Pluggable rule catalog that grows incrementally — both in-tree and (eventually) by other plugins.
- G3. Ship at least 8 concrete rules at GA.
- G4. Diagnostics carry enough metadata (position, severity, category, rule ID, optional fix) for editors to render a useful UX out of the box.
- G5. OpenSearch Dashboards renders warnings inline within ~500 ms of a keystroke at p50.
- G6. Linting is safe: never executes user data-plane queries, never mutates state, observes per-cluster opt-out.
Non-goals
- Replacing `SemanticCheckException` for hard errors. The lint layer is additive and advisory.
- Full Calcite EXPLAIN cost-model integration on day one (planned as a later phase).
- Linting async data sources (Spark/Glue) end-to-end — different planner, different scope.
- Blocking execution. Lint is advisory; the engine remains the source of truth for refusal.
2.4 Requirements
2.4.1 Must-have
- Lint endpoint — `POST /_plugins/_ppl/_lint` accepting `{ query, index, dataSourceId? }` and returning a structured diagnostics array. SQL counterpart symmetric.
- Pluggable rule framework — the REST surface and response schema do not change when a new rule is added.
- Diagnostic schema — every diagnostic includes:
  - `ruleId` (stable, kebab-case)
  - `severity` (`error` | `warning` | `info` | `hint`)
  - `category` (`correctness` | `performance` | `determinism` | `style`)
  - `message` (user-facing)
  - `position` (precise start/end line/column and offsets — must be reliable for editor underlines)
  - optional `fix` (structured edit: start/end offset + replacement text)
  - optional `docUrl`
- Initial rule catalog (≥ 8 rules) — discovered from PPL/SQL docs and exception sites; see §2.5.
- Severity-level configuration via cluster settings — operators can disable individual rules and adjust thresholds without a restart.
- Master switch — `plugins.sql.lint.enabled` (default `false` until GA) so clusters can opt in.
- Frontend Monaco integration — debounced calls, results rendered as `MarkerSeverity.Warning` markers alongside existing `Error` markers; hover shows message + suggestion + docs link.
- Auto-fix payload — diagnostics that can be auto-corrected return a structured fix the editor applies via a code action.
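For concreteness, the diagnostic schema above can be written down as types. This is a sketch of the shape only — field names follow the must-have list, but nothing here is generated from or binding on the actual plugin:

```typescript
type Severity = "error" | "warning" | "info" | "hint";
type Category = "correctness" | "performance" | "determinism" | "style";

interface Position {
  startLine: number;   // 1-based
  startColumn: number; // 1-based
  endLine: number;
  endColumn: number;
  startOffset: number; // 0-based absolute offsets into the query string
  endOffset: number;
}

interface Diagnostic {
  ruleId: string;      // stable, kebab-case, e.g. "high-cardinality-groupby"
  severity: Severity;
  category: Category;
  message: string;     // user-facing
  position: Position;
  fix?: { title: string; edits: { startOffset: number; endOffset: number; replacement: string }[] };
  docUrl?: string;
}

// The kind of sanity check an editor might run before trusting a position
// for underlining — illustrates why reliable offsets are a hard requirement.
function isRenderable(d: Diagnostic, queryLength: number): boolean {
  const p = d.position;
  return p.startOffset >= 0 && p.endOffset <= queryLength && p.startOffset < p.endOffset;
}
```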
2.4.2 Nice-to-have
- Stretch rules beyond the GA catalog — additional candidates surfaced from the same docs / exception sources (see §2.5).
- Cost-estimate badge (à la BigQuery's "This query will process X GB") — feasible once Calcite explain-cost is wired through.
- External plugin SPI so third-party plugins can register rules without forking core.
- Per-data-source rule subsets (some rules apply only to OpenSearch indices, others to Glue, etc.).
- Telemetry on which fixes get accepted, to guide which rules earn their keep.
2.5 Rule discovery & catalog
Rather than prescribing a fixed list of rules upfront, this RFC asks the intern to discover the initial catalog from sources of truth that already exist in the project. The PPL and SQL documentation, the existing exception sites in the codebase, and the issue tracker collectively describe — with examples — every footgun the team has cared about enough to write down. Those are the natural rule candidates.
Where to look for rule candidates
- PPL command docs — `docs/user/ppl/cmd/*.md`. Many commands include "Note", "Limitation", or "Caveat" callouts describing wrong-result patterns. Each one is a rule candidate.
- PPL function docs — `docs/user/ppl/functions/*.md`. Function-specific footguns (regex semantics, type coercion, null handling) belong here.
- PPL general guides — `docs/user/ppl/general/`. Cross-cutting topics like wildcards, identifiers, and time filters.
- PPL limitations — `docs/user/ppl/limitations/` and `docs/user/limitations/`. By definition these are documented constraints; every entry is a candidate for a pre-execution warning.
- SQL docs — `docs/user/dql/`, `docs/user/admin/`, `docs/user/optimization/`. Same pattern as PPL.
- `SemanticCheckException` throw sites — every place the analyzer throws `SemanticCheckException` is a hard error that fires at execution time today. Most of them have a natural pre-execution counterpart: the lint version surfaces the same condition as a warning while the user is still typing.
- Issues labelled `bug` / `wrong-result` / `confused user` in the `opensearch-project/sql` issue tracker. Recurring user reports point to gaps that lint can close.
What makes a good rule
- It catches a real, repeatable mistake (cited in docs or issues).
- It can be detected from the parsed query plus index metadata (mapping, `_stats`, `_field_caps`) — no execution required.
- It has a clear, low-false-positive trigger condition, or a tunable threshold.
- It belongs to one of: `correctness`, `performance`, `determinism`, `style`.
Where to look for rule exceptions (carve-outs)
For every rule, there are legitimate uses that should not fire a warning. The same docs are the source of those carve-outs: usage patterns the docs explicitly endorse, examples in the docs, and "this is fine when …" notes. A rule without documented carve-outs is probably under-specified.
For instance, a rule "warn on `head` without `sort`" has a documented carve-out — "no warning when the upstream is already deterministic (e.g., `top N`, `rare N`, a single-row scalar source)" — that comes directly from the `head` command docs.
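On a toy representation of a parsed pipeline, such a rule with its carve-out is only a few lines. The `Command` model below is invented for illustration — the real rule would walk the actual PPL AST:

```typescript
interface Command {
  name: string; // e.g. "search", "where", "sort", "head"
}

// Commands whose output order is already deterministic, so a following
// `head` is fine — the carve-out from the head command docs.
const DETERMINISTIC_UPSTREAMS = new Set(["sort", "top", "rare"]);

function headWithoutSort(pipeline: Command[]): string[] {
  const warnings: string[] = [];
  for (let i = 0; i < pipeline.length; i++) {
    if (pipeline[i].name !== "head") continue;
    // Carve-out: only warn if nothing upstream fixes the row order.
    const upstream = pipeline.slice(0, i).map((c) => c.name);
    if (!upstream.some((n) => DETERMINISTIC_UPSTREAMS.has(n))) {
      warnings.push("head-without-sort");
    }
  }
  return warnings;
}
```

The point of the sketch is the shape: the trigger condition and the carve-out are both explicit, reviewable predicates, which is what §2.5's review checklist asks for.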
Catalog target
The minimum bar at GA is at least 8 rules spanning all four categories, with at least one auto-fix-capable rule per category where it makes sense. The intern proposes the actual list in week 1 (see §3) after surveying the sources above.
Each proposed rule is reviewed against:
- Source citation (which doc entry / exception throw site / issue motivates it).
- Documented carve-outs.
- Default severity rationale.
- Auto-fix design (or rationale for why it has none).
Per-rule cluster-settings example
```yaml
plugins.sql.lint.enabled: true
plugins.sql.lint.rules.<rule-id>.severity: warning    # off | hint | info | warning | error
plugins.sql.lint.thresholds.<rule-id>.<param>: 100000 # rule-specific threshold
```
2.6 High-level design
The pre-execution layer sits next to the existing query path and reuses its parser and analyzer:
```
HTTP request
  → REST handler (new)
  → Parser (PPL or SQL)      (existing ANTLR path, reused)
  → Lint context build       (mapping, _stats, settings — cached)
  → Tolerant analyzer pass   (catches semantic errors as diagnostics, not exceptions)
  → Rule engine              (each rule visits the plan + context, emits diagnostics)
  → JSON response
```
Two design points worth calling out:
- Reuse the existing parser and analyzer. The lint engine should not re-implement type inference or field resolution — it should run the same code paths as a real query, but in a tolerant mode where what would normally throw becomes a diagnostic. This is where most of the high-value rules (field validation, type mismatch) come from "for free".
- Diagnostics carry precise positions. Editors need offsets accurate to the offending token, not to the surrounding command. The structured `position` field is non-negotiable.
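The "rule engine" stage reduces to a small contract: rules are units that take the plan plus context and return diagnostics, and the engine folds over them with per-rule isolation. A sketch with invented names (`LintRule`, `LintContext` are not the shipped types, and the real implementation would be on the Java side of the plugin):

```typescript
interface LintContext {
  mapping: Record<string, string>;  // field -> type, from _mapping (cached)
  settings: Record<string, number>; // per-rule thresholds from cluster settings
}

interface LintRule<Plan> {
  ruleId: string;
  check(plan: Plan, ctx: LintContext): { ruleId: string; message: string }[];
}

function runRules<Plan>(
  plan: Plan,
  ctx: LintContext,
  rules: LintRule<Plan>[],
): { ruleId: string; message: string }[] {
  const diagnostics: { ruleId: string; message: string }[] = [];
  for (const rule of rules) {
    try {
      diagnostics.push(...rule.check(plan, ctx));
    } catch {
      // Lint is best-effort (§2.9): a crashing or timed-out rule is
      // skipped silently, never surfaced as a failed response.
    }
  }
  return diagnostics;
}
```

This is also what makes goal G2 cheap to hold: registering one more `LintRule` touches neither the REST handler nor the response schema.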
2.7 API contract
Request
```
POST /_plugins/_ppl/_lint
Content-Type: application/json

{
  "query": "source=logs-* | where status = 500 | stats count() by user_id",
  "index": "logs-*",
  "dataSourceId": "my_glue",       // optional, defaults to local cluster
  "rules": ["field-validation"],   // optional allowlist
  "settings": {                    // optional per-call overrides
    "highCardinalityThreshold": 100000
  }
}
```
`POST /_plugins/_sql/_lint` is symmetric.
Response
```json
{
  "took": 14,
  "diagnostics": [
    {
      "ruleId": "high-cardinality-groupby",
      "severity": "warning",
      "category": "performance",
      "message": "Grouping by 'user_id' (cardinality ≈ 4.2M) may exhaust coordinator memory.",
      "command": "stats",
      "position": {
        "startLine": 1, "startColumn": 51,
        "endLine": 1, "endColumn": 58,
        "startOffset": 50, "endOffset": 57
      },
      "fix": {
        "title": "Replace stats-by with `top N`",
        "edits": [
          { "startOffset": 50, "endOffset": 57, "replacement": "user_id | head 1000" }
        ]
      },
      "docUrl": "https://opensearch.org/docs/latest/.../lint/high-cardinality-groupby/"
    }
  ]
}
```
The endpoint always returns `200` if the request is well-formed — parse errors are surfaced as a single `error`-severity diagnostic with `ruleId: "syntax"` so the frontend renders them via the same code path.
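From a client's perspective the contract is a single POST. A hedged sketch of a caller — the endpoint path is from this section, but the function names are illustrative and `doFetch` is injectable purely so the request shape can be tested without a cluster:

```typescript
// Builds the request for the lint endpoint (§2.7). Pure, hence unit-testable.
function buildLintRequest(query: string, index: string, dataSourceId?: string) {
  const body: Record<string, unknown> = { query, index };
  if (dataSourceId !== undefined) body.dataSourceId = dataSourceId;
  return { path: "/_plugins/_ppl/_lint", body: JSON.stringify(body) };
}

async function lintPpl(
  baseUrl: string,
  query: string,
  index: string,
  doFetch: typeof fetch = fetch,
): Promise<{ took: number; diagnostics: { ruleId: string; message: string }[] }> {
  const req = buildLintRequest(query, index);
  const res = await doFetch(baseUrl + req.path, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: req.body,
  });
  // Non-200 means the request itself was malformed; parse failures of the
  // *query* come back inside diagnostics with ruleId "syntax".
  if (!res.ok) throw new Error(`malformed lint request: ${res.status}`);
  return res.json();
}
```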
2.8 Frontend integration (OpenSearch Dashboards)
- An OSD server proxy route forwards `/api/enhancements/ppl/lint` to `_plugins/_ppl/_lint` (same shape as the existing `_grammar` proxy), threading `dataSourceId` for multi-DS clusters.
- The Monaco PPL language module debounces lint calls (~500 ms) on content change, cancels in-flight requests on new keystrokes, and renders diagnostics as `MarkerSeverity.Warning` markers using the existing `setModelMarkers()` infrastructure.
- A code-action provider reads each diagnostic's structured `fix` and exposes a one-click Apply fix in the lightbulb menu.
- When the cluster has lint disabled, the proxy returns an empty diagnostics array; the editor never lights up.
No new UI framework, no new component library — pure addition on the existing Monaco infrastructure.
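The rendering step is a direct translation from the diagnostic schema to Monaco's marker shape — the doc's 1-based, end-exclusive columns line up with Monaco's range convention, so no arithmetic is needed. A sketch of the pure mapping (the numeric values match Monaco's `MarkerSeverity` enum — Error 8, Warning 4, Info 2, Hint 1 — and the `LintDiagnostic` shape mirrors §2.7):

```typescript
// Numeric values match monaco.MarkerSeverity, so the result can be passed
// straight to monaco.editor.setModelMarkers(model, "ppl-lint", markers).
const SEVERITY: Record<string, number> = { error: 8, warning: 4, info: 2, hint: 1 };

interface LintDiagnostic {
  ruleId: string;
  severity: "error" | "warning" | "info" | "hint";
  message: string;
  position: { startLine: number; startColumn: number; endLine: number; endColumn: number };
}

function toMarker(d: LintDiagnostic) {
  return {
    severity: SEVERITY[d.severity],
    message: `${d.message} [${d.ruleId}]`, // ruleId in the hover, like eslint
    startLineNumber: d.position.startLine,
    startColumn: d.position.startColumn,
    endLineNumber: d.position.endLine,
    endColumn: d.position.endColumn,
  };
}
```

Keeping this a pure function (no `monaco` import) lets it be tested outside the editor; the language module only supplies the model and owner string.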
2.9 Cross-cutting concerns
- Performance. Mapping and stats lookups must be cached (per cluster + index, short TTL) so repeated keystrokes don't hammer `_mapping`. Per-rule and per-request timeouts bound worst-case latency. Lint is best-effort: a rule that times out is silently skipped rather than failing the response.
- Security. The endpoint reads `_mapping` and `_stats` for the target index — the caller's role must already permit those reads. No privilege elevation. Audit-log impact (queries logged on every keystroke) is documented; the master switch lets operators opt out.
- Multi-data-source. `dataSourceId` is plumbed end-to-end. For non-OpenSearch data sources, AST-only rules apply; metadata-dependent rules emit an `info` diagnostic noting they couldn't run.
- Settings. `plugins.sql.lint.enabled`, `plugins.sql.lint.rules.<rule-id>.severity`, and per-rule thresholds are cluster settings (no restart required to change).
- i18n. Diagnostic messages are externalized for translation; English ships day one.
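The caching requirement in the first bullet is a plain TTL map keyed by cluster + index. A minimal sketch — the 30 s TTL in the usage note is an arbitrary illustration, not a proposed default, and the injectable clock exists only to make expiry testable:

```typescript
class TtlCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();

  constructor(
    private ttlMs: number,
    private now: () => number = Date.now, // injectable clock for tests
  ) {}

  get(key: string): V | undefined {
    const e = this.entries.get(key);
    if (!e) return undefined;
    if (this.now() > e.expiresAt) {
      this.entries.delete(key); // lazy eviction on read
      return undefined;
    }
    return e.value;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}

// Usage sketch: one per-node cache fronting _mapping reads,
// e.g. new TtlCache<object>(30_000), keyed "clusterId:index".
```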
3. Acceptance Criteria
- `POST /_plugins/_ppl/_lint` and `POST /_plugins/_sql/_lint` return the documented schema for the eight rules from the catalog, with positive and negative integration tests for each.
- Per-rule cluster setting flips a rule off without a restart.
- Adding a new rule does not require changes to the REST handler or the response schema.
- A sample external-plugin rule is discovered and run end-to-end without modifying the SQL plugin or core.
- "How to add a rule" guide is published and the intern's own rules serve as canonical examples.
- OSD Monaco renders warning markers within ~500 ms p50 against a 10-line query.
- With `plugins.sql.lint.enabled: false`, no behavior change anywhere; the new endpoints are inert.
4. Open Questions
- Default-on or default-off at GA? Lean default-off until P6 ships and feedback is positive, then flip.
- Source of cardinality estimates for cardinality-sensitive rules. `_field_caps` doesn't expose value cardinality; a sampled `terms` aggregation is one possible path but needs a cost ceiling.
- Where do per-rule docs live? Proposal: `docs/user/ppl/lint/<rule-id>.md`, with each diagnostic's `docUrl` pointing into the published docs.
- Should `error`-severity diagnostics block submission if the user hits Run anyway? Recommended: no — the engine remains the source of truth for refusal. Lint is advisory by definition.
- Telemetry on rule emission and fix-acceptance — useful for prioritizing the catalog, but needs a privacy review before P6.
- Streaming / partial parses — could we lint mid-typing? Out of scope for v1; needs partial-AST recovery.
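On the cardinality open question above: one bounded-cost shape is a `cardinality` aggregation capped with `terminate_after`, rather than a full sampled `terms` pass — both need the same kind of ceiling, and whether any ceiling is acceptable is exactly the open question. A sketch of building such a request body; the numbers are placeholders:

```typescript
// Builds a _search body that estimates distinct values of `field` while
// examining at most `maxDocs` documents per shard (terminate_after),
// so the lint path never pays for a full scan.
function cardinalityProbe(field: string, maxDocs: number) {
  return {
    size: 0,                  // no hits, aggregation only
    terminate_after: maxDocs, // hard per-shard cost ceiling
    aggs: {
      card: {
        cardinality: { field, precision_threshold: 1000 },
      },
    },
  };
}
```

The trade-off: `terminate_after` biases the estimate low on large indices, so a rule consuming this number should treat it as a floor, not an exact count.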
5. References
Primary sources for rule and carve-out discovery (in-repo)
- `docs/user/ppl/` — `cmd/`, `functions/`, `general/`, `limitations/`
- `docs/user/dql/`, `docs/user/admin/`, `docs/user/optimization/`
- `docs/user/limitations/`
- `docs/dev/`
- `core/src/main/java/org/opensearch/sql/analysis/` — `SemanticCheckException` throw sites and the Analyzer / `TypeEnvironment` flow
Prior art (external)
- Azure Data Explorer (Kusto) — "Best Practices" analyzer with quick-fixes
- BigQuery Studio — partition-filter warnings and "will process N GB" preview
- DataGrip — named SQL inspections
Appendix — Illustrative end-to-end example
The rule IDs below are placeholders for shape only — the actual catalog is locked in week 1 from the discovery output (see §2.5).
Input
```
source=logs-* | where status = 500 | stats count() by user_id | head 100
```
Diagnostics emitted (truncated for brevity)
```json
[
  {
    "ruleId": "missing-time-filter",
    "severity": "info",
    "category": "performance",
    "message": "Index 'logs-*' has a @timestamp field but no predicate filters on it. Full scans on time-series indices are usually unintended.",
    "fix": {
      "title": "Restrict to last 1 hour",
      "edits": [{ "startOffset": 16, "endOffset": 16, "replacement": "where @timestamp > now() - 1h | " }]
    }
  },
  {
    "ruleId": "high-cardinality-groupby",
    "severity": "warning",
    "category": "performance",
    "message": "Grouping by 'user_id' (cardinality ≈ 4.2M) may exhaust coordinator memory."
  },
  {
    "ruleId": "head-without-sort",
    "severity": "info",
    "category": "determinism",
    "message": "head returns an arbitrary subset of rows; output is non-deterministic."
  }
]
```
Three warnings on a query the user thought was fine. That is the value proposition.