Skip to content

Fix #5114: preserve head/TopK semantics for sort-expression pushdown#5135

Open
penghuo wants to merge 4 commits intoopensearch-project:mainfrom
penghuo:bugFix/5114
Open

Fix #5114: preserve head/TopK semantics for sort-expression pushdown#5135
penghuo wants to merge 4 commits intoopensearch-project:mainfrom
penghuo:bugFix/5114

Conversation

@penghuo
Copy link
Collaborator

@penghuo penghuo commented Feb 11, 2026

Description

Fix issue #5114 where head could be lost on sort-expression plans, returning more rows than requested (for example: source=... | eval a = rand() | sort a | fields id | head 5).

The fix updates SortExprIndexScanRule to avoid rewriting CalciteEnumerableTopK paths that carry fetch semantics, and keeps limit pushdown on the logical scan path only when limit/offset are available. This preserves TopK semantics while still allowing safe sort-expression pushdown.

Related Issues

#5114

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • New functionality has javadoc added.
  • New functionality has a user manual doc added.
  • New PPL command checklist all confirmed.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff or -s.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Feb 11, 2026

📝 Walkthrough

Summary by CodeRabbit

  • Bug Fixes

    • Prevented incorrect pushdown of sort expressions when combined with head/limit semantics, preserving correct ordering.
  • Tests

    • Added a targeted regression test for sort-expression + head/limit scenarios.
    • Added an expected explain-plan resource for the new regression.
    • Added integration tests validating deterministic and non-deterministic sort expressions and order-equivalent expressions.

Walkthrough

Adds an early-return in SortExprIndexScanRule to skip pushdown when the Sort is an instance of EnumerableLimitSort, plus regression tests: a Calcite explain test, an expected explain YAML, and YAML REST tests covering deterministic, non-deterministic, and order-equivalent sort expressions with head (limit).

Changes

Cohort / File(s) Summary
Calcite explain test
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
Added testIssue5114SortExprHeadExplain() that enables pushdown, runs a query with an eval-derived sort expression + head, and asserts the YAML explain matches the new expected file.
Expected explain plan
integ-test/src/test/resources/expectedOutput/calcite/explain_issue_5114_sort_expr_head_push.yaml
Added expected Calcite explain YAML showing logical/physical scans, projection, scripted sort, limits, and PushDownContext for the pushdown case.
REST API integration tests
integ-test/src/yamlRestTest/resources/rest-api-spec/test/issues/5114.yml
New PPL YAML tests that create index/data and verify head behavior with deterministic, non-deterministic (rand()), and order-equivalent expressions.
Planner rule
opensearch/src/main/java/org/opensearch/sql/opensearch/planner/rules/SortExprIndexScanRule.java
Imported EnumerableLimitSort and added an early return in onMatchImpl when the Sort node is an instance of EnumerableLimitSort, preventing sort-expression pushdown for that case.

Sequence Diagram(s)

mermaid
sequenceDiagram
participant Client as Client (PPL / Calcite)
participant Planner as Planner (Calcite/Rules)
participant Rule as SortExprIndexScanRule
participant Index as OpenSearch Index / RequestBuilder
Client->>Planner: Submit query with eval-derived sort + head
Planner->>Rule: Apply sort-expression pushdown rule
alt Sort is EnumerableLimitSort
Rule->>Planner: Early return (do not pushdown)
Planner->>Index: Plan: local sort + limit (no scripted sort)
else Other Sort types
Rule->>Index: Pushdown projection + scripted sort + limits
Index-->>Planner: Pushdown plan (OpenSearchRequestBuilder)
end
Planner->>Client: Return explain/execute plan

Note: rectangles use default styling; interactions focus on primary components and decision branch.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Suggested labels

backport-manually, backport 2.19-dev, testing

Suggested reviewers

  • ps48
  • kavithacm
  • derek-ho
  • joshuali925
  • Swiddis
  • anirudha
🚥 Pre-merge checks | ✅ 3 | ❌ 1
❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The title directly addresses the main fix: preserving head/TopK semantics for sort-expression pushdown in issue #5114.
Description check ✅ Passed The description clearly explains the issue, the fix applied to SortExprIndexScanRule, and includes testing information related to the changeset.
Merge Conflict Detection ✅ Passed ✅ No merge conflicts detected when merging into main

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

No actionable comments were generated in the recent review. 🎉


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@penghuo penghuo changed the title Bug fix/5114 Fix #5114: preserve head/TopK semantics for sort-expression pushdown Feb 11, 2026
@penghuo penghuo marked this pull request as ready for review February 11, 2026 16:43
@penghuo penghuo self-assigned this Feb 11, 2026
@penghuo penghuo added the bugFix label Feb 11, 2026
@penghuo
Copy link
Collaborator Author

penghuo commented Feb 11, 2026

@songkant-aws Please take a look.

@songkant-aws
Copy link
Contributor

@penghuo LGTM. Approved with non-blocking nitty comment.

Signed-off-by: Peng Huo <penghuo@gmail.com>
Signed-off-by: Peng Huo <penghuo@gmail.com>
Signed-off-by: Peng Huo <penghuo@gmail.com>
Signed-off-by: Peng Huo <penghuo@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants