Skip to content

Improve resource monitor errors#5129

Open
Swiddis wants to merge 6 commits intoopensearch-project:mainfrom
Swiddis:fix/resource-monitor-errors
Open

Improve resource monitor errors#5129
Swiddis wants to merge 6 commits intoopensearch-project:mainfrom
Swiddis:fix/resource-monitor-errors

Conversation

@Swiddis
Copy link
Collaborator

@Swiddis Swiddis commented Feb 9, 2026

Description

Quick update for resource management errors.

Technically related to work on #4919 but these changes don't involve the framework, this just adds some resource monitoring infrastructure that didn't exist before.

Before:

{
  "error": {
    "reason": "There was internal problem at backend",
    "details": "java.sql.SQLException: exception while executing query: insufficient resources to run the query, quit.",
    "type": "RuntimeException"
  },
  "status": 500
}

After:

{
  "error": {
    "reason": "There was internal problem at backend",
    "details": "Insufficient resources to start query: Memory usage exceeds limit: 74510336 bytes used, 21474836 bytes limit (current: 71.1MB, limit: 20.5MB, usage: 347.0%). To increase the limit, adjust the 'plugins.query.memory_limit' setting (default: 85%).",
    "type": "RuntimeException"
  },
  "status": 500
}

Related Issues

Resolves #4869

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • New functionality has javadoc added.
  • New functionality has a user manual doc added.
  • New PPL command checklist all confirmed.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff or -s.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Simeon Widdis <sawiddis@amazon.com>
Signed-off-by: Simeon Widdis <sawiddis@amazon.com>
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Feb 9, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Summary by CodeRabbit

  • New Features

    • Introduces a structured ResourceStatus model and public status-reporting API for typed health and usage details.
  • Improvements

    • Health checks now return enriched status objects; runtime messages include formatted usage, limits and guidance for memory limit settings.
  • Refactor

    • Health-check flow updated so components consume status objects rather than raw boolean hooks.
  • Bug Fixes

    • Broadened and clarified resource-limit error message assertions.
  • Tests

    • Added unit and integration tests for status formatting, retry behavior, and edge cases.

Walkthrough

Replace boolean health checks with a structured ResourceStatus model: add ResourceStatus/ResourceType, add getStatus()/isHealthyImpl() hooks to ResourceMonitor, update monitors and execution/enumerator code to return/consume ResourceStatus, add memory-specific status methods, and update tests to assert enriched status descriptions and formatted messages.

Changes

Cohort / File(s) Summary
Resource Status Model
core/src/main/java/org/opensearch/sql/monitor/ResourceStatus.java
New ResourceStatus class with ResourceType enum, healthy()/unhealthy() factories, usage metrics, and getFormattedDescription() including human-readable byte formatting.
Monitor Base & Simple Impl
core/src/main/java/org/opensearch/sql/monitor/ResourceMonitor.java, core/src/main/java/org/opensearch/sql/monitor/AlwaysHealthyMonitor.java
ResourceMonitor now exposes getStatus() and protected isHealthyImpl(); isHealthy() delegates to getStatus().isHealthy(). AlwaysHealthyMonitor implements isHealthyImpl() and adds getStatus(). Public API signatures changed.
OpenSearch Memory Monitors
opensearch/src/main/java/org/opensearch/sql/opensearch/monitor/OpenSearchMemoryHealthy.java, opensearch/src/main/java/org/opensearch/sql/opensearch/monitor/OpenSearchResourceMonitor.java
Added getMemoryStatus(long) and getStatus() paths returning ResourceStatus. OpenSearchResourceMonitor.getStatus() applies retry logic and maps failures to unhealthy ResourceStatus.
Execution & Enumerator Checks
opensearch/src/main/java/org/opensearch/sql/opensearch/executor/protector/ResourceMonitorPlan.java, opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/OpenSearchIndexEnumerator.java
Replace boolean isHealthy() checks with getStatus() lookups; on unhealthy, throw exceptions including status.getFormattedDescription(), rows-processed info, and memory-limit guidance.
Integration & Unit Tests
integ-test/src/test/java/org/opensearch/sql/ppl/ResourceMonitorIT.java, opensearch/src/test/java/org/opensearch/sql/opensearch/executor/ResourceMonitorPlanTest.java, opensearch/src/test/java/org/opensearch/sql/opensearch/monitor/OpenSearchResourceMonitorTest.java, core/src/test/java/org/opensearch/sql/monitor/ResourceMonitorTest.java, core/src/test/java/org/opensearch/sql/monitor/ResourceStatusTest.java
Tests updated to mock/expect ResourceStatus, assert enriched error messages and formatted descriptions, add tests for default ResourceMonitor behavior (throws), and extensive ResourceStatus formatting tests.
Misc — Executor Tests Rename
opensearch/src/test/java/.../OpenSearchResourceMonitorTest.java
Test methods renamed to test* variants and new tests added to validate getStatus() retry/transition behaviors.

Sequence Diagram(s)

sequenceDiagram
    autonumber
    participant Executor as Query Executor
    participant Monitor as ResourceMonitor
    participant Memory as MemoryMonitor
    participant Status as ResourceStatus

    Executor->>Monitor: getStatus()
    Monitor->>Memory: getMemoryStatus(limitBytes)
    Memory-->>Monitor: ResourceStatus (MEMORY, current, limit, desc)
    Monitor-->>Executor: ResourceStatus

    Executor->>Status: status.isHealthy()
    Status-->>Executor: boolean

    alt Unhealthy
        Executor->>Status: status.getFormattedDescription()
        Status-->>Executor: formatted message
        Executor->>Executor: throw exception with formatted message (includes memory-limit guidance / rows processed)
    else Healthy
        Executor->>Executor: continue execution
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Suggested labels

bugFix, backport-manually, backport 2.19-dev

Suggested reviewers

  • ps48
  • kavithacm
  • derek-ho
  • penghuo
  • anirudha
  • RyanL1997
  • vamsimanohar
  • MaxKsyunz
  • Yury-Fridlyand
  • dai-chen
  • yuancu
  • GumpacG
🚥 Pre-merge checks | ✅ 4 | ❌ 2
❌ Failed checks (2 warnings)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 28.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Merge Conflict Detection ⚠️ Warning ❌ Merge conflicts detected (102 files):

⚔️ build.gradle (content)
⚔️ core/src/main/java/org/opensearch/sql/analysis/Analyzer.java (content)
⚔️ core/src/main/java/org/opensearch/sql/ast/AbstractNodeVisitor.java (content)
⚔️ core/src/main/java/org/opensearch/sql/ast/expression/Let.java (content)
⚔️ core/src/main/java/org/opensearch/sql/ast/tree/FillNull.java (content)
⚔️ core/src/main/java/org/opensearch/sql/ast/tree/Join.java (content)
⚔️ core/src/main/java/org/opensearch/sql/calcite/CalcitePlanContext.java (content)
⚔️ core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java (content)
⚔️ core/src/main/java/org/opensearch/sql/calcite/CalciteRexNodeVisitor.java (content)
⚔️ core/src/main/java/org/opensearch/sql/calcite/ExtendedRexBuilder.java (content)
⚔️ core/src/main/java/org/opensearch/sql/calcite/utils/JoinAndLookupUtils.java (content)
⚔️ core/src/main/java/org/opensearch/sql/executor/QueryService.java (content)
⚔️ core/src/main/java/org/opensearch/sql/expression/function/BuiltinFunctionName.java (content)
⚔️ core/src/main/java/org/opensearch/sql/expression/function/CollectionUDF/MVAppendFunctionImpl.java (content)
⚔️ core/src/main/java/org/opensearch/sql/expression/function/CollectionUDF/MapAppendFunctionImpl.java (content)
⚔️ core/src/main/java/org/opensearch/sql/expression/function/FunctionResolver.java (content)
⚔️ core/src/main/java/org/opensearch/sql/expression/function/PPLBuiltinOperators.java (content)
⚔️ core/src/main/java/org/opensearch/sql/expression/function/PPLFuncImpTable.java (content)
⚔️ core/src/main/java/org/opensearch/sql/expression/function/jsonUDF/JsonExtractAllFunctionImpl.java (content)
⚔️ core/src/main/java/org/opensearch/sql/monitor/AlwaysHealthyMonitor.java (content)
⚔️ core/src/main/java/org/opensearch/sql/monitor/ResourceMonitor.java (content)
⚔️ core/src/test/java/org/opensearch/sql/expression/function/CollectionUDF/MapAppendFunctionImplTest.java (content)
⚔️ core/src/test/java/org/opensearch/sql/expression/function/jsonUDF/JsonExtractAllFunctionImplTest.java (content)
⚔️ docs/category.json (content)
⚔️ docs/dev/ppl-commands.md (content)
⚔️ docs/user/ppl/cmd/spath.md (content)
⚔️ docs/user/ppl/general/identifiers.md (content)
⚔️ doctest/test_data/structured.json (content)
⚔️ integ-test/src/test/java/org/opensearch/sql/calcite/CalciteNoPushdownIT.java (content)
⚔️ integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteAliasFieldAggregationIT.java (content)
⚔️ integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java (content)
⚔️ integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLAggregationIT.java (content)
⚔️ integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLSpathCommandIT.java (content)
⚔️ integ-test/src/test/java/org/opensearch/sql/calcite/standalone/JsonExtractAllFunctionIT.java (content)
⚔️ integ-test/src/test/java/org/opensearch/sql/calcite/standalone/MapAppendFunctionIT.java (content)
⚔️ integ-test/src/test/java/org/opensearch/sql/ppl/NewAddedCommandsIT.java (content)
⚔️ integ-test/src/test/java/org/opensearch/sql/ppl/ResourceMonitorIT.java (content)
⚔️ integ-test/src/test/java/org/opensearch/sql/security/CalciteCrossClusterSearchIT.java (content)
⚔️ integ-test/src/test/java/org/opensearch/sql/util/MatcherUtils.java (content)
⚔️ integ-test/src/test/resources/alias.json (content)
⚔️ integ-test/src/test/resources/expectedOutput/calcite/big5/dedup_metrics_size_field.yaml (content)
⚔️ integ-test/src/test/resources/expectedOutput/calcite/clickbench/q29.yaml (content)
⚔️ integ-test/src/test/resources/expectedOutput/calcite/explain_complex_sort_expr_pushdown_for_smj_w_max_option.yaml (content)
⚔️ integ-test/src/test/resources/expectedOutput/calcite/explain_dedup_complex1.yaml (content)
⚔️ integ-test/src/test/resources/expectedOutput/calcite/explain_dedup_complex2.yaml (content)
⚔️ integ-test/src/test/resources/expectedOutput/calcite/explain_dedup_complex3.yaml (content)
⚔️ integ-test/src/test/resources/expectedOutput/calcite/explain_dedup_complex4.yaml (content)
⚔️ integ-test/src/test/resources/expectedOutput/calcite/explain_dedup_expr1.yaml (content)
⚔️ integ-test/src/test/resources/expectedOutput/calcite/explain_dedup_expr2.yaml (content)
⚔️ integ-test/src/test/resources/expectedOutput/calcite/explain_dedup_expr2_alternative.yaml (content)
⚔️ integ-test/src/test/resources/expectedOutput/calcite/explain_dedup_expr3.yaml (content)
⚔️ integ-test/src/test/resources/expectedOutput/calcite/explain_dedup_expr3_alternative.yaml (content)
⚔️ integ-test/src/test/resources/expectedOutput/calcite/explain_dedup_expr4.yaml (content)
⚔️ integ-test/src/test/resources/expectedOutput/calcite/explain_dedup_expr4_alternative.yaml (content)
⚔️ integ-test/src/test/resources/expectedOutput/calcite/explain_dedup_expr_complex1.yaml (content)
⚔️ integ-test/src/test/resources/expectedOutput/calcite/explain_dedup_expr_complex1_alternative.yaml (content)
⚔️ integ-test/src/test/resources/expectedOutput/calcite/explain_dedup_expr_complex2.yaml (content)
⚔️ integ-test/src/test/resources/expectedOutput/calcite/explain_dedup_expr_complex2_alternative.yaml (content)
⚔️ integ-test/src/test/resources/expectedOutput/calcite/explain_dedup_keepempty_false_push.yaml (content)
⚔️ integ-test/src/test/resources/expectedOutput/calcite/explain_dedup_push.yaml (content)
⚔️ integ-test/src/test/resources/expectedOutput/calcite/explain_dedup_with_expr1.yaml (content)
⚔️ integ-test/src/test/resources/expectedOutput/calcite/explain_dedup_with_expr2.yaml (content)
⚔️ integ-test/src/test/resources/expectedOutput/calcite/explain_dedup_with_expr2_alternative.yaml (content)
⚔️ integ-test/src/test/resources/expectedOutput/calcite/explain_dedup_with_expr3.yaml (content)
⚔️ integ-test/src/test/resources/expectedOutput/calcite/explain_dedup_with_expr3_alternative.yaml (content)
⚔️ integ-test/src/test/resources/expectedOutput/calcite/explain_dedup_with_expr4.yaml (content)
⚔️ integ-test/src/test/resources/expectedOutput/calcite/explain_dedup_with_expr4_alternative.yaml (content)
⚔️ integ-test/src/test/resources/expectedOutput/calcite/explain_earliest_latest.yaml (content)
⚔️ integ-test/src/test/resources/expectedOutput/calcite/explain_earliest_latest_custom_time.yaml (content)
⚔️ integ-test/src/test/resources/expectedOutput/calcite/explain_first_last.yaml (content)
⚔️ integ-test/src/test/resources/expectedOutput/calcite/explain_join_with_criteria_max_option.yaml (content)
⚔️ integ-test/src/test/resources/expectedOutput/calcite/explain_join_with_fields_max_option.yaml (content)
⚔️ integ-test/src/test/resources/expectedOutput/calcite/explain_max_string_field.yaml (content)
⚔️ integ-test/src/test/resources/expectedOutput/calcite/explain_min_string_field.yaml (content)
⚔️ integ-test/src/test/resources/expectedOutput/calcite/explain_patterns_simple_pattern_agg_push.yaml (content)
⚔️ integ-test/src/test/resources/expectedOutput/calcite/explain_take.yaml (content)
⚔️ integ-test/src/test/resources/expectedOutput/calcite_no_pushdown/explain_alias_type_field.yaml (content)
⚔️ integ-test/src/test/resources/expectedOutput/ppl/explain_patterns_simple_pattern_agg_push.yaml (content)
⚔️ integ-test/src/test/resources/expectedOutput/ppl/explain_take.yaml (content)
⚔️ integ-test/src/test/resources/indexDefinitions/alias_index_mapping.json (content)
⚔️ opensearch/src/main/java/org/opensearch/sql/opensearch/executor/OpenSearchExecutionEngine.java (content)
⚔️ opensearch/src/main/java/org/opensearch/sql/opensearch/executor/protector/ResourceMonitorPlan.java (content)
⚔️ opensearch/src/main/java/org/opensearch/sql/opensearch/monitor/OpenSearchMemoryHealthy.java (content)
⚔️ opensearch/src/main/java/org/opensearch/sql/opensearch/monitor/OpenSearchResourceMonitor.java (content)
⚔️ opensearch/src/main/java/org/opensearch/sql/opensearch/request/AggregateAnalyzer.java (content)
⚔️ opensearch/src/main/java/org/opensearch/sql/opensearch/request/OpenSearchRequest.java (content)
⚔️ opensearch/src/main/java/org/opensearch/sql/opensearch/request/PredicateAnalyzer.java (content)
⚔️ opensearch/src/main/java/org/opensearch/sql/opensearch/response/agg/TopHitsParser.java (content)
⚔️ opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/AbstractCalciteIndexScan.java (content)
⚔️ opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/OpenSearchIndexEnumerator.java (content)
⚔️ opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/context/AggPushDownAction.java (content)
⚔️ opensearch/src/main/java/org/opensearch/sql/opensearch/storage/script/aggregation/dsl/MetricAggregationBuilder.java (content)
⚔️ opensearch/src/test/java/org/opensearch/sql/opensearch/executor/ResourceMonitorPlanTest.java (content)
⚔️ opensearch/src/test/java/org/opensearch/sql/opensearch/monitor/OpenSearchResourceMonitorTest.java (content)
⚔️ opensearch/src/test/java/org/opensearch/sql/opensearch/request/AggregateAnalyzerTest.java (content)
⚔️ opensearch/src/test/java/org/opensearch/sql/opensearch/storage/script/aggregation/dsl/MetricAggregationBuilderTest.java (content)
⚔️ ppl/src/main/antlr/OpenSearchPPLLexer.g4 (content)
⚔️ ppl/src/main/antlr/OpenSearchPPLParser.g4 (content)
⚔️ ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.java (content)
⚔️ ppl/src/main/java/org/opensearch/sql/ppl/parser/AstExpressionBuilder.java (content)
⚔️ ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLAbstractTest.java (content)
⚔️ ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLSpathTest.java (content)

These conflicts must be resolved before merging into main.
Resolve conflicts locally and push changes to this branch.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title 'Improve resource monitor errors' accurately describes the main change: enhancing error messages from resource monitoring checks with better context and formatting.
Description check ✅ Passed The description is related to the changeset, providing context about resource monitoring improvements, before/after error message comparisons, and linking to issue #4869.
Linked Issues check ✅ Passed The PR meets all objectives from #4869: replaces boolean isHealthy() with ResourceStatus object, adds descriptive messages with resource type and usage details, provides guidance to adjust plugins.query.memory_limit setting, and includes comprehensive unit tests.
Out of Scope Changes check ✅ Passed All changes are in-scope: ResourceStatus class and monitoring API refactoring, error message enhancements in ResourceMonitorPlan and OpenSearchIndexEnumerator, and comprehensive test coverage aligning with #4869 objectives.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
⚔️ Resolve merge conflicts (beta)
  • Auto-commit resolved conflicts to branch fix/resource-monitor-errors
  • Post resolved changes as copyable diffs in a comment

No actionable comments were generated in the recent review. 🎉


Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
core/src/main/java/org/opensearch/sql/monitor/ResourceMonitor.java (1)

13-38: ⚠️ Potential issue | 🟠 Major

Add required @throws tags to public method JavaDoc.
isHealthy() and getStatus() can throw UnsupportedOperationException via isHealthyImpl(), but their JavaDoc lacks @throws.

📌 Proposed JavaDoc fix
 /**
  * Is the resource healthy.
  *
  * `@return` true for healthy, otherwise false.
+ * `@throws` UnsupportedOperationException if the subclass doesn't override getStatus() or
+ *     isHealthyImpl()
  */
 public boolean isHealthy() {
   return getStatus().isHealthy();
 }
@@
  /**
   * Get detailed resource status including context for error messages. Subclasses should override
   * this method to provide rich status information.
   *
   * `@return` ResourceStatus with health state and detailed context
+  * `@throws` UnsupportedOperationException if the subclass doesn't override getStatus() or
+  *     isHealthyImpl()
   */
 public ResourceStatus getStatus() {

As per coding guidelines, Public methods MUST have JavaDoc with @param, @return, and @throws.

🤖 Fix all issues with AI agents
In `@core/src/main/java/org/opensearch/sql/monitor/ResourceMonitor.java`:
- Around line 28-52: The inline comment for ResourceMonitor.isHealthyImpl() is
incorrect; update the comment to state that the default implementation throws
UnsupportedOperationException and that subclasses must override either
getStatus() or isHealthyImpl() (reference ResourceMonitor, isHealthyImpl(),
getStatus()); then add a unit test in the core module that creates a minimal
anonymous subclass of ResourceMonitor which does not override getStatus() or
isHealthyImpl() and asserts that calling isHealthyImpl() (or invoking the
default health path) results in UnsupportedOperationException to document the
exception path (ensure the test references OpenSearchResourceMonitor and
AlwaysHealthyMonitor only as existing overrides for context).

In `@core/src/main/java/org/opensearch/sql/monitor/ResourceStatus.java`:
- Around line 84-90: In getFormattedDescription, guard against maxLimit <= 0
before computing percentage: check maxLimit and only compute and include the
usage percentage when maxLimit > 0; otherwise return a string that includes
description, formatBytes(currentUsage) and formatBytes(maxLimit) but omits the
percentage. Update the conditional that currently checks currentUsage != null &&
maxLimit != null to handle maxLimit <= 0 and use
formatBytes(currentUsage)/formatBytes(maxLimit) accordingly so you don't produce
Infinity/NaN in the output.
- Around line 18-92: Add unit tests for the new ResourceStatus class: cover the
static factory methods ResourceStatus.healthy(ResourceType) and
ResourceStatus.unhealthy(ResourceType,long,long,String) for each ResourceType
enum value, assert the healthy flag, type, description, and metric fields; test
getFormattedDescription() both when currentUsage/maxLimit are set (verify
formatted string contains formatted bytes and percentage) and when they are null
(returns description); add boundary byte-formatting tests exercising formatBytes
via getFormattedDescription() for values <1KB (e.g., 0 and 512), exactly 1KB,
1MB, 1GB, and edge cases like currentUsage==0 and currentUsage==maxLimit to
ensure correct formatting and percentage calculations. Ensure tests live
alongside the class tests and use clear assertions for fields and formatted
output.

1. Fixed contradictory comment in ResourceMonitor.isHealthyImpl() that
   claimed to delegate to old behavior but actually throws exception
2. Added unit test for ResourceMonitor default exception path
3. Created comprehensive unit tests for ResourceStatus class covering:
   - Factory methods (healthy/unhealthy) for all ResourceType values
   - getFormattedDescription() with and without metrics
   - Byte formatting boundary conditions (<1KB, 1KB, 1MB, 1GB)
   - Edge cases (zero usage, usage equals limit, invalid limit)
4. Fixed getFormattedDescription() to handle maxLimit <= 0 gracefully
   by treating it as 0 and omitting percentage calculation

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Simeon Widdis <sawiddis@amazon.com>
@Swiddis Swiddis force-pushed the fix/resource-monitor-errors branch from c416116 to aa89c9d Compare February 10, 2026 19:27
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Fix all issues with AI agents
In `@core/src/main/java/org/opensearch/sql/monitor/ResourceMonitor.java`:
- Around line 22-39: Update the JavaDoc for getStatus() to document that calls
to isHealthyImpl() may propagate an UnsupportedOperationException: add an
`@throws` UnsupportedOperationException description to the existing comment for
getStatus(), referencing that the default implementation may throw this when
isHealthyImpl() is not supported; keep the existing `@return` text and general
description intact and ensure the JavaDoc follows the project guideline for
public methods.
- Around line 18-20: Update the JavaDoc for the public method isHealthy() to
include an `@throws` entry documenting that it can throw
UnsupportedOperationException (when neither getStatus() nor isHealthyImpl() is
overridden), e.g., add a brief sentence under `@throws` naming
UnsupportedOperationException and the condition; ensure the existing `@return`
remains and method summary stays intact for isHealthy().

In `@core/src/test/java/org/opensearch/sql/monitor/ResourceStatusTest.java`:
- Around line 18-70: The tests for ResourceStatus factory methods are missing
null-input coverage; add unit tests that call ResourceStatus.healthy(null) and
ResourceStatus.unhealthy(null, ...) and also pass null descriptions to unhealthy
(e.g., ResourceStatus.unhealthy(ResourceType.MEMORY, 1, 2, null)) and assert the
expected contract (either assertThrows for NPE/IllegalArgumentException or
assert fallback behavior) to match ResourceStatus' defined behavior; include
checks for both null ResourceType and null description for each factory method
(healthy and unhealthy) and use the existing test structure (ResourceStatusTest)
to locate ResourceStatus.healthy and ResourceStatus.unhealthy.

Signed-off-by: Simeon Widdis <sawiddis@amazon.com>
Signed-off-by: Simeon Widdis <sawiddis@amazon.com>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In
`@opensearch/src/test/java/org/opensearch/sql/opensearch/monitor/OpenSearchResourceMonitorTest.java`:
- Around line 82-144: Rename the JUnit test methods in
OpenSearchResourceMonitorTest to follow the test<Functionality><Condition>
pattern: change getStatusHealthy to testGetStatusHealthy,
getStatusUnhealthyWithRetry to testGetStatusUnhealthyWithRetry, and
getStatusBecomesHealthyAfterRetry to testGetStatusBecomesHealthyAfterRetry;
update the method names for the three tests (they are the `@Test` methods that
instantiate OpenSearchResourceMonitor and call getStatus()) and run tests to
ensure no other references break.
- Around line 82-144: Add unit tests for OpenSearchResourceMonitor.getStatus to
cover null/boundary/error conditions: (1) stub settings.getSettingValue(...) to
return null and assert getStatus returns a valid ResourceStatus (no uncaught
exception) and verify expected calls to memoryMonitor.getMemoryStatus; (2) add
boundary tests where the memory limit is 0 and where usage == limit to assert
healthy/unhealthy per existing logic and verify call counts; (3) stub
memoryMonitor.getMemoryStatus(...) to throw an exception and assert getStatus
handles it (doesn't propagate) and verify retry behavior (calls to
memoryMonitor.getMemoryStatus and how many retries are attempted). Use the same
test structure as existing tests, Mockito stubbing/thenThrow for exceptions, and
reference OpenSearchResourceMonitor, settings.getSettingValue, and
memoryMonitor.getMemoryStatus when adding test methods.
🧹 Nitpick comments (2)
opensearch/src/main/java/org/opensearch/sql/opensearch/monitor/OpenSearchMemoryHealthy.java (1)

73-98: Include human-readable units/percent in the status description.

The description currently emits raw bytes, which may undercut the “richer error message” goal. Consider formatting with human-readable units and percentage (or delegating to a shared formatter) to keep downstream responses consistent.

♻️ Example formatting adjustment
+import org.opensearch.core.common.unit.ByteSizeValue;
...
-      String description =
-          String.format(
-              "Memory usage exceeds limit: %d bytes used, %d bytes limit",
-              currentMemoryUsage, limitBytes);
+      double usagePct = limitBytes > 0 ? (currentMemoryUsage * 100.0 / limitBytes) : 0.0;
+      String description =
+          String.format(
+              "Memory usage exceeds limit: %s used / %s limit (%.2f%%)",
+              ByteSizeValue.ofBytes(currentMemoryUsage),
+              ByteSizeValue.ofBytes(limitBytes),
+              usagePct);
opensearch/src/main/java/org/opensearch/sql/opensearch/monitor/OpenSearchResourceMonitor.java (1)

33-45: Avoid retrying on definitively unhealthy status.

statusRetry retries whenever the status is unhealthy, which can add latency under sustained memory pressure and doesn’t seem to provide new signal. Consider returning the first unhealthy status (or retrying only on transient exceptions) to minimize extra work when memory is already over limit.

Signed-off-by: Simeon Widdis <sawiddis@amazon.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request error-experience Issues related to how we handle failure cases in the plugin.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[FEATURE] Improve resource manager error messaging

2 participants