Skip to content

[BugFix] Fix silent integer and long overflow wrapping in Calcite arithmetic (#5164)#26

Closed
qianheng-aws wants to merge 2 commits intorefactor/dedupe-reusable-workflowfrom
bugfix/integer-overflow-5164
Closed

[BugFix] Fix silent integer and long overflow wrapping in Calcite arithmetic (#5164)#26
qianheng-aws wants to merge 2 commits intorefactor/dedupe-reusable-workflowfrom
bugfix/integer-overflow-5164

Conversation

@qianheng-aws
Copy link
Copy Markdown
Owner

Description

Integer and long arithmetic in the Calcite execution path (eval command) silently wraps on overflow. For example, 2147483647 + 1 returns -2147483648 instead of raising an error.

Root cause: The PPL function table registered SqlStdOperatorTable.PLUS/MINUS/MULTIPLY for numeric arithmetic, which use Java's default wrapping semantics for integer/long types.

Fix: Created SafeArithmeticFunction UDFs that use Math.addExact, Math.subtractExact, and Math.multiplyExact for integer and long types, throwing ArithmeticException on overflow. Floating-point and decimal arithmetic is unchanged (IEEE 754 semantics). The UDFs use the same operator symbols (+, -, *) and SqlSyntax.BINARY to preserve identical plan display and SQL generation.

Related Issues

Resolves opensearch-project#5164

Check List

  • New functionality includes testing
  • Commits signed per DCO (-s)
  • spotlessCheck passed (blocked — could not run gradle)
  • Unit tests passed (blocked — could not run gradle)
  • Integration tests passed (blocked — could not run gradle)

Test files

  • SafeArithmeticFunctionTest.java — unit tests for the safe arithmetic methods
  • CalcitePPLArithmeticOverflowTest.java — PPL unit tests for overflow detection
  • CalciteArithmeticOverflowIT.java — integration tests
  • 5164.yml — YAML REST test

…pensearch-project#5269) (opensearch-project#5293)

When querying across indices with conflicting field mappings (boolean vs
text), numeric values like 0 were not coerced to boolean, causing
"node must be a boolean, found NUMBER" error. Added numeric handling to
parseBooleanValue() consistent with ObjectContent.booleanValue().

Signed-off-by: Heng Qian <qianheng@amazon.com>
…thmetic (opensearch-project#5164)

Replace SqlStdOperatorTable.PLUS/MINUS/MULTIPLY with overflow-safe UDFs
that use Math.addExact/subtractExact/multiplyExact for integer and long
types. Floating-point arithmetic is unchanged (IEEE 754 semantics).

Signed-off-by: Heng Qian <hengq@amazon.com>
Signed-off-by: Heng Qian <qianheng@amazon.com>
@qianheng-aws
Copy link
Copy Markdown
Owner Author

Decision Log

Root Cause: The Calcite execution path for PPL arithmetic registered SqlStdOperatorTable.PLUS, SqlStdOperatorTable.MINUS, and SqlStdOperatorTable.MULTIPLY in PPLFuncImpTable. These standard Calcite operators use Java's default wrapping arithmetic semantics, so Integer.MAX_VALUE + 1 wraps to Integer.MIN_VALUE without error.

Approach: Created SafeArithmeticFunction — custom SqlUserDefinedFunction operators named +, -, * with SqlSyntax.BINARY so they display identically to the standard operators in both logical plans and generated SQL. The UDF implementors dispatch to Math.addExact/Math.subtractExact/Math.multiplyExact for integer/long types (throwing ArithmeticException on overflow) and standard IEEE 754 arithmetic for floating-point/decimal types.

Alternatives Rejected:

  1. Modifying RexImpTable singleton — would affect all Calcite users of the standard operators globally; RexImpTable.INSTANCE is immutable after construction.
  2. Custom SqlBinaryOperator subclassSqlBinaryOperator doesn't have a hook for custom code generation; would still need RexImpTable registration.
  3. Post-plan validation shuttle — can only catch compile-time overflow (constant expressions), not runtime overflow from column values.

Pitfalls:

  • Must use Math.addExact(int, int) for int+int and Math.addExact(long, long) for long+long separately. Using only the long variant with coerceToWidestIntegralType would mask int overflow (e.g., 2147483648L fits in a long but not an int).
  • The DATETIME-DATETIME variant of SUBTRACT must remain using SqlStdOperatorTable.MINUS (not the safe UDF) since datetime subtraction is not numeric.

Things to Watch:

  • The safe arithmetic UDFs cannot be constant-folded by ReduceExpressionsRule (production path doesn't use it anyway). If Calcite constant folding is added later, overflow in constant expressions would throw at planning time rather than execution time.
  • Verify that SqlSyntax.BINARY on the UDF produces identical RelToSqlConverter output as the standard operators (infix a + b notation). This was designed to preserve verifyPPLToSparkSQL test expectations but needs gradle test verification.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[BUG] Silent integer and long overflow wrapping in Calcite arithmetic path

1 participant