Validate cached batch input seek bounds #15082
Greptile Summary
This PR tightens input validation in the
Confidence Score: 5/5
The change is a targeted defensive fix to a single guard expression and introduces no new logic paths or regressions. The old bounds check accepted any Long value that fit in a signed 32-bit integer, even if it exceeded the buffer's actual capacity. The new check correctly restricts valid positions to [0, buff.length]. The comparison is safe because Scala/JVM will widen the Int buff.length to Long before the comparison. Seeking to exactly buff.length (EOF position) remains permitted, which is required by the Parquet footer-reading protocol. No other code paths are touched. No files require special attention.
Important Files Changed
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A["seek(newPos: Long)"] --> B{newPos < 0?}
B -- yes --> E["throw IllegalStateException"]
B -- no --> C{newPos > buff.length?}
C -- yes --> E
C -- no --> D["byteBuffer.position(newPos.toInt)"]
style E fill:#f66,color:#fff
style D fill:#6a6,color:#fff
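The guard in the flowchart can be sketched in Scala as follows. This is a minimal, self-contained illustration and not the PR's actual code: the class name `ByteArraySeekable`, the `getPos` helper, and the exception message are hypothetical; only the `seek(newPos: Long)` signature, the two comparisons, the `IllegalStateException`, and the `byteBuffer.position(newPos.toInt)` call come from the summary above.

```scala
import java.nio.ByteBuffer

// Hypothetical stand-in for the cached batch input stream.
// Only the seek guard described in the flowchart is modeled here.
final class ByteArraySeekable(buff: Array[Byte]) {
  private val byteBuffer = ByteBuffer.wrap(buff)

  // Illustrative helper for observing the current position.
  def getPos: Long = byteBuffer.position().toLong

  def seek(newPos: Long): Unit = {
    // buff.length is an Int and is widened to Long for the comparison,
    // so the check itself cannot overflow.
    if (newPos < 0 || newPos > buff.length) {
      // The message text is an assumption made for this sketch.
      throw new IllegalStateException(
        s"seek position $newPos is outside [0, ${buff.length}]")
    }
    // Narrowing is safe here: newPos is known to lie in [0, buff.length].
    byteBuffer.position(newPos.toInt)
  }
}
```

With a 10-byte backing array, `seek(10)` succeeds (the end-of-buffer position stays valid), while `seek(11)`, `seek(-1)`, and any value above `Int.MaxValue` are rejected by the guard before `ByteBuffer.position` is ever called.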
Reviews (1): Last reviewed commit: "Validate cached batch input seek bounds"
Signed-off-by: Minh Vu <vuhoangminh97@gmail.com>
94c35ec to 0f39ff8
No issue filed.
Description
Tighten the cached batch parquet input stream seek validation so it accepts only positions within the backing byte array. This gives a clear error before calling into ByteBuffer.position.
Checklists
Documentation
Testing
(Please provide the names of the existing tests in the PR description.)
Performance