Pipe: merge batched aligned chunks in scan parser by Caideyipi · Pull Request #18010 · apache/iotdb

Caideyipi · 2026-06-23T09:22:10Z

Description

This PR improves the pipe TsFile scan parser for valid aligned TsFiles whose value chunks are physically written in column batches, such as files produced by batched aligned compaction.

Before this change, the scan parser emits an aligned tablet whenever the value chunk occurrence index changes. For batched aligned compaction output, value chunks can be laid out as:

time chunk 0, time chunk 1
value columns 0-9 for chunk 0 and chunk 1
value columns 10-19 for chunk 0 and chunk 1
...

This layout is valid, but the previous parser behavior makes the emitted tablets inherit the physical compaction batch width, commonly 10 columns from compaction_max_aligned_series_num_in_one_batch, even when pipe reader memory allows a wider aligned tablet. That increases the number of tablets and hurts pipe performance.

This PR changes the scan parser to cache pending aligned value chunk groups by time chunk index and emit them only when memory limits or chunk group boundaries require it. With enough memory, consecutive physical value column batches for the same aligned chunks are merged into wider aligned tablets instead of being split at the compaction batch boundary.
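
The grouping idea can be illustrated with a minimal sketch. It is not the actual TsFileInsertionEventScanParser code: the `ValueChunk` class, the `groupByTimeChunk` method, the column names, and the exact interleaving of chunks within a batch are all hypothetical, and memory limits are ignored here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class PendingAlignedGroupsSketch {

  /** Hypothetical stand-in for one physical value chunk read from the file. */
  static final class ValueChunk {
    final int timeChunkIndex; // which time chunk this value chunk aligns with
    final String column; // measurement name of the value column

    ValueChunk(int timeChunkIndex, String column) {
      this.timeChunkIndex = timeChunkIndex;
      this.column = column;
    }
  }

  /**
   * Caches value chunks by time chunk index instead of emitting whenever the index changes.
   * Returns, per time chunk index, the columns that would end up in one merged tablet.
   */
  static Map<Integer, List<String>> groupByTimeChunk(List<ValueChunk> physicalOrder) {
    Map<Integer, List<String>> pending = new TreeMap<>();
    for (ValueChunk chunk : physicalOrder) {
      pending.computeIfAbsent(chunk.timeChunkIndex, k -> new ArrayList<>()).add(chunk.column);
    }
    return pending;
  }

  public static void main(String[] args) {
    // Two column batches (s0-s1, then s2-s3), each written for time chunks 0 and 1.
    List<ValueChunk> physicalOrder =
        List.of(
            new ValueChunk(0, "s0"), new ValueChunk(0, "s1"),
            new ValueChunk(1, "s0"), new ValueChunk(1, "s1"),
            new ValueChunk(0, "s2"), new ValueChunk(0, "s3"),
            new ValueChunk(1, "s2"), new ValueChunk(1, "s3"));
    // prints {0=[s0, s1, s2, s3], 1=[s0, s1, s2, s3]}
    System.out.println(groupByTimeChunk(physicalOrder));
  }
}
```

In this toy layout, emitting on every index change would yield four 2-column tablets, while grouping by time chunk index yields two 4-column tablets.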

It also defines pipeDataStructureTabletRowSize <= 0 as disabling the row-count cap for pipe tablets. In that mode, tablet row count is calculated only from pipe_data_structure_tablet_size_in_bytes, so users can rely on the memory-size limit instead of the fixed row-count limit.
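
A rough sketch of the intended semantics follows. It is an assumption about the shape of the calculation, not the real PipeMemoryWeightUtil code: the `rowCount` method, its parameters, and the per-row byte estimate are hypothetical.

```java
final class TabletRowCountSketch {

  /**
   * Returns how many rows a tablet may hold.
   *
   * @param tabletSizeInBytes byte budget for one tablet (hypothetical parameter)
   * @param bytesPerRow estimated size of one row, must be positive (hypothetical parameter)
   * @param rowSizeCap fixed row-count cap; a value <= 0 disables the cap
   */
  static int rowCount(long tabletSizeInBytes, long bytesPerRow, int rowSizeCap) {
    if (bytesPerRow <= 0) {
      throw new IllegalArgumentException("bytesPerRow must be positive");
    }
    // Rows that fit in the byte budget; always allow at least one row.
    long byBytes = Math.max(1L, tabletSizeInBytes / bytesPerRow);
    // Apply the fixed cap only when it is positive.
    long limited = rowSizeCap > 0 ? Math.min(byBytes, (long) rowSizeCap) : byBytes;
    return (int) Math.min(limited, (long) Integer.MAX_VALUE);
  }

  public static void main(String[] args) {
    System.out.println(rowCount(2048, 16, 100)); // cap applies: 100
    System.out.println(rowCount(2048, 16, 0)); // cap disabled: 128
    System.out.println(rowCount(2048, 16, -1)); // cap disabled: 128
  }
}
```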

Changes

Cache aligned value chunks in pending groups keyed by time chunk index in TsFileInsertionEventScanParser.
Preserve chunk/page memory protection when merging multiple physical aligned value chunk groups.
Keep cached value chunk replay subject to the same memory threshold checks.
Treat non-positive pipeDataStructureTabletRowSize as no row-count cap in PipeMemoryWeightUtil.
Add tests for batched aligned value chunk layout merging, memory-boundary flushing, and disabling the tablet row-size cap with 0/negative values.
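
The memory-boundary behavior described in the list above can be sketched as follows. This is a simplified model under stated assumptions, not the parser's real accounting: `mergeWidths`, the flat per-column byte cost, and the single memory limit are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class MemoryBoundedMergeSketch {

  /**
   * Merges consecutive column batches into wider tablets while they fit in memory.
   * Returns the column width of each emitted tablet, in emission order.
   */
  static List<Integer> mergeWidths(int[] batchWidths, long bytesPerColumn, long memoryLimitBytes) {
    List<Integer> emitted = new ArrayList<>();
    int pendingColumns = 0;
    for (int width : batchWidths) {
      long needed = (long) (pendingColumns + width) * bytesPerColumn;
      // Flush what is pending before it would grow past the memory limit.
      if (pendingColumns > 0 && needed > memoryLimitBytes) {
        emitted.add(pendingColumns);
        pendingColumns = 0;
      }
      pendingColumns += width;
    }
    // End of input stands in for a chunk group boundary: flush the remainder.
    if (pendingColumns > 0) {
      emitted.add(pendingColumns);
    }
    return emitted;
  }

  public static void main(String[] args) {
    int[] batches = {10, 10, 10};
    System.out.println(mergeWidths(batches, 100, 10_000)); // enough memory: [30]
    System.out.println(mergeWidths(batches, 100, 2_000)); // tight memory: [20, 10]
  }
}
```

In this model a single batch that alone exceeds the limit is still emitted by itself, so the output never gets narrower than the physical batch width.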

Validation

mvn spotless:apply -pl iotdb-core/datanode
git diff --check

I also tried:

mvn -Ddevelocity.off=true -pl iotdb-core/datanode -DskipTests compile
mvn -Ddevelocity.off=true -Dmaven.main.skip=true -pl iotdb-core/datanode -Dtest=TsFileInsertionEventParserTest#testScanParserMergesBatchedAlignedValueChunkGroups+testPipeTabletRowSizeCanBeDisabledByNonPositiveValue test
mvn -pl iotdb-core/datanode -Dtest=TsFileInsertionEventParserTest#testScanParserMergesBatchedAlignedValueChunkGroups+testScanParserFlushesBatchedAlignedValueChunkGroupsByMemoryLimit+testPipeTabletRowSizeCanBeDisabledByNonPositiveValue test

These Maven compile/test attempts are blocked in this workspace by existing datanode-wide compile issues outside this PR, including generated query fill/aggregation classes and IOUtils.readFully unresolved symbols in unrelated files. The focused tests did not get executed because compilation fails before Surefire runs.

codecov · 2026-06-24T04:58:55Z

Codecov Report

❌ Patch coverage is 96.75676% with 6 lines in your changes missing coverage. Please review.
✅ Project coverage is 41.42%. Comparing base (511d08f) to head (028deef).
⚠️ Report is 20 commits behind head on master.

Files with missing lines	Patch %	Lines
...le/parser/scan/TsFileInsertionEventScanParser.java	96.25%	6 Missing ⚠️

Additional details and impacted files

@@             Coverage Diff              @@
##             master   #18010      +/-   ##
============================================
+ Coverage     41.24%   41.42%   +0.18%     
  Complexity      318      318              
============================================
  Files          5272     5281       +9     
  Lines        367956   369190    +1234     
  Branches      47610    47770     +160     
============================================
+ Hits         151769   152946    +1177     
- Misses       216187   216244      +57


sonarqubecloud · 2026-06-25T08:26:40Z

Quality Gate passed

Issues
9 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code


* Pipe: merge batched aligned chunks in scan parser
* Test pipe batched aligned chunk memory boundaries
* Pipe: fix batched aligned scan parser memory split
* Update TsFileInsertionEventParserTest.java
* Rename pending aligned chunk consumer

(cherry picked from commit f96fc58)


Caideyipi added 4 commits June 23, 2026 17:19

Pipe: merge batched aligned chunks in scan parser

66b19a5

Test pipe batched aligned chunk memory boundaries

f2bd2eb

Pipe: fix batched aligned scan parser memory split

e3d496b

Update TsFileInsertionEventParserTest.java

5a826e3

jt2594838 reviewed Jun 25, 2026

Comment thread ...org/apache/iotdb/db/pipe/event/common/tsfile/parser/scan/TsFileInsertionEventScanParser.java Outdated

Rename pending aligned chunk consumer

028deef

jt2594838 merged commit f96fc58 into master Jun 26, 2026
43 of 45 checks passed

jt2594838 deleted the fix/pipe-merge-batched-aligned-chunks branch June 26, 2026 06:19

jt2594838 merged 5 commits into master from fix/pipe-merge-batched-aligned-chunks
