
[OSS PR #18484] feat(flink): Add metrics for RocksDB index backend in bucket assigner #29

Open
yihua wants to merge 2 commits into master from oss-18484

Conversation

@yihua (Owner) commented Apr 9, 2026

Mirror of apache#18484 for automated bot review.

Original author: @cshuo
Base branch: master

Summary by CodeRabbit

  • New Features

    • Added clustering expiration with heartbeat-based rollback eligibility.
    • Introduced vector data type support with shredded variant schemas for Parquet.
    • Added SQL vector search functions for KNN similarity queries.
    • Implemented column pruning optimization for incremental reads.
    • Added catalog-backed partition discovery option.
  • Improvements

    • Enhanced log compaction with threshold-based scheduling.
    • Improved CDC record iteration and image management in Flink.
    • Centralized table service handler factory for cleaner compaction pipelines.
    • Enhanced read statistics tracking for log blocks.
  • Bug Fixes

    • Removed deprecated wrapper filesystem metrics initialization.
    • Fixed LANCE format validation in non-Spark engines.
    • Improved instant time validation and formatting for incremental queries.

coderabbitai (bot) commented Apr 9, 2026

📝 Walkthrough

Comprehensive feature expansion and architectural refactoring across Hudi: adds clustering instant expiration eligibility checking with heartbeat coordination, shredded Variant schema support for Parquet serialization, Spark SQL vector similarity search table-valued functions (single and batch modes), catalog-backed partition discovery, log compaction threshold-based scheduling, incremental query column pruning, and Flink compaction/clean handler abstraction with composite/factory patterns. Also consolidates CDC iterator implementations, enhances RocksDB metrics exposure, and enforces LANCE format Spark-engine-only constraint.

Changes

Cohort / File(s) Summary
Clustering Expiration & Eligibility
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/BaseHoodieTableServiceClient.java, hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieClusteringConfig.java, hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieWriteConfig.java, hudi-utilities/src/main/java/org/apache/hudi/utilities/HoodieClusteringJob.java
Added clustering expiration configuration (ENABLE_EXPIRATIONS, EXPIRATION_THRESHOLD_MINS), heartbeat-based instant eligibility checking, and rollback candidate filtering for clustering instants; updated UPDATES_STRATEGY inference based on conflict resolution preference.
Variant Schema & Shredding
hudi-common/src/main/java/org/apache/hudi/common/schema/HoodieSchema.java, hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieFileIndex.scala, hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/spark/sql/avro/AvroDeserializer.scala, hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/io/storage/row/HoodieRowParquetWriteSupport.java
Added Variant schema validation (no nesting), factory methods for shredded field structs and shredded Variant objects, and getPlainTypedValueSchema() extraction; enhanced Parquet writer to detect/handle shredded Variant structs via Spark adapter.
Vector Search TVFs
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/HoodieVectorSearchTableValuedFunction.scala, hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/hudi/analysis/HoodieVectorSearchPlanBuilder.scala, hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/hudi/analysis/VectorDistanceUtils.scala
Introduced vector search table-valued functions (hudi_vector_search, hudi_vector_search_batch) with distance metrics (cosine, L2, dot-product), brute-force search algorithm, and UDF-based distance computation.
CDC Consolidation
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/format/cdc/CdcIterators.java, hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/format/cdc/CdcImageManager.java, hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/format/cdc/CdcInputFormat.java
Consolidated CDC iterator implementations (base, before/after image, log file, replace-commit variants) and extracted image management into dedicated CdcImageManager with spillable maps and serialization support.
Log Compaction Scheduling
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/compact/ScheduleCompactionActionExecutor.java, hudi-common/src/main/java/org/apache/hudi/common/util/CompactionUtils.java
Added log-compaction-specific scheduling logic: needLogCompact() threshold comparison, getLatestDeltaCommitInfoSinceLogCompaction() timeline queries, and helper methods for pending log compaction discovery.
Partition Discovery via Catalog
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/DataSourceOptions.scala, hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/metadata/CatalogBackedTableMetadata.scala, hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/SparkHoodieTableFileIndex.scala
Added FILE_INDEX_PARTITION_LISTING_VIA_CATALOG config option and CatalogBackedTableMetadata to source partitions from Spark catalog instead of filesystem enumeration.
Incremental Query Column Pruning
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/IncrementalRelationV1.scala, hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/IncrementalRelationV2.scala, hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/util/IncrementalRelationUtil.scala
Updated incremental relations to implement PrunedScan instead of TableScan, enabling column pruning; added IncrementalRelationUtil for schema pruning and partition/ordering field filtering.
Flink Handler Abstraction & Factory
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/handler/CompactHandler.java, hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/handler/CompactionCommitHandler.java, hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/handler/CompactionPlanHandler.java, hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/handler/TableServiceHandlerFactory.java, hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/handler/DataTable*.java, hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/handler/Composite*.java
Converted concrete handler classes to interfaces (CompactHandler, CompactionCommitHandler, CompactionPlanHandler, CleanHandler); introduced factory pattern via TableServiceHandlerFactory and composite/data-table implementations for metadata-aware handler routing.
Rollback Refactoring
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/rollback/RollbackHelper.java, hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/rollback/RollbackHelperV1.java, hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/rollback/MarkerBasedRollbackStrategy.java
Simplified V8+ rollback by removing log-block handling paths; restructured V1 rollback to pre-compute log versions and enforced table-version preconditions in strategy implementation.
Instant Time Formatting
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/hudi/HoodieSqlCommonUtils.scala, hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/HoodieTableChanges.scala
Extended instant format support to include ISO datetimes with T separator, fractional seconds, and epoch seconds/milliseconds; introduced formatIncrementalInstant for sentinel pass-through.
Registry & Metrics
hudi-io/src/main/java/org/apache/hudi/common/metrics/Registry.java, hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/client/common/HoodieSparkEngineContext.java, hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/metrics/DistributedRegistry.java, hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/metrics/FlinkRocksDBIndexMetrics.java
Introduced table-aware compound-key registry model, distributed registry support in Spark engine context with executor-side availability, and RocksDB metrics exposure via Flink gauge registration.
Wrapper Filesystem Metrics Refactoring
hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/client/SparkRDDWriteClient.java, hudi-client/hudi-client-common/src/main/java/org/apache/hudi/util/DistributedRegistryUtil.java
Centralized wrapper filesystem metrics initialization via DistributedRegistryUtil; removed override from SparkRDDWriteClient and delegated to factory utility.
HFile & IO Utilities
hudi-io/src/main/java/org/apache/hudi/io/hfile/HFileMetaIndexBlock.java, hudi-io/src/main/java/org/apache/hudi/io/hfile/HFileRootIndexBlock.java, hudi-io/src/main/java/org/apache/hudi/io/util/IOUtils.java
Added variable-length integer encoding via IOUtils.writeVarInt() for HFile index block serialization.
LANCE Format Rejection
hudi-common/src/main/java/org/apache/hudi/common/model/HoodieFileFormat.java, hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/HoodieTableFactory.java, hudi-common/src/main/java/org/apache/hudi/io/storage/HoodieFileReaderFactory.java, hudi-common/src/main/java/org/apache/hudi/io/storage/HoodieFileWriterFactory.java
Added LANCE_SPARK_ONLY_ERROR_MSG constant and validation gates in Flink/Hive catalog and reader/writer factories to reject LANCE format for non-Spark engines.
Timeline & Schema Utilities
hudi-common/src/main/java/org/apache/hudi/common/table/timeline/BaseHoodieTimeline.java, hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieTimeline.java, hudi-common/src/main/java/org/apache/hudi/common/table/log/LogReaderUtils.java, hudi-common/src/main/java/org/apache/hudi/metadata/HoodieTableMetadata.java, hudi-common/src/main/java/org/apache/hudi/util/PartitionPathFilterUtil.java
Added timeline filtering by completion time, log file-to-instant mapping utilities, partition path prefix predicates, and metadata overload supporting predicate expressions.
Comprehensive Test Coverage
hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/functional/TestHoodieVectorSearchFunction.scala, hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/metrics/TestDistributedRegistry.java, hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/functional/TestIncrementalQueryColumnPruning.scala, hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/hudi/common/TestCatalogBackedTableMetadata.scala
Added extensive test suites for vector search (40+ test cases), distributed registry behavior, incremental column pruning, catalog-backed metadata, instant time validation, and handler composition.
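The HFile & IO Utilities row above mentions IOUtils.writeVarInt() for index-block serialization. For reference, a typical variable-length integer encoding (LEB128, as used by Protocol Buffers) looks like the sketch below; this is a generic illustration, not necessarily the exact wire format of Hudi's IOUtils.

```java
// Generic LEB128 varint codec: 7 payload bits per byte, high bit set on
// all bytes except the last. Illustrative only; not Hudi's actual format.
final class VarInt {
  static byte[] write(int value) {
    byte[] buf = new byte[5]; // a 32-bit int needs at most 5 varint bytes
    int n = 0;
    while ((value & ~0x7F) != 0) {
      buf[n++] = (byte) ((value & 0x7F) | 0x80); // low 7 bits, continuation bit set
      value >>>= 7;
    }
    buf[n++] = (byte) value; // final byte, continuation bit clear
    return java.util.Arrays.copyOf(buf, n);
  }

  static int read(byte[] bytes) {
    int result = 0;
    int shift = 0;
    int b;
    int i = 0;
    do {
      b = bytes[i++] & 0xFF;
      result |= (b & 0x7F) << shift; // accumulate 7 bits per byte
      shift += 7;
    } while ((b & 0x80) != 0); // continuation bit clear means done
    return result;
  }
}
```

Small values cost one byte and larger values grow by one byte per 7 bits, which is why offset-heavy index blocks tend to shrink under this encoding.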

Sequence Diagram(s)

sequenceDiagram
    participant Client as Spark Client
    participant Planner as Vector Search Planner
    participant Analyzer as Distance Function Builder
    participant Executor as Spark Executor
    participant Corpus as Corpus DataFrame
    participant Index as Vector Index/Broadcast

    Client->>Planner: buildSingleQueryPlan(corpusDf, embeddingCol, queryVector, k, metric)
    Planner->>Planner: validate embedding column exists & is array-typed
    Planner->>Analyzer: createDistanceUdf(metric, elementType, queryVector)
    Analyzer->>Analyzer: pre-compute query vector norm
    Analyzer-->>Planner: distance UDF
    Planner->>Planner: filter null embeddings from corpus
    Planner->>Executor: compute _hudi_distance per row via UDF
    Executor->>Executor: vectorized distance calculation (cosine/L2/dot-product)
    Planner->>Planner: order by _hudi_distance ASC
    Planner->>Planner: limit to k results
    Planner-->>Client: analyzed LogicalPlan with k nearest neighbors
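The distance-UDF steps in the diagram (pre-compute the query norm once, then score each corpus row) can be sketched as a standalone function. This is an illustrative version, not the actual UDF from HoodieVectorSearchPlanBuilder; names and signatures are hypothetical.

```java
// Cosine distance with the query-vector norm precomputed once per query,
// as the diagram describes. Hypothetical sketch, not Hudi's implementation.
final class CosineDistance {
  static double norm(double[] v) {
    double sumSquares = 0.0;
    for (double x : v) {
      sumSquares += x * x;
    }
    return Math.sqrt(sumSquares);
  }

  // queryNorm is computed once on the driver; only the row norm is
  // recomputed per corpus row.
  static double distance(double[] query, double queryNorm, double[] row) {
    double dot = 0.0;
    double rowNormSq = 0.0;
    for (int i = 0; i < query.length; i++) {
      dot += query[i] * row[i];
      rowNormSq += row[i] * row[i];
    }
    return 1.0 - dot / (queryNorm * Math.sqrt(rowNormSq));
  }
}
```

Ordering corpus rows by this distance ascending and keeping the first k rows yields the k nearest neighbors, matching the order-by/limit steps at the end of the diagram.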

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

🐰 Whiskers twitch with clustering delight,
Shredded Variants now parse just right,
Vectors dance in nearest-neighbor flight,
Handlers refactored, factory-bright,
Spark and Flink in perfect sight!



@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 3

Note

Due to the large number of review comments, Critical severity comments were prioritized as inline comments.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/BaseHoodieTableServiceClient.java (1)

733-742: ⚠️ Potential issue | 🟠 Major

Start the clustering heartbeat before persisting the requested instant.

Lines 739-742 start the heartbeat only after scheduleClustering(...) returns. That leaves a race where the pending clustering instant already exists but has no heartbeat yet, so a concurrent lazy-clean / partition rollback with hoodie.clustering.expiration.threshold.mins=0 can treat a freshly scheduled clustering as stale. Start the heartbeat first, and stop it again if no plan is produced or scheduling fails.

♻️ Proposed fix
        case CLUSTER:
          log.info("Scheduling clustering at instant time: {} for table {}", instantTime, config.getBasePath());
-          Option<HoodieClusteringPlan> clusteringPlan = table
-              .scheduleClustering(context, instantTime, extraMetadata);
-          option = clusteringPlan.map(plan -> instantTime);
-          if (option.isPresent() && config.isExpirationOfClusteringEnabled()) {
-            heartbeatClient.start(instantTime);
-            log.info("Started heartbeat for clustering instant {}", instantTime);
-          }
+          boolean heartbeatStarted = false;
+          try {
+            if (config.isExpirationOfClusteringEnabled()) {
+              heartbeatClient.start(instantTime);
+              heartbeatStarted = true;
+              log.info("Started heartbeat for clustering instant {}", instantTime);
+            }
+            Option<HoodieClusteringPlan> clusteringPlan = table
+                .scheduleClustering(context, instantTime, extraMetadata);
+            option = clusteringPlan.map(plan -> instantTime);
+            if (option.isEmpty() && heartbeatStarted) {
+              heartbeatClient.stop(instantTime);
+            }
+          } catch (Throwable t) {
+            if (heartbeatStarted) {
+              heartbeatClient.stop(instantTime);
+            }
+            throw t;
+          }
           break;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/BaseHoodieTableServiceClient.java`
around lines 733 - 742, Start the clustering heartbeat before calling
table.scheduleClustering(...) to avoid a race where a pending clustering instant
exists without a heartbeat; in BaseHoodieTableServiceClient change the CLUSTER
branch so heartbeatClient.start(instantTime) is invoked prior to calling
table.scheduleClustering(context, instantTime, extraMetadata), and if
scheduleClustering returns empty or throws, stop the heartbeat
(heartbeatClient.stop(instantTime) or equivalent) and avoid leaving a running
heartbeat. Ensure existing checks using option/isExpirationOfClusteringEnabled
remain (only start/stop when config.isExpirationOfClusteringEnabled() is true)
and that failures still propagate after cleanup.
hudi-common/src/main/java/org/apache/hudi/common/schema/HoodieSchema.java (1)

421-447: ⚠️ Potential issue | 🟠 Major

Nested VECTORs still slip through ARRAY<STRUCT<...>> and MAP<STRING, STRUCT<...>>.

createArray / createMap only reject direct VECTOR element/value schemas, and validateNoVectorInNestedRecord only walks child RECORDs. A schema like array(record{embedding: VECTOR}) or map<string, record{embedding: VECTOR}> still passes construction even though the new contract says VECTOR inside ARRAY/MAP is unsupported. That leaves invalid schemas to fail much later instead of being rejected here.

Possible fix
-  private static void validateNoVectorInNestedRecord(List<HoodieSchemaField> fields, boolean nested) {
-    for (HoodieSchemaField field : fields) {
-      HoodieSchema nonNull = field.schema().getNonNullType();
-      if (nested && nonNull.getType() == HoodieSchemaType.VECTOR) {
-        throw new HoodieSchemaException(
-            "VECTOR column '" + field.name() + "' must be a top-level field. "
-                + "Nested VECTOR columns (inside STRUCT, ARRAY, or MAP) are not supported.");
-      }
-      if (nonNull.getType() == HoodieSchemaType.RECORD) {
-        validateNoVectorInNestedRecord(nonNull.getFields(), true);
-      }
-    }
-  }
+  private static void validateNoVectorInNestedRecord(List<HoodieSchemaField> fields, boolean nested) {
+    for (HoodieSchemaField field : fields) {
+      validateNoNestedVector(field.schema(), nested, field.name());
+    }
+  }
+
+  private static void validateNoNestedVector(HoodieSchema schema, boolean nested, String fieldName) {
+    HoodieSchema nonNull = schema.getNonNullType();
+    if (nested && nonNull.getType() == HoodieSchemaType.VECTOR) {
+      throw new HoodieSchemaException(
+          "VECTOR column '" + fieldName + "' must be a top-level field. "
+              + "Nested VECTOR columns (inside STRUCT, ARRAY, or MAP) are not supported.");
+    }
+    switch (nonNull.getType()) {
+      case RECORD:
+        validateNoVectorInNestedRecord(nonNull.getFields(), true);
+        break;
+      case ARRAY:
+        validateNoNestedVector(nonNull.getElementType(), true, fieldName);
+        break;
+      case MAP:
+        validateNoNestedVector(nonNull.getValueType(), true, fieldName);
+        break;
+      case UNION:
+        nonNull.getTypes().forEach(t -> validateNoNestedVector(t, nested, fieldName));
+        break;
+      default:
+    }
+  }

Also applies to: 484-516

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@hudi-common/src/main/java/org/apache/hudi/common/schema/HoodieSchema.java`
around lines 421 - 447, createArray and createMap only reject VECTOR when it's
the direct element/value type but allow VECTOR nested inside RECORDs (e.g.,
ARRAY<RECORD{embedding: VECTOR}>); update createArray(HoodieSchema) and
createMap(HoodieSchema) to recursively validate the element/value schema for
nested VECTORs (either call or extend validateNoVectorInNestedRecord to traverse
RECORD children and throw HoodieSchemaException on any nested VECTOR), ensuring
any RECORD-containing element/value is scanned for VECTOR fields before building
the Avro array/map schema; apply the same recursive check used elsewhere (see
validateNoVectorInNestedRecord and the VECTOR-related error messages) so nested
VECTORs are rejected early.
🟠 Major comments (21)
hudi-hadoop-common/src/main/java/org/apache/hudi/common/util/ParquetReaderIterator.java-58-63 (1)

58-63: ⚠️ Potential issue | 🟠 Major

Route error-path cleanup through close() to preserve idempotent close semantics.

Line 60 now closes on EOF, but error paths still close the raw parquetReader directly. That bypasses the new closed guard and can trigger duplicate ParquetReader.close() calls.

💡 Proposed fix
@@
     } catch (Exception e) {
-      FileIOUtils.closeQuietly(parquetReader);
+      try {
+        close();
+      } catch (Exception ignore) {
+        // no-op: preserve original failure context
+      }
       throw new HoodieException("unable to read next record from parquet file ", e);
     }
   }
@@
     } catch (Exception e) {
-      FileIOUtils.closeQuietly(parquetReader);
+      try {
+        close();
+      } catch (Exception ignore) {
+        // no-op: preserve original failure context
+      }
       throw new HoodieException("unable to read next record from parquet file ", e);
     }
   }

Also applies to: 88-94

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@hudi-hadoop-common/src/main/java/org/apache/hudi/common/util/ParquetReaderIterator.java`
around lines 58 - 63, The catch blocks in ParquetReaderIterator should not call
parquetReader.close() directly (which bypasses the iterator's closed guard) —
instead route cleanup through the iterator's close() method; locate the
try/catch sections around hasNext/next (references: ParquetReaderIterator,
close(), parquetReader) and replace direct parquetReader.close() invocations in
error paths with a call to this.close() (or close()) while rethrowing or
wrapping the original exception as before so idempotent close semantics are
preserved; apply the same change to the similar error-paths around lines 88-94.
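The idempotent-close pattern this comment asks for reduces to a guarded flag: error paths route through close(), and close() itself closes the underlying resource at most once. A minimal standalone sketch (names are illustrative, not the actual ParquetReaderIterator API):

```java
// Sketch of an idempotent close guard. The Runnable stands in for the
// underlying reader's close(); the counter exists only to make the
// at-most-once behavior observable in this example.
final class IdempotentCloser implements AutoCloseable {
  private final Runnable closeAction;
  private boolean closed;
  int delegateCloseCalls;

  IdempotentCloser(Runnable closeAction) {
    this.closeAction = closeAction;
  }

  @Override
  public void close() {
    if (closed) {
      return; // already closed: calling again is a no-op
    }
    closed = true;
    delegateCloseCalls++;
    closeAction.run();
  }
}
```

With this shape, an error path that calls close() followed by a caller's own close() in a finally block still results in exactly one close of the delegate.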
hudi-hadoop-common/src/test/java/org/apache/hudi/common/util/TestParquetReaderIterator.java-68-68 (1)

68-68: ⚠️ Potential issue | 🟠 Major

Avoid asserting duplicate close() calls; this locks in unintended behavior.

Line 68 currently validates double-close behavior. With idempotent iterator close semantics, this should assert a single close call.

✅ Suggested assertion update
-    verify(reader, times(2)).close();
+    verify(reader, times(1)).close();
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@hudi-hadoop-common/src/test/java/org/apache/hudi/common/util/TestParquetReaderIterator.java`
at line 68, The test currently asserts duplicate close calls on the Parquet
reader (verify(reader, times(2)).close()), which locks in non-idempotent
behavior; update the assertion in TestParquetReaderIterator to verify a single
close instead (e.g., use verify(reader).close() or verify(reader,
times(1)).close()) so the test expects idempotent iterator close semantics; keep
the rest of the test logic and mocks (reader, iterator code paths) unchanged.
hudi-spark-datasource/hudi-spark3.3.x/src/main/scala/org/apache/spark/sql/hudi/analysis/Spark33HoodiePruneFileSourcePartitions.scala-63-65 (1)

63-65: ⚠️ Potential issue | 🟠 Major

Guard filterFileSlices behind actual partition predicates.

At Line 63 this mutates the shared HoodieFileIndex before Line 65 proves there is anything to prune. Queries with only data filters—or no filters at all—will still mark pushdown as applied and then return op unchanged, which can suppress later pruning passes.

Suggested fix
-      // [[HudiFileIndex]] is a caching one, therefore we don't need to reconstruct new relation,
-      // instead we simply just refresh the index and update the stats
-      fileIndex.filterFileSlices(dataFilters, partitionPruningFilters, isPartitionPruned = true)
-
       if (partitionPruningFilters.nonEmpty) {
+        // [[HudiFileIndex]] is a caching one, therefore we don't need to reconstruct new relation,
+        // instead we simply refresh the index and update the stats when partition pruning actually applies
+        fileIndex.filterFileSlices(dataFilters, partitionPruningFilters, isPartitionPruned = true)
+
         // Change table stats based on the sizeInBytes of pruned files
         val filteredStats = FilterEstimation(Filter(partitionPruningFilters.reduce(And), lr)).estimate
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@hudi-spark-datasource/hudi-spark3.3.x/src/main/scala/org/apache/spark/sql/hudi/analysis/Spark33HoodiePruneFileSourcePartitions.scala`
around lines 63 - 65, Move the call that mutates the shared HoodieFileIndex so
it only runs when there are actual partition predicates: wrap or relocate the
fileIndex.filterFileSlices(dataFilters, partitionPruningFilters,
isPartitionPruned = true) call inside the if (partitionPruningFilters.nonEmpty)
block (the same block that subsequently marks pushdown applied and returns op)
so that fileIndex is not modified for queries with only data filters or no
filters; update any logic that marks pushdown as applied (related to the same
conditional) to only execute when partitionPruningFilters is nonEmpty to avoid
suppressing later pruning passes.
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/StreamWriteOperatorCoordinator.java-333-336 (1)

333-336: ⚠️ Potential issue | 🟠 Major

Don't use executor != null as the recovery-state check here.

close() shuts the executor/client resources down but leaves the fields non-null. StreamWriteFunctionWrapper.coordinatorFails() does close() -> resetToCheckpoint() -> start() on the same coordinator instance, so this branch can call restoreEvents() on already-closed resources before restart. Gate this on an explicit started flag, or clear the executor/client fields in close().

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/StreamWriteOperatorCoordinator.java`
around lines 333 - 336, The check using executor != null is unsafe because
close() leaves fields non-null; change the guard around the restoreEvents call
to use an explicit lifecycle flag or cleared resources: add or use a boolean
like started (set true in start() and false in close()) and replace "if
(executor != null && tableState.isRLIWithBootstrap)" with "if (started &&
tableState.isRLIWithBootstrap)"; alternatively, ensure close() nulls/cleans
executor and any client fields so the existing check is valid; make sure
restoreEvents(), restore/resetToCheckpoint(), and
StreamWriteFunctionWrapper.coordinatorFails() lifecycle interactions are
consistent with the chosen approach.
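The explicit lifecycle flag suggested above can be sketched in isolation; the point is that a field null-check keeps passing after close() while a started flag does not. All names here are hypothetical, not the actual coordinator API.

```java
// Sketch: close() shuts resources down but leaves the field non-null,
// mirroring the coordinator the review describes, so only the explicit
// `started` flag reflects the real lifecycle state.
final class CoordinatorLifecycle {
  private volatile boolean started;
  private Object executor;

  void start() {
    executor = new Object();
    started = true;
  }

  void close() {
    // Resources are released here, but the field is deliberately not
    // nulled out, as in the reviewed class.
    started = false;
  }

  boolean executorFieldIsNonNull() {
    return executor != null; // the unsafe guard the review flags
  }

  boolean canRestoreEvents() {
    return started; // the safe guard
  }
}
```

After a close()/start() cycle, as StreamWriteFunctionWrapper.coordinatorFails() performs, only the flag-based guard correctly blocks restoreEvents() between close() and the next start().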
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/compact/ScheduleCompactionActionExecutor.java-243-251 (1)

243-251: ⚠️ Potential issue | 🟠 Major

Compare the log-compaction blocks threshold against block counts, not delta-commit counts.

numDeltaCommitsSince is a commit count, but config.getLogCompactionBlocksThreshold() is a block-based threshold. A single delta commit can append multiple log blocks in metadata partitions, so this can delay log compaction well past the configured limit or trigger it too early. This decision needs to use the accumulated log-block metric instead of countInstants().

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/compact/ScheduleCompactionActionExecutor.java`
around lines 243 - 251, needLogCompact currently compares delta-commit counts
(from latestDeltaCommitInfoSinceCompact and
getLatestDeltaCommitInfoSinceLogCompaction) against
config.getLogCompactionBlocksThreshold(), but the threshold is in log blocks not
commits; update needLogCompact to use the accumulated log-block counts instead
of commit counts—obtain the log-block metric for the range represented by
latestDeltaCommitInfoSinceCompact and the value from
getLatestDeltaCommitInfoSinceLogCompaction (or extend those info objects to
carry block counts), compute the min accumulated log blocks and compare that
against config.getLogCompactionBlocksThreshold() so the decision uses block
counts (references: needLogCompact, latestDeltaCommitInfoSinceCompact,
getLatestDeltaCommitInfoSinceLogCompaction,
config.getLogCompactionBlocksThreshold()).
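To make the mismatch concrete, the sketch below contrasts a block-based decision with the commit-count heuristic the comment flags. Names are hypothetical and this is not Hudi's actual scheduling API; it only illustrates how the two counts diverge when one delta commit appends multiple log blocks.

```java
// Each array element is the number of log blocks appended by one delta commit.
final class LogCompactionTrigger {
  // Block-based decision: sum log blocks across delta commits and compare
  // against the (block-denominated) threshold.
  static boolean needLogCompactByBlocks(int[] logBlocksPerDeltaCommit, int blocksThreshold) {
    int totalBlocks = 0;
    for (int blocks : logBlocksPerDeltaCommit) {
      totalBlocks += blocks;
    }
    return totalBlocks >= blocksThreshold;
  }

  // Commit-count heuristic the review flags: counts delta commits against
  // a threshold that is actually expressed in blocks.
  static boolean needLogCompactByCommits(int[] logBlocksPerDeltaCommit, int blocksThreshold) {
    return logBlocksPerDeltaCommit.length >= blocksThreshold;
  }
}
```

With two delta commits appending four and three blocks and a threshold of five blocks, the block-based check fires while the commit-count check does not, which is exactly the delayed-compaction case the comment describes.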
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/util/DistributedRegistryUtil.java-48-51 (1)

48-51: ⚠️ Potential issue | 🟠 Major

Use LocalRegistry explicitly in the non-executor branch.

This branch currently derives the registry class from Registry.getRegistry(...), so the implementation chosen here depends on unrelated process-global state instead of the current config. In a JVM that has already created a different registry type under the bare wrapper-FS name, executor-disabled runs can bind to the wrong registry implementation.

Suggested fix
+import org.apache.hudi.common.metrics.LocalRegistry;
 import org.apache.hudi.common.metrics.Registry;
@@
       } else {
         registry = Registry.getRegistryOfClass(config.getTableName(), HoodieWrapperFileSystem.REGISTRY_NAME,
-            Registry.getRegistry(HoodieWrapperFileSystem.REGISTRY_NAME).getClass().getName());
+            LocalRegistry.class.getName());
         registryMeta = Registry.getRegistryOfClass(config.getTableName(), HoodieWrapperFileSystem.REGISTRY_META_NAME,
-            Registry.getRegistry(HoodieWrapperFileSystem.REGISTRY_META_NAME).getClass().getName());
+            LocalRegistry.class.getName());
       }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@hudi-client/hudi-client-common/src/main/java/org/apache/hudi/util/DistributedRegistryUtil.java`
around lines 48 - 51, The non-executor branch in DistributedRegistryUtil
currently infers the registry implementation via
Registry.getRegistry(...).getClass().getName(), which can pick up a
process-global registry type; change the calls to
Registry.getRegistryOfClass(...) for HoodieWrapperFileSystem.REGISTRY_NAME and
REGISTRY_META_NAME to explicitly pass LocalRegistry.class.getName() (i.e., use
LocalRegistry as the registry implementation for both registry and
registryMeta), and add the necessary import for LocalRegistry so the
non-executor path always binds to the local implementation.
hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/client/common/HoodieSparkEngineContext.java-282-288 (1)

282-288: ⚠️ Potential issue | 🟠 Major

Use an unambiguous cache key for distributed registries.

Line 283 builds the cache key as tableName + "." + registryName, which can alias distinct pairs like ("db.tbl", "rocksdb") and ("db", "tbl.rocksdb"). That can return the wrong Registry instance and cross-wire metrics between tables. Reuse Registry.makeKey(tableName, registryName) here as well.

♻️ Proposed fix
  @Override
  public Registry getMetricRegistry(String tableName, String registryName) {
-    final String prefixedName = tableName.isEmpty() ? registryName : tableName + "." + registryName;
-    return DISTRIBUTED_REGISTRY_MAP.computeIfAbsent(prefixedName, key -> {
+    final String registryKey = Registry.makeKey(tableName, registryName);
+    return DISTRIBUTED_REGISTRY_MAP.computeIfAbsent(registryKey, key -> {
       Registry registry = Registry.getRegistryOfClass(tableName, registryName, DistributedRegistry.class.getName());
       ((DistributedRegistry) registry).register(javaSparkContext);
       return registry;
     });
   }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/client/common/HoodieSparkEngineContext.java`
around lines 282 - 288, The cache key in
HoodieSparkEngineContext.getMetricRegistry currently concatenates tableName +
"." + registryName which can alias different (tableName, registryName) pairs;
change the key to use Registry.makeKey(tableName, registryName) when calling
DISTRIBUTED_REGISTRY_MAP.computeIfAbsent so the map uses the unambiguous
Registry key; keep the rest of the lambda intact (create via
Registry.getRegistryOfClass(...), cast to DistributedRegistry, call
register(javaSparkContext), and return the registry).
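The aliasing this comment describes is easy to reproduce: plain "." concatenation maps distinct (tableName, registryName) pairs to the same key. The sketch below shows the collision and one unambiguous alternative; the length-prefix variant is purely illustrative, and the review's actual fix is to reuse Registry.makeKey.

```java
// Demonstrates why naive "." concatenation aliases distinct key pairs.
final class RegistryKeys {
  // The ambiguous scheme the review flags.
  static String naiveKey(String tableName, String registryName) {
    return tableName.isEmpty() ? registryName : tableName + "." + registryName;
  }

  // One unambiguous alternative (illustrative only): length-prefix the
  // table name so the boundary between the two parts is unambiguous.
  static String unambiguousKey(String tableName, String registryName) {
    return tableName.length() + ":" + tableName + "." + registryName;
  }
}
```

("db.tbl", "rocksdb") and ("db", "tbl.rocksdb") both naively produce "db.tbl.rocksdb", while any boundary-preserving scheme keeps them distinct.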
hudi-io/src/main/java/org/apache/hudi/common/metrics/Registry.java-149-156 (1)

149-156: ⚠️ Potential issue | 🟠 Major

Preserve the original registry keys when shipping registries to executors.

Lines 153-154 re-register every registry under makeKey("", registry.getName()). That drops the original tableName / registryName split, so executor-side Registry.getRegistryOfClass(tableName, registryName, ...) won't find the pre-registered instance for table-scoped metrics and will create a second registry instead.

♻️ Proposed fix
-  static void setRegistries(Collection<Registry> registries) {
-    for (Registry registry : registries) {
-      REGISTRY_MAP.putIfAbsent(makeKey("", registry.getName()), registry);
-    }
+  static void setRegistries(Map<String, Registry> registries) {
+    registries.forEach(REGISTRY_MAP::putIfAbsent);
   }

Apply the corresponding call-site change in hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/client/common/HoodieSparkEngineContext.java:

   private static void setRegistries(Map<String, Registry> registries) {
-    Registry.setRegistries(registries.values());
+    Registry.setRegistries(registries);
   }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@hudi-io/src/main/java/org/apache/hudi/common/metrics/Registry.java` around
lines 149 - 156, setRegistries currently re-registers every Registry under
makeKey("", registry.getName()), dropping the original tableName scope so
executor-side Registry.getRegistryOfClass(tableName, registryName, ...) can't
find pre-registered instances; fix by preserving the original key components
when inserting into REGISTRY_MAP (e.g., change the putIfAbsent call to use the
registry's actual table name instead of "" — use registry.getTableName() (or the
registry.getKey()/appropriate accessor that returns the original tableName) with
registry.getName() in makeKey) and ensure the corresponding call-site change is
applied in HoodieSparkEngineContext to keep keys consistent across driver and
executors.
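A minimal sketch of the proposed direction (the `|` key separator here is purely illustrative): ship the driver-side map through verbatim so an executor-side lookup using the same table-scoped key still hits the pre-registered instance instead of creating a second registry.

```java
import java.util.HashMap;
import java.util.Map;

public class RegistryKeyDemo {
  static final Map<String, Object> REGISTRY_MAP = new HashMap<>();

  // Proposed shape: preserve the original keys instead of re-deriving them
  // from the registry name alone (which drops the tableName component).
  public static void setRegistries(Map<String, Object> registries) {
    registries.forEach(REGISTRY_MAP::putIfAbsent);
  }

  public static void main(String[] args) {
    Map<String, Object> fromDriver = new HashMap<>();
    fromDriver.put("db.tbl|rocksdb", new Object()); // table-scoped key (illustrative format)
    setRegistries(fromDriver);
    // Executor-side lookup with the same table-scoped key now succeeds:
    System.out.println(REGISTRY_MAP.containsKey("db.tbl|rocksdb"));
  }
}
```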
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/format/cdc/CdcImageManager.java-169-175 (1)

169-175: ⚠️ Potential issue | 🟠 Major

Guard against skipBytes() returning zero.

This loop can hang forever on truncated or corrupt buffers. DataInputStream.skipBytes(int) is not guaranteed to make progress and can return 0, leaving numBytes unchanged and causing an infinite loop.

Suggested fix
     `@Override`
     public void skipBytesToRead(int numBytes) throws IOException {
       while (numBytes > 0) {
         int skipped = skipBytes(numBytes);
+        if (skipped <= 0) {
+          throw new IOException("Unable to skip remaining " + numBytes + " bytes");
+        }
         numBytes -= skipped;
       }
     }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/format/cdc/CdcImageManager.java`
around lines 169 - 175, The skipBytesToRead method can loop forever if
skipBytes(numBytes) returns 0; update skipBytesToRead to detect a zero return
from skipBytes and fail fast (throw an EOFException or IOException) instead of
looping endlessly, or alternatively implement a fallback that reads and discards
bytes (e.g., readFully into a temporary buffer) when skipBytes makes no
progress; the change should be made in skipBytesToRead and reference
DataInputStream.skipBytes(int) and skipBytes(numBytes) so callers of
CdcImageManager.skipBytesToRead will get a clear error rather than hang.
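A self-contained sketch of the guard (helper name is hypothetical): `DataInputStream.skipBytes(int)` makes no progress guarantee, so a zero return on a truncated stream must fail fast rather than retry forever.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;

public class SafeSkipDemo {
  public static void skipFully(DataInputStream in, int numBytes) throws IOException {
    while (numBytes > 0) {
      int skipped = in.skipBytes(numBytes);
      if (skipped <= 0) {
        // No progress: the stream is truncated or corrupt. Without this check
        // the loop spins forever with numBytes unchanged.
        throw new EOFException("Unable to skip remaining " + numBytes + " bytes");
      }
      numBytes -= skipped;
    }
  }

  public static void main(String[] args) throws IOException {
    // Only 4 bytes are available but we ask to skip 10: fail fast, don't hang.
    DataInputStream in = new DataInputStream(new ByteArrayInputStream(new byte[4]));
    try {
      skipFully(in, 10);
    } catch (EOFException expected) {
      System.out.println("failed fast: " + expected.getMessage());
    }
  }
}
```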
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/format/cdc/CdcIterators.java-279-285 (1)

279-285: ⚠️ Potential issue | 🟠 Major

Close the spillable map when this iterator created it locally.

Lines 284-285 allocate a standalone spillable map when there is no before-slice, but Lines 334-338 never close that map. That leaks temp files / memory on the no-before-image path.

🧹 Proposed fix
     private ExternalSpillableMap<String, byte[]> beforeImages;
+    private boolean ownsBeforeImages;
@@
       } else {
         this.beforeImages = FormatUtils.spillableMap(writeConfig, maxCompactionMemoryInBytes, getClass().getSimpleName());
+        this.ownsBeforeImages = true;
       }
@@
     public void close() {
       logRecordIterator.close();
+      if (ownsBeforeImages && beforeImages != null) {
+        beforeImages.close();
+        beforeImages = null;
+      }
       imageManager.close();
     }

Also applies to: 334-338

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/format/cdc/CdcIterators.java`
around lines 279 - 285, The initImages method can create a standalone spillable
map via FormatUtils.spillableMap when fileSplit.getBeforeFileSlice() is absent,
but that map is never closed leading to leaked temp files; update initImages to
record whether beforeImages was created locally (e.g., a boolean field like
beforeImagesOwned) when using FormatUtils.spillableMap, and then in the
iterator/class close method (the one that currently closes resources around
lines 334-338) call beforeImages.close() only if beforeImagesOwned is true;
ensure similar ownership tracking/closing is added for any analogous afterImages
creation path if present.
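The ownership rule behind the proposed fix, as a hypothetical standalone sketch: close a resource only if this object created it; a resource handed in from outside belongs to the caller.

```java
import java.io.Closeable;

public class OwnedResourceDemo implements Closeable {
  public static class SpillableMap implements Closeable {
    public boolean closed;
    @Override public void close() { closed = true; }
  }

  private final SpillableMap beforeImages;
  private final boolean ownsBeforeImages;

  public OwnedResourceDemo(SpillableMap shared) {
    if (shared != null) {
      this.beforeImages = shared;             // borrowed: the caller closes it
      this.ownsBeforeImages = false;
    } else {
      this.beforeImages = new SpillableMap(); // created locally: we must close it
      this.ownsBeforeImages = true;
    }
  }

  public SpillableMap map() { return beforeImages; }

  @Override
  public void close() {
    if (ownsBeforeImages) {
      beforeImages.close(); // only release what we allocated ourselves
    }
  }
}
```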
hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieTimeline.java-490-493 (1)

490-493: ⚠️ Potential issue | 🟠 Major

This new abstract method on a public interface is a binary-incompatible API change.

Adding a non-default abstract method to HoodieTimeline breaks binary compatibility: any external implementation of this interface will fail to compile against the upgraded version, since it must now implement the new method. While internal implementations in this repo are already handled (BaseHoodieTimeline provides the implementation), external adopters would break.

Use a default method instead to maintain backward compatibility:

Compatibility-preserving approach
-  HoodieTimeline findInstantsModifiedBeforeOrEqualsByCompletionTime(String instantTime);
+  default HoodieTimeline findInstantsModifiedBeforeOrEqualsByCompletionTime(String instantTime) {
+    throw new UnsupportedOperationException("Not implemented");
+  }

Then override it in BaseHoodieTimeline with the actual implementation.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieTimeline.java`
around lines 490 - 493, You added a new abstract method
findInstantsModifiedBeforeOrEqualsByCompletionTime(String instantTime) to the
public HoodieTimeline interface which is a binary-incompatible change; change it
to a default method on HoodieTimeline (provide a reasonable default such as
delegating to existing APIs or throwing UnsupportedOperationException) to
preserve ABI, and then override/implement the real logic in BaseHoodieTimeline
by adding a concrete override of
findInstantsModifiedBeforeOrEqualsByCompletionTime that contains the current
implementation. Ensure method signature and javadoc remain identical and
reference HoodieTimeline and BaseHoodieTimeline when making the edits.
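The pattern in miniature (names are hypothetical, not the real Hudi types): the default body keeps implementors written against the old interface compiling and running, while the in-repo base class overrides it with the real logic.

```java
public class DefaultMethodDemo {
  interface Timeline {
    String latestInstant();

    // Added later: the default body preserves compatibility for implementors
    // that were written before this method existed.
    default String findInstantsModifiedBeforeOrEquals(String instantTime) {
      throw new UnsupportedOperationException("Not implemented");
    }
  }

  // In-repo base class: overrides the default with the actual implementation.
  static class BaseTimeline implements Timeline {
    public String latestInstant() { return "002"; }
    public String findInstantsModifiedBeforeOrEquals(String instantTime) {
      return "instants <= " + instantTime;
    }
  }

  // Simulates an external implementor that predates the new method:
  // it never mentions it, yet still compiles and runs.
  static class ExternalTimeline implements Timeline {
    public String latestInstant() { return "001"; }
  }

  public static void main(String[] args) {
    Timeline internal = new BaseTimeline();
    System.out.println(internal.findInstantsModifiedBeforeOrEquals("001"));
    try {
      new ExternalTimeline().findInstantsModifiedBeforeOrEquals("001");
    } catch (UnsupportedOperationException e) {
      System.out.println("external impl hits the default: " + e.getMessage());
    }
  }
}
```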
hudi-common/src/main/java/org/apache/hudi/common/schema/HoodieSchema.java-922-940 (1)

922-940: ⚠️ Potential issue | 🟠 Major

Use unique Avro record names for shredded Variant helper structs.

createShreddedFieldStruct uses the field key as the record name, and createVariantShreddedObject hard-codes the enclosing typed record name to typed_value. Avro named types are global by full name within a schema, so two shredded objects in the same schema can collide and the later definition can alias the first one during serialization. These helper-generated record names need to be derived from the parent path/namespace, not reused raw like this.

Suggested direction
-  public static HoodieSchema createShreddedFieldStruct(String fieldName, HoodieSchema fieldType) {
+  public static HoodieSchema createShreddedFieldStruct(
+      String recordName, String fieldName, String namespace, HoodieSchema fieldType) {
     ValidationUtils.checkArgument(fieldName != null && !fieldName.isEmpty(), "Field name cannot be null or empty");
     ValidationUtils.checkArgument(fieldType != null, "Field type cannot be null");
     List<HoodieSchemaField> fields = Arrays.asList(
       ...
     );
-    return HoodieSchema.createRecord(fieldName, null, null, fields);
+    return HoodieSchema.createRecord(recordName, namespace, null, fields);
   }
...
-      HoodieSchema fieldStruct = createShreddedFieldStruct(entry.getKey(), entry.getValue());
+      String recordName = name + "_" + entry.getKey() + "_shredded";
+      HoodieSchema fieldStruct = createShreddedFieldStruct(recordName, entry.getKey(), namespace, entry.getValue());
...
-    HoodieSchema typedValueSchema = HoodieSchema.createRecord(
-        Variant.VARIANT_TYPED_VALUE_FIELD, namespace, null, typedValueFields);
+    HoodieSchema typedValueSchema = HoodieSchema.createRecord(
+        name + "_typed_value", namespace, null, typedValueFields);

Also applies to: 988-1007

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@hudi-common/src/main/java/org/apache/hudi/common/schema/HoodieSchema.java`
around lines 922 - 940, The helper generators createShreddedFieldStruct and
createVariantShreddedObject currently produce Avro record names that can collide
(createShreddedFieldStruct uses the raw fieldName and
createVariantShreddedObject uses the hard-coded "typed_value"), so change both
to emit globally-unique Avro record names derived from the parent's qualified
path/namespace (e.g., incorporate the parent path or namespace plus the
fieldName and a stable suffix like "_shredded" or "_typed_value") rather than
raw names; update the calls to HoodieSchema.createRecord in
createShreddedFieldStruct and the record-name used in
createVariantShreddedObject to build and pass that fully-qualified unique name
(ensure Variant.VARIANT_TYPED_VALUE_FIELD and Variant.VARIANT_VALUE_FIELD usage
remains the same and only the record name/namespace generation is changed).
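The naming scheme from the suggested direction, isolated as a hypothetical sketch (no Avro dependency; the `_shredded` suffix mirrors the diff above): raw field names can repeat across shredded objects, but deriving the record name from the parent's qualified name makes each generated name unique.

```java
public class UniqueRecordNameDemo {
  // Derive the generated Avro record name from the parent path, not the raw
  // field name alone, so identical field names in different parents don't alias.
  public static String shreddedRecordName(String parentName, String fieldName) {
    return parentName + "_" + fieldName + "_shredded";
  }

  public static void main(String[] args) {
    // Two shredded objects both containing a field named "value":
    String a = shreddedRecordName("event_payload", "value");
    String b = shreddedRecordName("user_profile", "value");
    System.out.println(a + " vs " + b); // distinct names, no Avro full-name collision
  }
}
```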
hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/hudi/common/TestCatalogBackedTableMetadata.scala-45-62 (1)

45-62: ⚠️ Potential issue | 🟠 Major

This test never actually disables the metadata table.

The scenario claims to validate catalog-backed partition listing when metadata is disabled, but the table properties here never set 'hoodie.metadata.enable'='false'. If metadata is enabled by default in the test environment, this case still exercises the metadata-table path and misses the fallback behavior it is supposed to cover.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/hudi/common/TestCatalogBackedTableMetadata.scala`
around lines 45 - 62, The test in TestCatalogBackedTableMetadata.scala creates
the table for the scenario but never disables the metadata table; update the
CREATE TABLE statement (the SQL executed where targetTable is created) to
include the table property 'hoodie.metadata.enable'='false' in the tblproperties
so the test actually exercises the non-metadata fallback path. Locate the SQL
block that creates targetTable and add the 'hoodie.metadata.enable'='false'
property alongside the existing 'primaryKey', 'type', and 'preCombineField'
entries.
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/IncrementalRelationV2.scala-221-223 (1)

221-223: ⚠️ Potential issue | 🟠 Major

Keep datasource filter columns available until the incremental filters have run.

This has the same pruning problem as IncrementalRelationV1: prunedSchema only reflects requiredColumns, but the datasource-option filters are still applied afterwards. If those predicates reference non-projected columns, Line 292 drops them before evaluation and the scan fails with an unresolved-column error. Include filter columns in prunedSchema, or defer filterRequiredColumnsFromDF until after filters.foldLeft(...).

Also applies to: 292-293

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/IncrementalRelationV2.scala`
around lines 221 - 223, prunedSchema produced by
IncrementalRelationUtil.getPrunedSchema in IncrementalRelationV2 only uses
requiredColumns, but datasource-option filters are applied later
(filters.foldLeft(...)) and filterRequiredColumnsFromDF currently drops columns
needed by those predicates causing unresolved-column errors; fix by including
filter-referenced columns when building prunedSchema (e.g., compute all filter
columns from filters and merge them with requiredColumns before calling
IncrementalRelationUtil.getPrunedSchema) or alternatively move the call to
filterRequiredColumnsFromDF to after filters.foldLeft(...) so predicates see
needed columns; update references around prunedSchema, filters.foldLeft, and
filterRequiredColumnsFromDF accordingly.
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/IncrementalRelationV1.scala-231-233 (1)

231-233: ⚠️ Potential issue | 🟠 Major

Keep datasource filter columns in the scan schema until PUSH_DOWN_INCR_FILTERS has been applied.

prunedSchema is built only from requiredColumns, but Line 309 still applies filters from the datasource options. If one of those predicates references a non-projected data column, filterRequiredColumnsFromDF removes it first and the scan fails with an unresolved-column error. Include filter-referenced columns in prunedSchema, or move the filterRequiredColumnsFromDF call to after filters.foldLeft(...).

Also applies to: 308-310

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/IncrementalRelationV1.scala`
around lines 231 - 233, The prunedSchema (built via
IncrementalRelationUtil.getPrunedSchema) currently only uses requiredColumns and
thus drops columns referenced by datasource `filters` before
PUSH_DOWN_INCR_FILTERS is applied, causing unresolved-column errors when those
predicates are applied; fix by ensuring filter-referenced columns are retained
in the scan schema: either expand the input to getPrunedSchema to include
columns referenced by `filters` (union requiredColumns with the columns
extracted from `filters`), or move the call to filterRequiredColumnsFromDF to
after the `filters.foldLeft(...)` block so datasource filters are applied before
any projection removes referenced columns; update the code around prunedSchema,
IncrementalRelationUtil.getPrunedSchema, filters.foldLeft, and
filterRequiredColumnsFromDF accordingly.
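The first suggested option for both V1 and V2, sketched language-agnostically in Java (names are hypothetical): widen the pruning input to the union of the required columns and every column a pushed-down filter references, so the later filter evaluation can still resolve them.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class FilterColumnUnionDemo {
  public static Set<String> columnsForPruning(List<String> requiredColumns,
                                              List<String> filterReferencedColumns) {
    Set<String> union = new LinkedHashSet<>(requiredColumns);
    union.addAll(filterReferencedColumns); // keep filter columns until filters have run
    return union;
  }

  public static void main(String[] args) {
    // The query projects only "id", but a datasource-option filter references "ts";
    // pruning to requiredColumns alone would drop "ts" before the predicate runs.
    System.out.println(columnsForPruning(List.of("id"), List.of("ts")));
  }
}
```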
hudi-common/src/main/java/org/apache/hudi/BaseHoodieTableFileIndex.java-361-369 (1)

361-369: ⚠️ Potential issue | 🟠 Major

Preserve incremental-range partition filtering in the new overload.

Unlike the overload on Line 383, this path never restricts results to TimelineUtils.getWrittenPartitions(findInstantsInRange()). Once Spark pushes partition predicates down through this method, incremental reads can pick up matching partitions that were not written in the requested instant range, which expands the scan well beyond the incremental window. Apply the same incremental-query branch here before delegating to tableMetadata.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@hudi-common/src/main/java/org/apache/hudi/BaseHoodieTableFileIndex.java`
around lines 361 - 369, The listPartitionPaths(List<String>, Types.RecordType,
Expression, List<Object>) path currently delegates to
tableMetadata.getPartitionPathWithPathPrefixUsingFilterExpression without
applying the incremental-range filter; update this method (listPartitionPaths)
to restrict matchedPartitionPaths to only the partitions returned by
TimelineUtils.getWrittenPartitions(findInstantsInRange())—the same
incremental-query branch used in the other overload—before proceeding/returning
so Spark partition predicate pushdown doesn't include partitions outside the
requested instant range.
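An illustrative sketch of the missing branch (names are hypothetical): after predicate-based matching, intersect with the partitions actually written in the instant range, as the other overload does via `TimelineUtils.getWrittenPartitions`.

```java
import java.util.ArrayList;
import java.util.List;

public class IncrementalPartitionFilterDemo {
  public static List<String> restrictToWritten(List<String> matchedPartitions,
                                               List<String> writtenInRange) {
    List<String> result = new ArrayList<>(matchedPartitions);
    result.retainAll(writtenInRange); // drop partitions outside the incremental window
    return result;
  }

  public static void main(String[] args) {
    // The predicate matches three partitions, but only one was written in range:
    List<String> matched = List.of("2026/01/01", "2026/01/02", "2026/01/03");
    List<String> written = List.of("2026/01/02");
    System.out.println(restrictToWritten(matched, written));
  }
}
```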
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/util/IncrementalRelationUtil.scala-46-73 (1)

46-73: ⚠️ Potential issue | 🟠 Major

Deduplicate implicit helper columns before appending them to prunedSchema.

If a MOR table uses the same field as both a partition field and an ordering field, this code adds that StructField twice. That produces duplicate column names in prunedSchema, which can break the later .schema(prunedSchema) reads and the select(fieldNames...) path. Track already-added names and skip duplicates when appending implicit columns.

Suggested fix
   def getPrunedSchema(requiredColumns: Array[String],
                       usedSchema: StructType,
                       metaClient: HoodieTableMetaClient) = {
     var prunedSchema = StructType(Seq())
+    val appended = scala.collection.mutable.HashSet.empty[String]
+
+    def appendIfPresent(col: String): Unit = {
+      if (!appended.contains(col)) {
+        usedSchema.find(_.name == col).foreach { field =>
+          prunedSchema = prunedSchema.add(field)
+          appended += col
+        }
+      }
+    }
 
     // _hoodie_commit_time is a required field. using which query filters are applied.
     if (!requiredColumns.contains(HoodieRecord.COMMIT_TIME_METADATA_FIELD)) {
-      prunedSchema = prunedSchema.add(usedSchema(HoodieRecord.COMMIT_TIME_METADATA_FIELD))
+      appendIfPresent(HoodieRecord.COMMIT_TIME_METADATA_FIELD)
     }
 
     // Add all the required columns as part of pruned schema
     requiredColumns.foreach(col => {
-      val field = usedSchema.find(_.name == col)
-      if (field.isDefined) {
-        prunedSchema = prunedSchema.add(field.get)
-      }
+      appendIfPresent(col)
     })
@@
       partitionColumns.get().foreach(col => {
         if (!requiredColumns.contains(col)) {
-          val field = usedSchema.find(_.name == col)
-          if (field.isDefined) {
-            prunedSchema = prunedSchema.add(field.get)
-          }
+          appendIfPresent(col)
         }
       })
@@
         orderingFields.forEach(col => {
           if (!requiredColumns.contains(col)) {
-            val field = usedSchema.find(_.name == col)
-            if (field.isDefined) {
-              prunedSchema = prunedSchema.add(field.get)
-            }
+            appendIfPresent(col)
           }
         })
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/util/IncrementalRelationUtil.scala`
around lines 46 - 73, The code may append the same StructField twice to
prunedSchema when a column is both in tableConfig.getPartitionFields and
tableConfig.getOrderingFields; modify IncrementalRelationUtil to deduplicate
implicit helper columns by tracking a Set of added column names (seed it with
requiredColumns and any existing prunedSchema field names), and before adding a
field from usedSchema (when inspecting partitionColumns or orderingFields) check
the set and skip adding if already present, otherwise add the field to
prunedSchema and insert its name into the set; apply this logic around the
blocks referencing tableConfig.getPartitionFields,
tableConfig.getOrderingFields, prunedSchema, requiredColumns, usedSchema, and
HoodieTableType.MERGE_ON_READ.
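The dedup pattern from the proposed fix, reduced to a hypothetical Java sketch: track names already appended so a field that appears in more than one group (required, partition, ordering) lands in the schema exactly once.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class DedupAppendDemo {
  public static List<String> appendUnique(List<List<String>> groups) {
    List<String> pruned = new ArrayList<>();
    Set<String> appended = new HashSet<>();
    for (List<String> group : groups) {
      for (String col : group) {
        if (appended.add(col)) { // add() returns false if the name was already seen
          pruned.add(col);
        }
      }
    }
    return pruned;
  }

  public static void main(String[] args) {
    // "region" is both a partition field and an ordering field; "ts" is both
    // required and an ordering field. Each appears once in the result.
    List<String> pruned = appendUnique(List.of(
        List.of("id", "ts"),       // required columns
        List.of("region"),         // partition fields
        List.of("region", "ts"))); // ordering fields
    System.out.println(pruned);
  }
}
```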
hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/functional/TestIncrementalQueryColumnPruning.scala-129-168 (1)

129-168: ⚠️ Potential issue | 🟠 Major

These assertions don’t actually prove scan-time column pruning.

Both tests call .select(...) after the load, so the result schema will be narrowed even if the relation still reads every column. The plan.contains("col1") || plan.contains("col3") check is also satisfied by the projection itself. A regression in the actual PrunedScan path would still pass here.

Please assert against the scan/read schema in the physical plan instead of the projected output schema.

Also applies to: 225-240

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/functional/TestIncrementalQueryColumnPruning.scala`
around lines 129 - 168, The current assertions check the projected output schema
(prunedDF and prunedRecords) which can be narrowed by .select(...) even when the
underlying scan reads all columns; instead inspect the physical scan node to
prove scan-time pruning: use prunedDF.queryExecution.executedPlan (or
prunedDF.queryExecution.toRdd / executedPlan) and assert the scan node/read
schema contains only "col1" and "col3" (e.g., look for
FileScan/ParquetScan/HadoopFsRelation node text like "ReadSchema" or attribute
names) and does NOT mention "col2","col4", etc.; update the assertions that
currently use plan.contains(...) to search executedPlan for the scan/read schema
and apply the same change to the other occurrence referenced (lines ~225-240) so
the test fails if the physical scan still reads unselected columns.
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/metadata/CatalogBackedTableMetadata.scala-99-106 (1)

99-106: ⚠️ Potential issue | 🟠 Major

Propagate predicate filters in the filesystem fallback.

When shouldUseCatalogPartitions is false, this override drops partitionPredicateExpressions and falls back to the older super call. In the path where CatalogBackedTableMetadata is instantiated but the table does not track partitions in the catalog, that regresses predicate-based partition pruning back to a full filesystem listing.

Suggested fix
     } else {
-      super.getPartitionPathWithPathPrefixUsingFilterExpression(relativePathPrefix, partitionFields, pushedExpr)
+      super.getPartitionPathWithPathPrefixUsingFilterExpression(
+        relativePathPrefix,
+        partitionFields,
+        pushedExpr,
+        partitionPredicateExpressions)
     }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/metadata/CatalogBackedTableMetadata.scala`
around lines 99 - 106, The fallback branch currently ignores
partitionPredicateExpressions and does a full filesystem listing; update the
else branch in getPartitionPathWithPathPrefixUsingFilterExpression to preserve
predicate-based pruning by converting partitionPredicateExpressions to a
Seq[Expression] (as done in the shouldUseCatalogPartitions branch) and applying
the same filtering logic: call
super.getPartitionPathWithPathPrefixUsingFilterExpression(...) to obtain
filesystem partitions and then pass that result into
filterPartitionsBasedOnRelativePathPrefixes(relativePathPrefix, <super result>)
using the derived partitionPredicateExpressionSeq so
partitionPredicateExpressions are honored even when shouldUseCatalogPartitions
is false.
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/hudi/analysis/HoodieVectorSearchPlanBuilder.scala-108-112 (1)

108-112: ⚠️ Potential issue | 🟠 Major

Ensure internal column names cannot collide with input data.

The code unconditionally uses _hudi_distance, _hudi_query_index, _hudi_query_emb, _hudi_rank, and _hudi_query_ prefix without checking whether the input corpus or query tables already contain these names. The withColumn() operations at lines 273, 303, 314, and 321 will silently overwrite user data if these collisions occur. The prefixing logic (lines 305–310) only guards columns shared between corpus and query, leaving gaps. Add a pre-validation check that raises an error if any reserved column names exist in the input DataFrames, or generate unique aliases before building the plan.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/hudi/analysis/HoodieVectorSearchPlanBuilder.scala`
around lines 108 - 112, Reserved internal column names (DISTANCE_COL,
QUERY_ID_COL, QUERY_EMB_ALIAS, RANK_COL, QUERY_COL_PREFIX) are used
unconditionally and may overwrite user columns; add a pre-validation step in
HoodieVectorSearchPlanBuilder (before building the plan / before any withColumn
calls) that checks both input DataFrames for any columns that equal these
constants or start with QUERY_COL_PREFIX and raise a clear error if found, or
alternatively generate unique aliases (e.g., append a UUID or incremental
suffix) for DISTANCE_COL/QUERY_EMB_ALIAS/QUERY_ID_COL/RANK_COL and use those
generated symbols consistently throughout the plan-building methods (the same
ones that call withColumn). Ensure the chosen approach updates all references in
the plan construction so no reserved names collide with input data.
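A sketch of the pre-validation option (reserved names taken from the finding above; the method shape is hypothetical): reject inputs whose column names collide with reserved internal names before any plan rewriting happens.

```java
import java.util.Arrays;
import java.util.List;

public class ReservedColumnCheck {
  static final List<String> RESERVED =
      Arrays.asList("_hudi_distance", "_hudi_query_index", "_hudi_query_emb", "_hudi_rank");
  static final String QUERY_COL_PREFIX = "_hudi_query_";

  public static void validate(List<String> inputColumns) {
    for (String col : inputColumns) {
      if (RESERVED.contains(col) || col.startsWith(QUERY_COL_PREFIX)) {
        // Fail fast instead of letting withColumn() silently overwrite user data.
        throw new IllegalArgumentException(
            "Input column '" + col + "' collides with a reserved internal name");
      }
    }
  }

  public static void main(String[] args) {
    validate(List.of("id", "embedding")); // fine
    try {
      validate(List.of("id", "_hudi_rank")); // collides with a reserved name
    } catch (IllegalArgumentException e) {
      System.out.println(e.getMessage());
    }
  }
}
```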
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/HoodieVectorSearchTableValuedFunction.scala-90-108 (1)

90-108: ⚠️ Potential issue | 🟠 Major

Add foldable validation before calling eval() on k parameter.

parseK evaluates arbitrary user-provided expressions without checking if they are foldable. Non-foldable expressions—such as column references or function calls—will fail unpredictably or evaluate incorrectly during analysis. The k parameter must be constrained to a foldable expression before evaluation.

This pattern is already established elsewhere in the codebase; evaluateQueryVector in HoodieSparkBaseAnalysis.scala validates foldability before calling eval() on a similar vector search parameter.

Suggested fix
   private[logical] def parseK(funcName: String, expr: Expression): Int = {
+    if (!expr.foldable) {
+      throw new HoodieAnalysisException(
+        s"Function '$funcName': k must be a foldable positive integer expression, got: ${expr.sql}")
+    }
     val rawValue = expr.eval()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/HoodieVectorSearchTableValuedFunction.scala`
around lines 90 - 108, The parseK method currently calls expr.eval() on the
user-supplied k Expression without checking foldability; update parseK to first
require that expr.foldable is true (throwing a HoodieAnalysisException with a
clear message referencing funcName if not), then call expr.eval() and proceed
with the existing number parsing and positive-int checks; follow the same
pattern used by evaluateQueryVector in HoodieSparkBaseAnalysis.scala to validate
foldability before evaluation so non-foldable expressions (e.g., column refs)
are rejected early.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 72ad786b-9bd7-4e3f-8477-710a1e028675

📥 Commits

Reviewing files that changed from the base of the PR and between 35e2bbf and 4c19cc1.

📒 Files selected for processing (119)
  • hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/BaseHoodieClient.java
  • hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/BaseHoodieTableServiceClient.java
  • hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieClusteringConfig.java
  • hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieWriteConfig.java
  • hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/compact/ScheduleCompactionActionExecutor.java
  • hudi-client/hudi-client-common/src/main/java/org/apache/hudi/util/DistributedRegistryUtil.java
  • hudi-client/hudi-client-common/src/test/java/org/apache/hudi/client/TestBaseHoodieTableServiceClient.java
  • hudi-client/hudi-client-common/src/test/java/org/apache/hudi/config/TestHoodieWriteConfig.java
  • hudi-client/hudi-flink-client/src/main/java/org/apache/hudi/util/HoodieSchemaConverter.java
  • hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/client/SparkRDDWriteClient.java
  • hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/client/common/HoodieSparkEngineContext.java
  • hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/io/storage/row/HoodieRowParquetWriteSupport.java
  • hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/metrics/DistributedRegistry.java
  • hudi-client/hudi-spark-client/src/main/scala/org/apache/spark/sql/avro/HoodieSparkSchemaConverters.scala
  • hudi-client/hudi-spark-client/src/main/scala/org/apache/spark/sql/hudi/SparkAdapter.scala
  • hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/common/table/log/TestLogReaderUtils.java
  • hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/metrics/TestDistributedRegistry.java
  • hudi-common/src/main/java/org/apache/hudi/BaseHoodieTableFileIndex.java
  • hudi-common/src/main/java/org/apache/hudi/common/engine/HoodieEngineContext.java
  • hudi-common/src/main/java/org/apache/hudi/common/schema/HoodieSchema.java
  • hudi-common/src/main/java/org/apache/hudi/common/table/log/LogReaderUtils.java
  • hudi-common/src/main/java/org/apache/hudi/common/table/timeline/BaseHoodieTimeline.java
  • hudi-common/src/main/java/org/apache/hudi/common/table/timeline/HoodieTimeline.java
  • hudi-common/src/main/java/org/apache/hudi/common/util/CompactionUtils.java
  • hudi-common/src/main/java/org/apache/hudi/common/util/collection/RocksDBDAO.java
  • hudi-common/src/main/java/org/apache/hudi/common/util/queue/DisruptorMessageQueue.java
  • hudi-common/src/main/java/org/apache/hudi/metadata/FileSystemBackedTableMetadata.java
  • hudi-common/src/main/java/org/apache/hudi/metadata/HoodieBackedTableMetadata.java
  • hudi-common/src/main/java/org/apache/hudi/metadata/HoodieTableMetadata.java
  • hudi-common/src/main/java/org/apache/hudi/util/PartitionPathFilterUtil.java
  • hudi-common/src/test/java/org/apache/hudi/common/schema/TestHoodieSchema.java
  • hudi-common/src/test/java/org/apache/hudi/util/TestPartitionPathFilterUtil.java
  • hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/configuration/FlinkOptions.java
  • hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/metrics/FlinkRocksDBIndexMetrics.java
  • hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/StreamWriteOperatorCoordinator.java
  • hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/append/AppendWriteFunctionWithDisruptorBufferSort.java
  • hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/bootstrap/RLIBootstrapOperator.java
  • hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/event/Correspondent.java
  • hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/partitioner/BucketAssignFunction.java
  • hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/partitioner/index/IndexBackend.java
  • hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/partitioner/index/RocksDBIndexBackend.java
  • hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/utils/EventBuffers.java
  • hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/utils/Pipelines.java
  • hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/source/HoodieScanContext.java
  • hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/source/HoodieSource.java
  • hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/source/reader/HoodieSourceReader.java
  • hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/source/reader/HoodieSourceSplitReader.java
  • hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/source/reader/RecordLimiter.java
  • hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/source/reader/function/AbstractSplitReaderFunction.java
  • hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/source/reader/function/HoodieCdcSplitReaderFunction.java
  • hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/source/reader/function/HoodieSplitReaderFunction.java
  • hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/source/split/assign/HoodieSplitAssigners.java
  • hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/source/split/assign/HoodieSplitBucketAssigner.java
  • hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/HoodieTableFactory.java
  • hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/HoodieTableSource.java
  • hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/format/cdc/CdcImageManager.java
  • hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/format/cdc/CdcInputFormat.java
  • hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/format/cdc/CdcIterators.java
  • hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/sink/TestStreamWriteOperatorCoordinator.java
  • hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/sink/TestWriteCopyOnWrite.java
  • hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/sink/TestWriteMergeOnRead.java
  • hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/sink/append/TestAppendWriteFunctionWithBufferSort.java
  • hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/sink/partitioner/index/TestRocksDBIndexBackend.java
  • hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/sink/utils/MockStateSnapshotContext.java
  • hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/sink/utils/StreamWriteFunctionWrapper.java
  • hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/sink/utils/TestEventBuffers.java
  • hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/sink/utils/TestWriteBase.java
  • hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/source/reader/TestHoodieSourceSplitReader.java
  • hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/source/reader/TestRecordLimiter.java
  • hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/source/reader/function/TestAbstractSplitReaderFunction.java
  • hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/source/reader/function/TestHoodieCdcSplitReaderFunction.java
  • hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/source/reader/function/TestHoodieSplitReaderFunction.java
  • hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/source/split/TestDefaultHoodieSplitProvider.java
  • hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/source/split/assign/TestHoodieSplitBucketAssigner.java
  • hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/table/ITTestHoodieDataSource.java
  • hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/table/TestHoodieTableFactory.java
  • hudi-hadoop-common/src/main/java/org/apache/hudi/common/util/ParquetReaderIterator.java
  • hudi-hadoop-common/src/main/java/org/apache/hudi/hadoop/fs/HoodieWrapperFileSystem.java
  • hudi-hadoop-common/src/test/java/org/apache/hudi/common/table/TestHoodieTableConfig.java
  • hudi-hadoop-common/src/test/java/org/apache/hudi/common/util/TestCompactionUtils.java
  • hudi-hadoop-common/src/test/java/org/apache/hudi/common/util/TestParquetReaderIterator.java
  • hudi-io/src/main/java/org/apache/hudi/common/metrics/LocalRegistry.java
  • hudi-io/src/main/java/org/apache/hudi/common/metrics/Registry.java
  • hudi-spark-datasource/hudi-spark-common/pom.xml
  • hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/DataSourceOptions.scala
  • hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieFileIndex.scala
  • hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/IncrementalRelationV1.scala
  • hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/IncrementalRelationV2.scala
  • hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/SparkHoodieTableFileIndex.scala
  • hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/metadata/CatalogBackedTableMetadata.scala
  • hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/util/IncrementalRelationUtil.scala
  • hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/HoodieTableChanges.scala
  • hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/HoodieVectorSearchTableValuedFunction.scala
  • hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/hudi/HoodieSqlCommonUtils.scala
  • hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/hudi/analysis/HoodieSparkBaseAnalysis.scala
  • hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/hudi/analysis/HoodieVectorSearchPlanBuilder.scala
  • hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/hudi/analysis/TableValuedFunctions.scala
  • hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/hudi/analysis/VectorDistanceUtils.scala
  • hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/hudi/streaming/HoodieStreamSourceV1.scala
  • hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/hudi/streaming/HoodieStreamSourceV2.scala
  • hudi-spark-datasource/hudi-spark/pom.xml
  • hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/spark/sql/hudi/analysis/HoodieAnalysis.scala
  • hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/spark/sql/hudi/command/procedures/ShowCommitsProcedure.scala
  • hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/functional/TestCOWDataSource.scala
  • hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/functional/TestHoodieVectorSearchFunction.scala
  • hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/functional/TestIncrementalQueryColumnPruning.scala
  • hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/functional/TestMORDataSource.scala
  • hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/functional/TestVectorDataSource.scala
  • hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/avro/TestSchemaConverters.scala
  • hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/hudi/common/TestCatalogBackedTableMetadata.scala
  • hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/hudi/common/TestInstantTimeValidation.scala
  • hudi-spark-datasource/hudi-spark3-common/src/main/scala/org/apache/spark/sql/adapter/BaseSpark3Adapter.scala
  • hudi-spark-datasource/hudi-spark3.3.x/src/main/scala/org/apache/spark/sql/hudi/analysis/Spark33HoodiePruneFileSourcePartitions.scala
  • hudi-spark-datasource/hudi-spark4-common/src/main/scala/org/apache/spark/sql/adapter/BaseSpark4Adapter.scala
  • hudi-spark-datasource/hudi-spark4.0.x/src/main/scala/org/apache/spark/sql/avro/AvroDeserializer.scala
  • hudi-spark-datasource/hudi-spark4.0.x/src/test/java/org/apache/hudi/io/storage/row/TestHoodieRowParquetWriteSupportVariant.java
  • hudi-utilities/src/main/java/org/apache/hudi/utilities/HoodieClusteringJob.java
  • hudi-utilities/src/test/java/org/apache/hudi/utilities/offlinejob/TestHoodieClusteringJob.java
  • pom.xml
💤 Files with no reviewable changes (7)
  • hudi-client/hudi-flink-client/src/main/java/org/apache/hudi/util/HoodieSchemaConverter.java
  • hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/BaseHoodieClient.java
  • hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/spark/sql/hudi/command/procedures/ShowCommitsProcedure.scala
  • hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/configuration/FlinkOptions.java
  • hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/sink/utils/TestEventBuffers.java
  • hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/sink/TestStreamWriteOperatorCoordinator.java
  • hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/event/Correspondent.java

Comment on lines +28 to +36
public static Predicate<String> relativePathPrefixPredicate(List<String> relativePathPrefixes) {
return path -> relativePathPrefixes.stream().anyMatch(relativePathPrefix ->
// Partition paths stored in metadata table do not have the slash at the end.
// If the relativePathPrefix is empty, return all partition paths;
// else if the relative path prefix is the same as the path, this is an exact match;
// else, we need to make sure the path is a subdirectory of relativePathPrefix, by
// checking if the path starts with relativePathPrefix appended by a slash ("/").
StringUtils.isNullOrEmpty(relativePathPrefix)
|| path.equals(relativePathPrefix) || path.startsWith(relativePathPrefix + "/"));

⚠️ Potential issue | 🔴 Critical

Handle null/empty prefix lists before anyMatch to avoid dropping all partitions.

At Line 29, relativePathPrefixes.stream().anyMatch(...) returns false for an empty list (and throws an NPE for a null list). That makes the predicate reject every path, while the method's comments imply that an empty prefix configuration should match all partition paths.

🐛 Proposed fix
 public class PartitionPathFilterUtil {
 
   public static Predicate<String> relativePathPrefixPredicate(List<String> relativePathPrefixes) {
+    if (relativePathPrefixes == null || relativePathPrefixes.isEmpty()) {
+      return path -> true;
+    }
     return path -> relativePathPrefixes.stream().anyMatch(relativePathPrefix ->
         // Partition paths stored in metadata table do not have the slash at the end.
         // If the relativePathPrefix is empty, return all partition paths;
         // else if the relative path prefix is the same as the path, this is an exact match;
         // else, we need to make sure the path is a subdirectory of relativePathPrefix, by
         // checking if the path starts with relativePathPrefix appended by a slash ("/").
         StringUtils.isNullOrEmpty(relativePathPrefix)
            || path.equals(relativePathPrefix) || path.startsWith(relativePathPrefix + "/"));
   }
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
public static Predicate<String> relativePathPrefixPredicate(List<String> relativePathPrefixes) {
return path -> relativePathPrefixes.stream().anyMatch(relativePathPrefix ->
// Partition paths stored in metadata table do not have the slash at the end.
// If the relativePathPrefix is empty, return all partition paths;
// else if the relative path prefix is the same as the path, this is an exact match;
// else, we need to make sure the path is a subdirectory of relativePathPrefix, by
// checking if the path starts with relativePathPrefix appended by a slash ("/").
StringUtils.isNullOrEmpty(relativePathPrefix)
|| path.equals(relativePathPrefix) || path.startsWith(relativePathPrefix + "/"));
public static Predicate<String> relativePathPrefixPredicate(List<String> relativePathPrefixes) {
if (relativePathPrefixes == null || relativePathPrefixes.isEmpty()) {
return path -> true;
}
return path -> relativePathPrefixes.stream().anyMatch(relativePathPrefix ->
// Partition paths stored in metadata table do not have the slash at the end.
// If the relativePathPrefix is empty, return all partition paths;
// else if the relative path prefix is the same as the path, this is an exact match;
// else, we need to make sure the path is a subdirectory of relativePathPrefix, by
// checking if the path starts with relativePathPrefix appended by a slash ("/").
StringUtils.isNullOrEmpty(relativePathPrefix)
|| path.equals(relativePathPrefix) || path.startsWith(relativePathPrefix + "/"));
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@hudi-common/src/main/java/org/apache/hudi/util/PartitionPathFilterUtil.java`
around lines 28 - 36, The predicate returned by relativePathPrefixPredicate
currently calls relativePathPrefixes.stream().anyMatch(...) which will throw NPE
if relativePathPrefixes is null and will return false for an empty list, causing
every path to be rejected; update relativePathPrefixPredicate to first check if
relativePathPrefixes is null or empty and, in that case, return a predicate that
always returns true, otherwise return the existing predicate that checks
StringUtils.isNullOrEmpty(relativePathPrefix) or path.equals(relativePathPrefix)
or path.startsWith(relativePathPrefix + "/"); ensure the null check prevents NPE
and preserves described empty-prefix semantics.
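The empty-list pitfall is easy to reproduce in isolation. The sketch below uses illustrative names (PrefixPredicateDemo, prefixPredicate are not Hudi classes) to show why an unguarded anyMatch rejects everything and how the guard restores the documented "empty config means match all" semantics:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.Predicate;

public class PrefixPredicateDemo {
  // Guarded predicate mirroring the proposed fix; names are illustrative.
  static Predicate<String> prefixPredicate(List<String> prefixes) {
    if (prefixes == null || prefixes.isEmpty()) {
      return path -> true; // no prefixes configured: match every partition path
    }
    return path -> prefixes.stream().anyMatch(prefix ->
        prefix.isEmpty() || path.equals(prefix) || path.startsWith(prefix + "/"));
  }

  static void check(boolean cond, String msg) {
    if (!cond) throw new AssertionError(msg);
  }

  public static void main(String[] args) {
    // The pitfall: anyMatch over an empty list is vacuously false, so an
    // unguarded predicate rejects every path.
    check(!Collections.<String>emptyList().stream().anyMatch("2024/01"::startsWith),
        "anyMatch on empty list should be false");

    // The guard keeps the documented "empty config means match all" semantics.
    check(prefixPredicate(Collections.emptyList()).test("2024/01"), "empty list matches all");
    check(prefixPredicate(null).test("2024/01"), "null list matches all");
    check(prefixPredicate(Arrays.asList("2024")).test("2024/01"), "prefix match");
    check(prefixPredicate(Arrays.asList("2024")).test("2024"), "exact match");
    check(!prefixPredicate(Arrays.asList("2023")).test("2024/01"), "non-match rejected");
    check(!prefixPredicate(Arrays.asList("20")).test("2024/01"), "substring is not a dir prefix");
    System.out.println("ok");
  }
}
```

Note the last check: appending "/" before startsWith is what keeps "20" from spuriously matching "2024/01".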

Comment on lines +121 to +124
if (fileSplitIterator.hasNext()) {
recordIterator = recordIteratorFunc.apply(fileSplitIterator.next());
return recordIterator.hasNext();
}

⚠️ Potential issue | 🔴 Critical

Skip empty split iterators instead of stopping the outer iterator.

Lines 121-124 return false as soon as a single per-split iterator turns out to be empty. That silently drops every later CDC split, and the empty per-split iterator is never closed if the caller stops iterating after receiving that false.

🐛 Proposed fix
-      if (fileSplitIterator.hasNext()) {
-        recordIterator = recordIteratorFunc.apply(fileSplitIterator.next());
-        return recordIterator.hasNext();
-      }
+      while (fileSplitIterator.hasNext()) {
+        recordIterator = recordIteratorFunc.apply(fileSplitIterator.next());
+        if (recordIterator.hasNext()) {
+          return true;
+        }
+        recordIterator.close();
+        recordIterator = null;
+      }
       return false;
📝 Committable suggestion


Suggested change
if (fileSplitIterator.hasNext()) {
recordIterator = recordIteratorFunc.apply(fileSplitIterator.next());
return recordIterator.hasNext();
}
while (fileSplitIterator.hasNext()) {
recordIterator = recordIteratorFunc.apply(fileSplitIterator.next());
if (recordIterator.hasNext()) {
return true;
}
recordIterator.close();
recordIterator = null;
}
return false;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/format/cdc/CdcIterators.java`
around lines 121 - 124, The current hasNext logic in CdcIterators stops on the
first empty per-split iterator (fileSplitIterator, recordIterator) which drops
subsequent splits and leaves the empty iterator open; change the loop so it
repeatedly pulls the next split via recordIteratorFunc.apply(...) and if that
per-split recordIterator hasNext() returns true, assign and return true,
otherwise close the empty recordIterator (if it implements AutoCloseable or has
a close method) and continue to the next split until none remain, returning
false only when all split iterators are exhausted. Ensure you reference and
update the code paths that set and use recordIterator, fileSplitIterator, and
recordIteratorFunc so the empty iterators are closed and skipped rather than
causing an early stop.
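The skip-empty-splits fix follows a standard flattening-iterator pattern. A minimal standalone sketch (illustrative names, not the actual CdcIterators code) that keeps advancing past empty inner iterators instead of terminating:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

public class SkipEmptySplitsDemo {
  // Minimal flattening iterator illustrating the while-loop fix: empty
  // per-split iterators are skipped instead of ending the outer iteration.
  static class Flattening<T> implements Iterator<T> {
    private final Iterator<Iterator<T>> splits;
    private Iterator<T> current = Collections.emptyIterator();

    Flattening(Iterator<Iterator<T>> splits) {
      this.splits = splits;
    }

    @Override
    public boolean hasNext() {
      while (!current.hasNext() && splits.hasNext()) {
        current = splits.next(); // advance past empty splits
      }
      return current.hasNext();
    }

    @Override
    public T next() {
      return current.next();
    }
  }

  public static void main(String[] args) {
    List<Iterator<Integer>> splits = Arrays.asList(
        Collections.emptyIterator(),      // leading empty split must not stop us
        Arrays.asList(1, 2).iterator(),
        Collections.emptyIterator(),      // neither must an empty split in the middle
        Arrays.asList(3).iterator());
    Flattening<Integer> it = new Flattening<>(splits.iterator());
    StringBuilder sb = new StringBuilder();
    while (it.hasNext()) {
      sb.append(it.next());
    }
    if (!sb.toString().equals("123")) throw new AssertionError(sb);
    System.out.println("ok");
  }
}
```

The real fix additionally closes each exhausted per-split record iterator before moving on, since those iterators hold file handles.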

Comment on lines +161 to +174
@Override
public boolean hasNext() {
if (nested.hasNext()) {
currentRecord = nested.next();
currentRecord.setRowKind(RowKind.INSERT);
return true;
}
return false;
}

@Override
public RowData next() {
return currentRecord;
}

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

find . -type f -name "CdcIterators.java" 2>/dev/null

Repository: yihua/hudi

Length of output: 153


🏁 Script executed:

# Find the file and get basic info
fd -t f "CdcIterators.java" --exec wc -l {} \;

Repository: yihua/hudi

Length of output: 157


🏁 Script executed:

git ls-files | grep -i "CdcIterators"

Repository: yihua/hudi

Length of output: 151


🏁 Script executed:

sed -n '155,180p' ./hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/format/cdc/CdcIterators.java

Repository: yihua/hudi

Length of output: 603


🏁 Script executed:

sed -n '285,335p' ./hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/format/cdc/CdcIterators.java

Repository: yihua/hudi

Length of output: 1801


🏁 Script executed:

sed -n '415,445p' ./hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/format/cdc/CdcIterators.java

Repository: yihua/hudi

Length of output: 832


🏁 Script executed:

grep -n "class.*Iterator\|public.*hasNext\|public.*next()" ./hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/format/cdc/CdcIterators.java | head -30

Repository: yihua/hudi

Length of output: 1247


🏁 Script executed:

sed -n '153,187p' ./hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/format/cdc/CdcIterators.java

Repository: yihua/hudi

Length of output: 938


🏁 Script executed:

sed -n '230,368p' ./hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/format/cdc/CdcIterators.java

Repository: yihua/hudi

Length of output: 6167


🏁 Script executed:

sed -n '370,479p' ./hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/format/cdc/CdcIterators.java

Repository: yihua/hudi

Length of output: 4090


🏁 Script executed:

sed -n '189,228p' ./hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/format/cdc/CdcIterators.java

Repository: yihua/hudi

Length of output: 1249


🏁 Script executed:

sed -n '288,332p' ./hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/format/cdc/CdcIterators.java | head -20

Repository: yihua/hudi

Length of output: 752


🏁 Script executed:

sed -n '420,442p' ./hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/format/cdc/CdcIterators.java

Repository: yihua/hudi

Length of output: 688


🏁 Script executed:

sed -n '161,174p' ./hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/format/cdc/CdcIterators.java

Repository: yihua/hudi

Length of output: 349


Don't advance the underlying stream from hasNext().

Lines 162-167, 289-332, and 421-442 all consume data during hasNext(). A second hasNext() before next() will skip records, and for updates it can drop the UPDATE_BEFORE row entirely. These iterators need an idempotent prefetch flag instead of advancing on every probe.

🔁 Fix pattern
   private RowData currentRecord;
+  private boolean prefetched;

   @Override
   public boolean hasNext() {
-    if (nested.hasNext()) {
+    if (prefetched) {
+      return true;
+    }
+    if (nested.hasNext()) {
       currentRecord = nested.next();
       currentRecord.setRowKind(RowKind.INSERT);
+      prefetched = true;
       return true;
     }
     return false;
   }

   @Override
   public RowData next() {
+    prefetched = false;
     return currentRecord;
   }

Apply the same buffering pattern to DataLogFileIterator and BaseImageIterator.

📝 Committable suggestion


Suggested change
@Override
public boolean hasNext() {
if (nested.hasNext()) {
currentRecord = nested.next();
currentRecord.setRowKind(RowKind.INSERT);
return true;
}
return false;
}
@Override
public RowData next() {
return currentRecord;
}
@Override
public boolean hasNext() {
if (prefetched) {
return true;
}
if (nested.hasNext()) {
currentRecord = nested.next();
currentRecord.setRowKind(RowKind.INSERT);
prefetched = true;
return true;
}
return false;
}
@Override
public RowData next() {
prefetched = false;
return currentRecord;
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/format/cdc/CdcIterators.java`
around lines 161 - 174, The hasNext() implementations in CdcIterators (and
similarly DataLogFileIterator and BaseImageIterator) currently advance the
underlying iterator (nested.next()) every probe, which can skip records when
hasNext() is called multiple times; change to an idempotent prefetch pattern:
add a nullable buffer field (e.g., bufferedRecord) and a boolean/nullable check
so hasNext() only calls nested.next() to fill bufferedRecord when it is null,
returns true if bufferedRecord != null, and next() returns bufferedRecord and
clears it (or sets it to null); update references to currentRecord to use this
buffered field so repeated hasNext() calls do not advance the nested iterator.
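The prefetch-flag fix is the classic buffered-iterator pattern. A self-contained sketch (illustrative names, not the actual CdcIterators classes) showing that repeated hasNext() probes no longer drop records:

```java
import java.util.Arrays;
import java.util.Iterator;

public class PrefetchIteratorDemo {
  // Buffered wrapper: repeated hasNext() calls are idempotent because the
  // fetched element is parked until next() hands it out.
  static class Prefetching<T> implements Iterator<T> {
    private final Iterator<T> nested;
    private T buffered;
    private boolean prefetched;

    Prefetching(Iterator<T> nested) {
      this.nested = nested;
    }

    @Override
    public boolean hasNext() {
      if (prefetched) {
        return true; // probe again without advancing the nested iterator
      }
      if (nested.hasNext()) {
        buffered = nested.next();
        prefetched = true;
        return true;
      }
      return false;
    }

    @Override
    public T next() {
      // Sketch only: a production version would throw NoSuchElementException
      // when next() is called without a successful hasNext().
      prefetched = false;
      return buffered;
    }
  }

  public static void main(String[] args) {
    Prefetching<String> it = new Prefetching<>(Arrays.asList("a", "b").iterator());
    // Double-probing must not drop "a".
    it.hasNext();
    it.hasNext();
    StringBuilder sb = new StringBuilder();
    while (it.hasNext()) {
      sb.append(it.next());
    }
    if (!sb.toString().equals("ab")) throw new AssertionError(sb);
    System.out.println("ok");
  }
}
```

With the unbuffered version, the two consecutive hasNext() calls above would consume "a" twice and emit "bb", which is exactly the UPDATE_BEFORE loss described in the comment.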


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 10

🧹 Nitpick comments (11)
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/metrics/FlinkRocksDBIndexMetrics.java (1)

55-58: Fail fast if backend is missing

At Line 55–58, the constructor should guard rocksDBIndexBackend with a null check to avoid deferred NPEs during metric polling.

Proposed patch
 import org.apache.hudi.sink.partitioner.index.RocksDBIndexBackend;
 
 import org.apache.flink.metrics.Gauge;
 import org.apache.flink.metrics.MetricGroup;
 import org.rocksdb.TickerType;
+import java.util.Objects;
@@
   public FlinkRocksDBIndexMetrics(MetricGroup metricGroup, RocksDBIndexBackend rocksDBIndexBackend) {
     super(metricGroup);
-    this.rocksDBIndexBackend = rocksDBIndexBackend;
+    this.rocksDBIndexBackend = Objects.requireNonNull(rocksDBIndexBackend, "rocksDBIndexBackend must not be null");
   }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/metrics/FlinkRocksDBIndexMetrics.java`
around lines 55 - 58, The FlinkRocksDBIndexMetrics constructor currently accepts
a possibly null rocksDBIndexBackend which can cause deferred NPEs; update the
FlinkRocksDBIndexMetrics(MetricGroup metricGroup, RocksDBIndexBackend
rocksDBIndexBackend) constructor to validate rocksDBIndexBackend is non-null and
throw a clear exception (e.g., IllegalArgumentException or NullPointerException)
with a descriptive message if it is null, then assign to
this.rocksDBIndexBackend only after the check.
hudi-io/src/test/java/org/apache/hudi/io/util/TestIOUtils.java (1)

98-100: Expand encode round-trip vectors to include negative/boundary ints.

The new encode tests are good, but they currently miss the negative/sign branch of writeVarInt. Adding a few negative boundaries will harden this coverage.

✅ Suggested test vector extension
-    int[] testValues = {0, 1, 127, 128, 146, 200, 255, 256, 300, 1000, 32080, 65535, 100000,
-        2034958, 632492350, Integer.MAX_VALUE};
+    int[] testValues = {
+        Integer.MIN_VALUE, -203, -128, -113, -112, -1,
+        0, 1, 127, 128, 146, 200, 255, 256, 300, 1000, 32080, 65535, 100000,
+        2034958, 632492350, Integer.MAX_VALUE
+    };
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@hudi-io/src/test/java/org/apache/hudi/io/util/TestIOUtils.java` around lines
98 - 100, The test's encode/decode vector in TestIOUtils misses negative cases
and the sign-handling branch of writeVarInt; update the testValues array used in
the loop (the array defined in TestIOUtils) to include negative and boundary
integers (e.g., -1, -128, -129, -1000, Integer.MIN_VALUE) so the round-trip via
writeVarInt/readVarInt (or the corresponding encode/decode methods) exercises
the negative-paths and integer boundary behavior.
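For context on why negative values exercise a separate branch: variable-length int codecs typically treat the sign specially (for example via a zigzag mapping) so small negative magnitudes still encode short. The sketch below is a generic Protobuf-style zigzag varint, not Hudi's writeVarInt wire format, shown only to illustrate a round-trip over the suggested boundary values:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;

public class ZigZagVarIntDemo {
  // Zigzag maps ...,-2,-1,0,1,2,... to 3,1,0,2,4,... so that small magnitudes
  // of either sign produce small unsigned values and hence short varints.
  static void write(ByteArrayOutputStream out, int value) {
    int zz = (value << 1) ^ (value >> 31);
    while ((zz & ~0x7F) != 0) {
      out.write((zz & 0x7F) | 0x80); // high bit set: more bytes follow
      zz >>>= 7;
    }
    out.write(zz);
  }

  static int read(ByteArrayInputStream in) {
    int zz = 0;
    int shift = 0;
    int b;
    while (((b = in.read()) & 0x80) != 0) {
      zz |= (b & 0x7F) << shift;
      shift += 7;
    }
    zz |= b << shift;
    return (zz >>> 1) ^ -(zz & 1); // undo zigzag
  }

  public static void main(String[] args) {
    // Includes the negative/boundary cases the review asks the test to cover.
    int[] values = {Integer.MIN_VALUE, -1000, -129, -128, -1,
        0, 1, 127, 128, 65535, Integer.MAX_VALUE};
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    for (int v : values) {
      write(out, v);
    }
    ByteArrayInputStream in = new ByteArrayInputStream(out.toByteArray());
    for (int v : values) {
      if (read(in) != v) throw new AssertionError("round-trip failed for " + v);
    }
    System.out.println("ok");
  }
}
```

Whatever sign convention Hudi's codec uses, the same principle applies: a test vector of only non-negative ints never executes the sign-handling path.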
hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/table/catalog/TestHoodieHiveCatalog.java (1)

394-405: Nice negative-path test; consider asserting the shared constant directly.

Using HoodieFileFormat.LANCE_SPARK_ONLY_ERROR_MSG in the message assertion would avoid drift if wording changes.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/table/catalog/TestHoodieHiveCatalog.java`
around lines 394 - 405, In testLanceFormatNotSupportedByHiveCatalog(), replace
the hard-coded error string in the assertion with the shared constant
HoodieFileFormat.LANCE_SPARK_ONLY_ERROR_MSG: change the containsString("Lance
base file format is currently only supported with the Spark engine") check to
containsString(HoodieFileFormat.LANCE_SPARK_ONLY_ERROR_MSG) (add or update the
import for HoodieFileFormat if necessary) so the test asserts against the
canonical constant instead of a literal.
hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/table/ITTestHoodieDataSource.java (1)

3274-3283: Strengthen filtered LIMIT assertion to avoid false positives

Line 3276 only checks row count. If filtering regresses, this can still pass with an incorrect row. Consider asserting deterministic content for the par1 case as well.

♻️ Suggested test-hardening diff
-    List<Row> result1 = CollectionUtil.iterableToList(
-        () -> tableEnv.sqlQuery("select * from t1 where `partition` = 'par1' limit 1").execute().collect());
-    assertThat(result1).hasSize(1);
+    List<Row> result1 = CollectionUtil.iterableToList(
+        () -> tableEnv.sqlQuery("select * from t1 where `partition` = 'par1' order by uuid limit 1").execute().collect());
+    assertRowsEquals(result1, "[+I[id1, Danny, 23, 1970-01-01T00:00:01, par1]]");
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/table/ITTestHoodieDataSource.java`
around lines 3274 - 3283, The test currently only asserts the row count for the
par1 filter (result1) which can hide wrong rows; replace the size-only check
with a deterministic content assertion using the existing helper
assertRowsEquals to compare result1 against the expected single-row string for
partition 'par1'. Locate the code that builds result1 (the
tableEnv.sqlQuery("select * from t1 where `partition` = 'par1' limit
1").execute().collect()) in ITTestHoodieDataSource and change
assertThat(result1).hasSize(1) to assertRowsEquals(result1, "<expected
single-row string matching the known par1 record>") so the test verifies exact
row content rather than just count.
hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/metrics/TestFlinkBucketAssignMetrics.java (1)

63-90: Add coverage for clamped sampling rate inputs.

Please add a test for indexLookupSamplingRate <= 0 to lock in the constructor clamp behavior and prevent modulo-by-zero regressions.

✅ Suggested test addition
+  @Test
+  void testSamplingRateClampToOne() {
+    MetricGroup metricGroup = mock(MetricGroup.class);
+    FlinkBucketAssignMetrics metrics = new FlinkBucketAssignMetrics(metricGroup, 0);
+    metrics.registerMetrics();
+
+    assertTrue(metrics.shouldSampleLookup());
+    assertTrue(metrics.shouldSampleLookup());
+  }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/metrics/TestFlinkBucketAssignMetrics.java`
around lines 63 - 90, Add a unit test that constructs FlinkBucketAssignMetrics
with indexLookupSamplingRate <= 0 (e.g., 0 and negative) to assert the
constructor clamps the value and does not cause modulo-by-zero in
shouldSampleLookup; call registerMetrics(), invoke shouldSampleLookup multiple
times to verify deterministic sampling behavior (no exception), and exercise
recordLookupDuration/markInsert/markUpdate to validate gauges still update;
reference the FlinkBucketAssignMetrics constructor and shouldSampleLookup method
and reuse the existing MetricGroup/gauges mocking pattern from
testLookupSamplingAndCounts to capture and assert gauge values.
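The clamp under test can be sketched independently (Sampler is a hypothetical class, not the actual FlinkBucketAssignMetrics): a non-positive configured rate degrades to sampling every lookup rather than risking a modulo-by-zero:

```java
public class SamplingClampDemo {
  // Hypothetical sampler mirroring the suggested constructor guard.
  static final class Sampler {
    private final int rate;
    private long counter;

    Sampler(int configuredRate) {
      this.rate = Math.max(1, configuredRate); // clamp to at least 1
    }

    boolean shouldSample() {
      return counter++ % rate == 0; // safe: rate >= 1 by construction
    }
  }

  public static void main(String[] args) {
    // Rate 0 (or negative) clamps to 1: every lookup is sampled, no exception.
    Sampler zero = new Sampler(0);
    if (!zero.shouldSample() || !zero.shouldSample()) throw new AssertionError();

    // Rate 3 samples the 1st, 4th, 7th, ... lookups.
    Sampler everyThird = new Sampler(3);
    boolean[] expected = {true, false, false, true};
    for (boolean e : expected) {
      if (everyThird.shouldSample() != e) throw new AssertionError();
    }
    System.out.println("ok");
  }
}
```

A unit test over a construct like this is cheap insurance against a later refactor reintroducing `counter % configuredRate` with an unclamped value.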
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/handler/CompositeTableServiceHandler.java (1)

32-49: Consider adding an abstract close() declaration or a default implementation.

The class implements Closeable but doesn't declare or implement close(). While subclasses do implement it correctly, adding an explicit declaration would make the contract clearer and provide compile-time enforcement:

♻️ Suggested improvement
 public abstract class CompositeTableServiceHandler<H> implements Closeable {
   protected final H dataTableHandler;
   protected final H metadataTableHandler;
 
   protected CompositeTableServiceHandler(H dataTableHandler, H metadataTableHandler) {
     this.dataTableHandler = dataTableHandler;
     this.metadataTableHandler = metadataTableHandler;
   }
 
   protected H getHandler(boolean isMetadataTable) {
     return isMetadataTable ? metadataTableHandler : dataTableHandler;
   }
 
   protected void forEachHandler(Consumer<H> consumer) {
     consumer.accept(dataTableHandler);
     consumer.accept(metadataTableHandler);
   }
+
+  @Override
+  public abstract void close();
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/handler/CompositeTableServiceHandler.java`
around lines 32 - 49, CompositeTableServiceHandler implements Closeable but
doesn't declare close(); add an explicit contract by declaring public abstract
void close() throws IOException in CompositeTableServiceHandler (or provide a
default implementation that closes both dataTableHandler and
metadataTableHandler if they are Closeable), so subclasses are forced to
implement/adhere to the Closeable contract; update subclasses to `@Override` the
close() accordingly.
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/handler/CompositeCleanHandler.java (1)

45-48: Consider exception-safe close to ensure both handlers are closed.

If dataTableHandler.close() throws, metadataTableHandler.close() won't be called, potentially leaking resources.

♻️ Suggested fix for exception-safe close
   @Override
   public void close() {
-    forEachHandler(CleanHandler::close);
+    Exception firstException = null;
+    try {
+      dataTableHandler.close();
+    } catch (Exception e) {
+      firstException = e;
+    }
+    try {
+      metadataTableHandler.close();
+    } catch (Exception e) {
+      if (firstException != null) {
+        firstException.addSuppressed(e);
+      } else {
+        firstException = e;
+      }
+    }
+    if (firstException != null) {
+      throw new RuntimeException("Failed to close handlers", firstException);
+    }
   }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/handler/CompositeCleanHandler.java`
around lines 45 - 48, CompositeCleanHandler.close currently calls
forEachHandler(CleanHandler::close) which will stop if the first handler throws
and prevent the other (e.g., metadataTableHandler) from closing; change close in
CompositeCleanHandler to ensure exception-safe close by invoking each handler
close in its own try/catch, collecting any thrown exception (store first thrown
as primary and add subsequent exceptions as suppressed) and rethrowing at the
end if any occurred—this ensures both dataTableHandler and metadataTableHandler
(and any handlers iterated by forEachHandler) are always closed even when one
throws.
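The suppressed-exception close pattern suggested above, as a runnable standalone sketch (closeBoth is an illustrative helper, not Hudi API):

```java
public class SafeCloseDemo {
  // Exception-safe composite close: both resources are always closed, and a
  // second failure is attached as a suppressed exception to the first.
  static void closeBoth(AutoCloseable first, AutoCloseable second) {
    Exception primary = null;
    try {
      first.close();
    } catch (Exception e) {
      primary = e;
    }
    try {
      second.close();
    } catch (Exception e) {
      if (primary != null) {
        primary.addSuppressed(e);
      } else {
        primary = e;
      }
    }
    if (primary != null) {
      throw new RuntimeException("Failed to close handlers", primary);
    }
  }

  public static void main(String[] args) {
    boolean[] secondClosed = {false};
    AutoCloseable failing = () -> {
      throw new IllegalStateException("boom");
    };
    AutoCloseable tracking = () -> secondClosed[0] = true;
    try {
      closeBoth(failing, tracking);
      throw new AssertionError("expected RuntimeException");
    } catch (RuntimeException e) {
      // The second handler was still closed despite the first one's failure.
      if (!secondClosed[0]) throw new AssertionError("second handler was skipped");
      if (!(e.getCause() instanceof IllegalStateException)) throw new AssertionError(e);
    }
    System.out.println("ok");
  }
}
```

This is the same contract try-with-resources provides for free, which is another option if the handlers can be scoped to a block.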
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/handler/CompositeCompactionCommitHandler.java (1)

44-47: Same exception-safety concern in close() as noted for CompositeCleanHandler.

If the data table handler's close() throws, the metadata table handler won't be closed. Consider the same exception-safe pattern.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/handler/CompositeCompactionCommitHandler.java`
around lines 44 - 47, The CompositeCompactionCommitHandler.close() currently
calls forEachHandler(CompactionCommitHandler::close) which can abort closing
other handlers if one throws; modify close() to invoke close on both underlying
handlers (data table and metadata table) in a fail-safe way: call each handler's
close() in its own try/catch, capture the first thrown exception and add
subsequent exceptions as suppressed (or if none thrown, do nothing), and rethrow
the first exception after both close attempts; locate
CompositeCompactionCommitHandler.close(), the forEachHandler utility, and the
underlying CompactionCommitHandler implementations to implement this
exception-safe closing behavior.
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/handler/DataTableCompactionCommitHandler.java (1)

97-101: Add null-safety or document the contract that registerMetrics() must be called before commitIfNecessary().

The compactionMetrics field is accessed in multiple places (lines 133, 143, 180, 220-221) without null checks. If commitIfNecessary() is called before registerMetrics(), an NPE will occur.

While the current usage in CompactionCommitSink.open() ensures proper initialization order, consider defensive coding:

♻️ Option 1: Add null checks at usage sites
       } finally {
         // remove commitBuffer to avoid obsolete metadata commit
         reset(instant);
-        this.compactionMetrics.markCompactionRolledBack();
+        if (this.compactionMetrics != null) {
+          this.compactionMetrics.markCompactionRolledBack();
+        }
       }
♻️ Option 2: Document the contract via Javadoc on registerMetrics
+  /**
+   * Registers compaction metrics with the given metric group.
+   * <p><b>Must be called before {@link #commitIfNecessary(CompactionCommitEvent)}.</b></p>
+   */
   @Override
   public void registerMetrics(MetricGroup metricGroup) {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/handler/DataTableCompactionCommitHandler.java`
around lines 97 - 101, The compactionMetrics field in
DataTableCompactionCommitHandler can be null if commitIfNecessary() (and other
methods that use compactionMetrics) is called before registerMetrics(); add
defensive checks and clear error messaging: in commitIfNecessary() (and any
other methods that access compactionMetrics such as where compactionMetrics.* is
invoked) verify compactionMetrics != null and if null throw an
IllegalStateException with a message like "registerMetrics() must be called
before commitIfNecessary()" (or alternately return/no-op where appropriate), and
also add a short Javadoc to registerMetrics() stating the initialization
contract so callers know to call it prior to commitIfNecessary().
hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/sink/compact/handler/TestCompositeHandlers.java (1)

96-143: Consider covering rollbackCompaction() and close() delegation too.

This suite validates collectCompactionOperations() and registerMetrics(), but the new composite plan handler also adds recovery/cleanup behavior. A small ordered verification for rollback/close would make regressions in the failure path much harder to miss.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/sink/compact/handler/TestCompositeHandlers.java`
around lines 96 - 143, Add ordered tests to verify that
CompositeCompactionPlanHandler delegates rollbackCompaction(...) and close() to
both underlying CompactionPlanHandler instances in the correct order: create
mocks for dataHandler and metadataHandler, construct
CompositeCompactionPlanHandler(dataHandler, metadataHandler), call
handler.rollbackCompaction(someCheckpointId, output) and handler.close(), then
use Mockito's InOrder (InOrder inOrder = inOrder(dataHandler, metadataHandler)) to verify
inOrder.verify(dataHandler).rollbackCompaction(same(checkpointId),
same(output)); inOrder.verify(metadataHandler).rollbackCompaction(...); and
similarly verify close() is called on dataHandler then metadataHandler, finally
verifyNoMoreInteractions on both handlers. Ensure you reuse the same unique
symbols from the diff: CompositeCompactionPlanHandler,
CompactionPlanHandler.rollbackCompaction, and close.
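The delegation-order check above can also be sketched without Mockito, by recording calls in a list. The `Handler` and `Composite` types below are illustrative stand-ins for the actual Hudi handler classes, not their real signatures:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: verify a composite delegates close() to the data handler first,
// then the metadata handler, by recording the call order in a list.
public class CompositeOrderSketch {
  interface Handler {
    void close();
  }

  static class Composite implements Handler {
    private final Handler data;
    private final Handler metadata;

    Composite(Handler data, Handler metadata) {
      this.data = data;
      this.metadata = metadata;
    }

    @Override
    public void close() {
      data.close();      // data table handler closes first
      metadata.close();  // then the metadata table handler
    }
  }

  static List<String> recordCloseOrder() {
    List<String> calls = new ArrayList<>();
    Handler data = () -> calls.add("data.close");
    Handler metadata = () -> calls.add("metadata.close");
    new Composite(data, metadata).close();
    return calls;
  }

  public static void main(String[] args) {
    System.out.println(recordCloseOrder());
  }
}
```

With Mockito available, the same assertion is more direct via `inOrder(dataHandler, metadataHandler)` plus `inOrder.verify(...)` on each mock.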
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/handler/CompactionPlanHandler.java (1)

38-45: Document or enforce the registerMetrics() lifecycle.

This abstraction makes metric wiring a separate mandatory call, but the concrete handlers dereference compactionMetrics during collectCompactionOperations(...) without a guard. A missed registration now fails as a late NPE. Please either state the ordering contract here or give implementations a no-op/fail-fast default.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/handler/CompactionPlanHandler.java`
around lines 38 - 45, The interface currently requires callers to call
registerMetrics before collectCompactionOperations but this is undocumented and
implementations dereference compactionMetrics causing NPEs; add a clear
lifecycle contract Javadoc to CompactionPlanHandler.registerMetrics stating it
must be invoked before collectCompactionOperations, and supply a safe default by
giving registerMetrics a default no-op implementation in the interface (so
implementations that don’t need metrics don’t NPE) and/or add a fail-fast
null-check at the start of collectCompactionOperations (use
Objects.requireNonNull on the compactionMetrics field in concrete handlers) so
missing registration produces an immediate, clear error instead of a late NPE;
update all concrete handlers that reference compactionMetrics to either rely on
the default no-op or explicitly guard/require registration.
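The suggested lifecycle contract can be sketched as follows; the interface and handler names are illustrative stand-ins for the Hudi types, combining a default no-op `registerMetrics` with a fail-fast null check:

```java
import java.util.Objects;

// Sketch: default no-op registerMetrics plus a fail-fast check before the
// metrics are used, so a missed registration fails immediately and clearly.
public class LifecycleContractSketch {
  interface PlanHandler {
    /** Must be invoked before {@link #collectOperations()}. No-op by default. */
    default void registerMetrics(Object metricGroup) { }

    int collectOperations();
  }

  static class MeteredHandler implements PlanHandler {
    private Object metrics; // stays null until registerMetrics is called

    @Override
    public void registerMetrics(Object metricGroup) {
      this.metrics = metricGroup;
    }

    @Override
    public int collectOperations() {
      // Fail fast with a clear message instead of a late NPE deep in the call.
      Objects.requireNonNull(metrics,
          "registerMetrics() must be called before collectOperations()");
      return 1; // stands in for the real plan collection
    }
  }

  public static void main(String[] args) {
    MeteredHandler h = new MeteredHandler();
    try {
      h.collectOperations();
    } catch (NullPointerException e) {
      System.out.println("fail-fast: " + e.getMessage());
    }
    h.registerMetrics(new Object());
    System.out.println("ops=" + h.collectOperations());
  }
}
```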
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@hudi-client/hudi-client-common/src/test/java/org/apache/hudi/table/action/rollback/TestRollbackHelper.java`:
- Around line 800-801: The expected rollback log version is off-by-one: when
versions 1..ctx.logVersionCount are created, the rollback should be at
maxExistingVersion + 1. Update the StoragePath/filename construction used for
the expected rollback log in TestRollbackHelper (the line creating
rollbackLogPath using partitionStoragePath and FileCreateUtils.logFileName) to
use ctx.logVersionCount + 1 (instead of ctx.logVersionCount) when computing the
log version so the expected rollback file reflects version 5 for
logVersionCount==4.

In
`@hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/metrics/FlinkBucketAssignMetrics.java`:
- Around line 34-37: The primitive long fields lookupCount,
indexLookupDurationMs, insertCount, and updateCount in FlinkBucketAssignMetrics
are accessed from operator code and read by Gauge callbacks on Flink's metrics
thread; replace each with java.util.concurrent.atomic.AtomicLong, initialize
them in the class, update all mutations to use incrementAndGet/addAndGet/set as
appropriate, and update gauge suppliers (registered with the MetricGroup) to
call get() on the corresponding AtomicLong so reads are thread-safe and
race-free.
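A minimal sketch of that fix, with `Supplier<Long>` standing in for Flink's `Gauge<Long>` and field names mirroring the ones cited above (this is not the actual Hudi class):

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Supplier;

// Sketch: counters held in AtomicLong so operator threads can mutate them
// while the metrics thread reads them through a gauge callback.
public class AtomicMetricsSketch {
  private final AtomicLong lookupCount = new AtomicLong();
  private final AtomicLong indexLookupDurationMs = new AtomicLong();

  // Gauges just read the current value; AtomicLong.get() is race-free.
  public Supplier<Long> lookupCountGauge() {
    return lookupCount::get;
  }

  public Supplier<Long> lookupDurationGauge() {
    return indexLookupDurationMs::get;
  }

  // Mutations from operator code use atomic increments/adds.
  public void markLookup(long durationMs) {
    lookupCount.incrementAndGet();
    indexLookupDurationMs.addAndGet(durationMs);
  }

  public static void main(String[] args) {
    AtomicMetricsSketch m = new AtomicMetricsSketch();
    m.markLookup(12);
    m.markLookup(30);
    System.out.println(m.lookupCountGauge().get() + " lookups, "
        + m.lookupDurationGauge().get() + " ms");
  }
}
```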

In
`@hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/common/AbstractStreamWriteFunction.java`:
- Around line 237-250: The code merges all WriteMetadataEvent entries from
writeMetadataState before checking pendingTimeline, which can allow
completed-instant statuses to be merged into a pending instant; change the
stream to filter entries against pendingTimeline first (e.g.,
stream(writeMetadataState.get().spliterator()).filter(e ->
pendingTimeline.containsInstant(e.getInstantTime()))), then reduce using
WriteMetadataEvent::mergeWithRescale, and keep the existing logic to
metadataEvent.setTaskID(taskID), sendWriteMetadataEvent(metadataEvent) and
log.info; this ensures only events for pending instants are merged and sent.
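The filter-before-merge ordering can be sketched like this; `Event` and `mergeWith` are illustrative stand-ins for `WriteMetadataEvent` and `mergeWithRescale`:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;

// Sketch: only events whose instant time is still pending are merged;
// events for already-completed instants are dropped before the reduce.
public class PendingMergeSketch {
  static class Event {
    final String instantTime;
    final int records;

    Event(String instantTime, int records) {
      this.instantTime = instantTime;
      this.records = records;
    }

    Event mergeWith(Event other) {
      return new Event(instantTime, records + other.records);
    }
  }

  static Optional<Event> mergePending(List<Event> restored, Set<String> pendingInstants) {
    return restored.stream()
        // Drop events for instants that already completed before restore.
        .filter(e -> pendingInstants.contains(e.instantTime))
        .reduce(Event::mergeWith);
  }

  public static void main(String[] args) {
    List<Event> restored = Arrays.asList(
        new Event("001", 10),  // completed instant, must be excluded
        new Event("002", 5),
        new Event("002", 7));
    Set<String> pending = new HashSet<>(Arrays.asList("002"));
    mergePending(restored, pending)
        .ifPresent(e -> System.out.println(e.instantTime + " -> " + e.records));
  }
}
```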

In
`@hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/handler/CompositeCompactionPlanHandler.java`:
- Around line 37-54: forEachHandler currently stops on the first exception, so
modify it to invoke each handler invocation (used by rollbackCompaction, close,
registerMetrics, collectCompactionOperations) inside its own try-catch to ensure
both handlers always run; collect any thrown exceptions (first as primary) and
add subsequent ones via addSuppressed, then after iterating rethrow the primary
exception (or return normally if none). Update forEachHandler's signature/usage
accordingly so lifecycle methods (rollbackCompaction, close, etc.) will execute
both handlers unconditionally and propagate a combined exception if any handler
failed.
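The exception-aggregating iteration can be sketched generically; `Handler` here is an illustrative stand-in for the Hudi handler interfaces:

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

// Sketch: every handler runs even if an earlier one throws; the first
// failure is rethrown with later failures attached as suppressed exceptions.
public class SafeForEachSketch {
  static <T> void forEachHandler(List<T> handlers, Consumer<T> action) {
    RuntimeException first = null;
    for (T handler : handlers) {
      try {
        action.accept(handler);
      } catch (RuntimeException e) {
        if (first == null) {
          first = e;
        } else {
          first.addSuppressed(e);
        }
      }
    }
    if (first != null) {
      throw first; // primary failure carries the rest as suppressed
    }
  }

  public static void main(String[] args) {
    List<Runnable> handlers = Arrays.asList(
        () -> { throw new IllegalStateException("data handler failed"); },
        () -> System.out.println("metadata handler still ran"));
    try {
      forEachHandler(handlers, Runnable::run);
    } catch (RuntimeException e) {
      System.out.println("rethrown: " + e.getMessage());
    }
  }
}
```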

In
`@hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/handler/DataTableCompactHandler.java`:
- Around line 127-147: The compaction metric is started in doCompaction
(compactionMetrics.startCompaction()) but endCompaction() is only called on the
happy path; wrap the main body of doCompaction (including
CompactionUtil.setAvroSchema(...), compactor.compact(...), and
collector.collect(createCommitEvent(...))) in a try block and invoke
compactionMetrics.endCompaction() in a finally block so the metric is always
finished even if exceptions are thrown; rethrow the exception after finally to
preserve behavior.
- Around line 107-110: The code submits work via executor.execute(...) which
calls doCompaction(...) and the failure callback that both invoke
collector.collect(...) from a background thread; change this so
collector.collect(...) is only invoked on the operator task thread: move
compaction work to the executor but send results back to the operator thread
(e.g., post a Runnable to the MailboxExecutor or use Flink Async I/O via
AsyncDataStream.unorderedWait/orderedWait) instead of calling
collector.collect(...) directly in doCompaction or the executor failure
callback; update references around executor.execute, doCompaction, and the
failure lambda that currently calls createFailedCommitEvent and
collector.collect to instead queue the result to MailboxExecutor (or return a
Future for AsyncDataStream) so the actual collector.collect(...) runs on the
task thread.
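The mailbox idea can be sketched with a plain queue: the heavy work runs on a background executor, but the result is handed back through a queue that only the task thread drains, so the collect call never runs off-thread. The `BlockingQueue` here is a simplified stand-in for Flink's `MailboxExecutor`:

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;

// Sketch: compaction work runs on a worker thread, but the result is posted
// as a "mail" that the task thread drains before emitting downstream.
public class MailboxSketch {
  public static String runOnce() throws Exception {
    BlockingQueue<Runnable> mailbox = new LinkedBlockingQueue<>();
    ExecutorService worker = Executors.newSingleThreadExecutor();
    StringBuilder collected = new StringBuilder(); // mutated only by the task thread

    worker.execute(() -> {
      String result = "compacted-fileGroup-1"; // expensive compaction stands in here
      // Do NOT collect here; enqueue a mail for the task thread instead.
      mailbox.add(() -> collected.append(result));
    });

    mailbox.take().run(); // task thread drains the mailbox and emits the event
    worker.shutdown();
    return collected.toString();
  }

  public static void main(String[] args) throws Exception {
    System.out.println(runOnce());
  }
}
```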

In
`@hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/handler/DefaultCleanHandler.java`:
- Around line 53-61: Reserve the cleaning slot by setting isCleaning = true
before calling executor.execute so the slot is held while the task is queued;
specifically, in DefaultCleanHandler.clean() (and the analogous block around
lines 90-99/startAsyncCleaning()) move the isCleaning assignment to just before
executor.execute and guard against executor rejection by catching exceptions
from executor.execute and resetting isCleaning = false if dispatch fails,
ensuring the existing finally in the worker thread still clears isCleaning after
HoodieFlinkWriteClient.clean() completes.

- Around line 105-115: The close() method currently closes the executor then
immediately calls writeClient.clean(), which can race with async work started by
startAsyncCleaning() because that async path bypasses the executor; modify
close() to detect in-flight async cleaning and either await its completion via
writeClient.waitForCleaningFinish() before invoking writeClient.clean() or skip
the shutdown clean entirely when async cleaning is active. Specifically, in
DefaultCleanHandler.close() after executor.close() (and before
writeClient.clean()), call the writeClient API that indicates or waits for
ongoing cleaning (e.g., writeClient.waitForCleaningFinish()) and only call
writeClient.clean() if no async cleaning is in progress, or return/skip the
extra clean when waitForCleaningFinish() completes indicating the async cleanup
already handled it.
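The reserve-before-dispatch ordering from the first point above can be sketched with an `AtomicBoolean`; the `Consumer<Runnable>` stands in for the async executor, and the class is illustrative, not the Hudi handler:

```java
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Consumer;

// Sketch: the isCleaning slot is reserved before the task is handed to the
// executor, and released again if dispatch is rejected, so a queued-but-not-
// started task still holds the slot.
public class CleanSlotSketch {
  private final AtomicBoolean isCleaning = new AtomicBoolean(false);

  public boolean tryStartCleaning(Consumer<Runnable> executor) {
    if (!isCleaning.compareAndSet(false, true)) {
      return false; // a cleaning task is already queued or running
    }
    try {
      executor.accept(() -> {
        try {
          // writeClient.clean() would run here
        } finally {
          isCleaning.set(false); // worker clears the slot when done
        }
      });
      return true;
    } catch (RejectedExecutionException e) {
      isCleaning.set(false); // dispatch failed, release the slot
      return false;
    }
  }

  public boolean isCleaning() {
    return isCleaning.get();
  }

  public static void main(String[] args) {
    CleanSlotSketch sketch = new CleanSlotSketch();
    // Executor that rejects: the slot must be released afterwards.
    boolean started = sketch.tryStartCleaning(task -> {
      throw new RejectedExecutionException("queue full");
    });
    System.out.println("started=" + started + " cleaning=" + sketch.isCleaning());
  }
}
```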

In
`@hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/handler/TableServiceHandlerFactory.java`:
- Around line 36-79: The factory methods createCompactionPlanHandler,
createCompactHandler, and createCompactionCommitHandler currently fall back to
DataTable*Handler when both OptionsResolver.needsAsyncCompaction(conf) and
OptionsResolver.needsAsyncMetadataCompaction(conf) are false; change each method
to detect the "both disabled" case first and return a no-op handler instead of a
DataTable handler (e.g. return a new NoOpCompactionPlanHandler(),
NoOpCompactHandler(), NoOpCompactionCommitHandler() respectively) so callers
that use the handler unconditionally won't perform unintended work; add minimal
NoOp* handler implementations if they don't exist and wire them into the factory
before the existing composite/metadata/data branching logic.
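The Null Object fallback suggested above can be sketched like this; the factory and handler names are illustrative stand-ins for the Hudi types:

```java
// Sketch: when neither async compaction path is enabled, the factory returns
// a handler that does nothing, so call sites can stay unconditional.
public class NoOpFactorySketch {
  interface CompactHandler {
    int compact();
  }

  static final CompactHandler NO_OP = () -> 0; // performs no work

  static CompactHandler create(boolean dataEnabled, boolean metadataEnabled) {
    if (!dataEnabled && !metadataEnabled) {
      return NO_OP; // both disabled: never a DataTable handler by accident
    }
    // ... composite / metadata / data branching would follow here
    return () -> 1;
  }

  public static void main(String[] args) {
    System.out.println("both off -> " + create(false, false).compact());
    System.out.println("data on  -> " + create(true, false).compact());
  }
}
```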

In
`@hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/hudi/common/TestSqlConf.scala`:
- Around line 157-160: The teardown currently calls unsetEnv(...) then
DFSPropertiesConfiguration.clearGlobalProps(), but if unsetEnv throws the
clearGlobalProps() call can be skipped; wrap the calls in a try/finally inside
the after block so unsetEnv is attempted and
DFSPropertiesConfiguration.clearGlobalProps() is always executed in the finally
clause (referencing unsetEnv and DFSPropertiesConfiguration.clearGlobalProps to
locate the lines).

---

Nitpick comments:
In
`@hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/metrics/FlinkRocksDBIndexMetrics.java`:
- Around line 55-58: The FlinkRocksDBIndexMetrics constructor currently accepts
a possibly null rocksDBIndexBackend which can cause deferred NPEs; update the
FlinkRocksDBIndexMetrics(MetricGroup metricGroup, RocksDBIndexBackend
rocksDBIndexBackend) constructor to validate rocksDBIndexBackend is non-null and
throw a clear exception (e.g., IllegalArgumentException or NullPointerException)
with a descriptive message if it is null, then assign to
this.rocksDBIndexBackend only after the check.

In
`@hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/handler/CompactionPlanHandler.java`:
- Around line 38-45: The interface currently requires callers to call
registerMetrics before collectCompactionOperations but this is undocumented and
implementations dereference compactionMetrics causing NPEs; add a clear
lifecycle contract Javadoc to CompactionPlanHandler.registerMetrics stating it
must be invoked before collectCompactionOperations, and supply a safe default by
giving registerMetrics a default no-op implementation in the interface (so
implementations that don’t need metrics don’t NPE) and/or add a fail-fast
null-check at the start of collectCompactionOperations (use
Objects.requireNonNull on the compactionMetrics field in concrete handlers) so
missing registration produces an immediate, clear error instead of a late NPE;
update all concrete handlers that reference compactionMetrics to either rely on
the default no-op or explicitly guard/require registration.

In
`@hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/handler/CompositeCleanHandler.java`:
- Around line 45-48: CompositeCleanHandler.close currently calls
forEachHandler(CleanHandler::close) which will stop if the first handler throws
and prevent the other (e.g., metadataTableHandler) from closing; change close in
CompositeCleanHandler to ensure exception-safe close by invoking each handler
close in its own try/catch, collecting any thrown exception (store first thrown
as primary and add subsequent exceptions as suppressed) and rethrowing at the
end if any occurred—this ensures both dataTableHandler and metadataTableHandler
(and any handlers iterated by forEachHandler) are always closed even when one
throws.

In
`@hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/handler/CompositeCompactionCommitHandler.java`:
- Around line 44-47: The CompositeCompactionCommitHandler.close() currently
calls forEachHandler(CompactionCommitHandler::close) which can abort closing
other handlers if one throws; modify close() to invoke close on both underlying
handlers (data table and metadata table) in a fail-safe way: call each handler's
close() in its own try/catch, capture the first thrown exception and add
subsequent exceptions as suppressed (or if none thrown, do nothing), and rethrow
the first exception after both close attempts; locate
CompositeCompactionCommitHandler.close(), the forEachHandler utility, and the
underlying CompactionCommitHandler implementations to implement this
exception-safe closing behavior.

In
`@hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/handler/CompositeTableServiceHandler.java`:
- Around line 32-49: CompositeTableServiceHandler implements Closeable but
doesn't declare close(); add an explicit contract by declaring public abstract
void close() throws IOException in CompositeTableServiceHandler (or provide a
default implementation that closes both dataTableHandler and
metadataTableHandler if they are Closeable), so subclasses are forced to
implement/adhere to the Closeable contract; update subclasses to `@Override` the
close() accordingly.

In
`@hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/handler/DataTableCompactionCommitHandler.java`:
- Around line 97-101: The compactionMetrics field in
DataTableCompactionCommitHandler can be null if commitIfNecessary() (and other
methods that use compactionMetrics) is called before registerMetrics(); add
defensive checks and clear error messaging: in commitIfNecessary() (and any
other methods that access compactionMetrics such as where compactionMetrics.* is
invoked) verify compactionMetrics != null and if null throw an
IllegalStateException with a message like "registerMetrics() must be called
before commitIfNecessary()" (or alternately return/no-op where appropriate), and
also add a short Javadoc to registerMetrics() stating the initialization
contract so callers know to call it prior to commitIfNecessary().

In
`@hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/metrics/TestFlinkBucketAssignMetrics.java`:
- Around line 63-90: Add a unit test that constructs FlinkBucketAssignMetrics
with indexLookupSamplingRate <= 0 (e.g., 0 and negative) to assert the
constructor clamps the value and does not cause modulo-by-zero in
shouldSampleLookup; call registerMetrics(), invoke shouldSampleLookup multiple
times to verify deterministic sampling behavior (no exception), and exercise
recordLookupDuration/markInsert/markUpdate to validate gauges still update;
reference the FlinkBucketAssignMetrics constructor and shouldSampleLookup method
and reuse the existing MetricGroup/gauges mocking pattern from
testLookupSamplingAndCounts to capture and assert gauge values.

In
`@hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/sink/compact/handler/TestCompositeHandlers.java`:
- Around line 96-143: Add ordered tests to verify that
CompositeCompactionPlanHandler delegates rollbackCompaction(...) and close() to
both underlying CompactionPlanHandler instances in the correct order: create
mocks for dataHandler and metadataHandler, construct
CompositeCompactionPlanHandler(dataHandler, metadataHandler), call
handler.rollbackCompaction(someCheckpointId, output) and handler.close(), then
use Mockito's InOrder (InOrder inOrder = inOrder(dataHandler, metadataHandler)) to verify
inOrder.verify(dataHandler).rollbackCompaction(same(checkpointId),
same(output)); inOrder.verify(metadataHandler).rollbackCompaction(...); and
similarly verify close() is called on dataHandler then metadataHandler, finally
verifyNoMoreInteractions on both handlers. Ensure you reuse the same unique
symbols from the diff: CompositeCompactionPlanHandler,
CompactionPlanHandler.rollbackCompaction, and close.

In
`@hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/table/catalog/TestHoodieHiveCatalog.java`:
- Around line 394-405: In testLanceFormatNotSupportedByHiveCatalog(), replace
the hard-coded error string in the assertion with the shared constant
HoodieFileFormat.LANCE_SPARK_ONLY_ERROR_MSG: change the containsString("Lance
base file format is currently only supported with the Spark engine") check to
containsString(HoodieFileFormat.LANCE_SPARK_ONLY_ERROR_MSG) (add or update the
import for HoodieFileFormat if necessary) so the test asserts against the
canonical constant instead of a literal.

In
`@hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/table/ITTestHoodieDataSource.java`:
- Around line 3274-3283: The test currently only asserts the row count for the
par1 filter (result1) which can hide wrong rows; replace the size-only check
with a deterministic content assertion using the existing helper
assertRowsEquals to compare result1 against the expected single-row string for
partition 'par1'. Locate the code that builds result1 (the
tableEnv.sqlQuery("select * from t1 where `partition` = 'par1' limit
1").execute().collect()) in ITTestHoodieDataSource and change
assertThat(result1).hasSize(1) to assertRowsEquals(result1, "<expected
single-row string matching the known par1 record>") so the test verifies exact
row content rather than just count.

In `@hudi-io/src/test/java/org/apache/hudi/io/util/TestIOUtils.java`:
- Around line 98-100: The test's encode/decode vector in TestIOUtils misses
negative cases and the sign-handling branch of writeVarInt; update the
testValues array used in the loop (the array defined in TestIOUtils) to include
negative and boundary integers (e.g., -1, -128, -129, -1000, Integer.MIN_VALUE)
so the round-trip via writeVarInt/readVarInt (or the corresponding encode/decode
methods) exercises the negative-paths and integer boundary behavior.
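The kind of negative and boundary round-trips being asked for can be sketched with an illustrative zigzag varint codec. Hudi's IOUtils may use a different wire format, so this shows the test idea, not its actual encoding:

```java
import java.io.ByteArrayOutputStream;

// Sketch: zigzag varint encode/decode round-tripped over negative and
// boundary integers, the cases the existing test vector misses.
public class VarIntRoundTripSketch {
  static byte[] writeVarInt(int value) {
    int zigzag = (value << 1) ^ (value >> 31); // fold the sign into the low bit
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    long v = zigzag & 0xFFFFFFFFL;
    while ((v & ~0x7FL) != 0) {
      out.write((int) ((v & 0x7F) | 0x80)); // 7 payload bits + continuation bit
      v >>>= 7;
    }
    out.write((int) v);
    return out.toByteArray();
  }

  static int readVarInt(byte[] bytes) {
    long zigzag = 0;
    int shift = 0;
    for (byte b : bytes) {
      zigzag |= (long) (b & 0x7F) << shift;
      shift += 7;
      if ((b & 0x80) == 0) {
        break;
      }
    }
    int z = (int) zigzag;
    return (z >>> 1) ^ -(z & 1); // undo the zigzag fold
  }

  public static void main(String[] args) {
    int[] testValues = {0, 1, -1, 127, -128, -129, -1000,
        Integer.MIN_VALUE, Integer.MAX_VALUE};
    for (int v : testValues) {
      int back = readVarInt(writeVarInt(v));
      if (back != v) {
        throw new AssertionError(v + " round-tripped to " + back);
      }
    }
    System.out.println("all round-trips ok");
  }
}
```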
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: ae81fb9a-9969-4fe8-a551-e89efe5eb175

📥 Commits

Reviewing files that changed from the base of the PR and between 4c19cc1 and 12bfc6e.

📒 Files selected for processing (73)
  • hudi-cli/src/main/java/org/apache/hudi/cli/utils/InputStreamConsumer.java
  • hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/rollback/BaseRollbackActionExecutor.java
  • hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/rollback/MarkerBasedRollbackStrategy.java
  • hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/rollback/RollbackHelper.java
  • hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/rollback/RollbackHelperFactory.java
  • hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/rollback/RollbackHelperV1.java
  • hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/upgrade/ZeroToOneUpgradeHandler.java
  • hudi-client/hudi-client-common/src/test/java/org/apache/hudi/table/action/rollback/TestRollbackHelper.java
  • hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/table/functional/TestMarkerBasedRollbackStrategy.java
  • hudi-common/src/main/java/org/apache/hudi/common/model/HoodieFileFormat.java
  • hudi-common/src/main/java/org/apache/hudi/common/table/timeline/versioning/v1/ArchivedTimelineV1.java
  • hudi-common/src/main/java/org/apache/hudi/common/util/collection/RocksDBDAO.java
  • hudi-common/src/main/java/org/apache/hudi/io/storage/HoodieFileReaderFactory.java
  • hudi-common/src/main/java/org/apache/hudi/io/storage/HoodieFileWriterFactory.java
  • hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/configuration/FlinkOptions.java
  • hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/metrics/FlinkBucketAssignMetrics.java
  • hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/metrics/FlinkMdtCompactionMetrics.java
  • hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/metrics/FlinkRocksDBIndexMetrics.java
  • hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/CleanFunction.java
  • hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/common/AbstractStreamWriteFunction.java
  • hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/CompactOperator.java
  • hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/CompactionCommitSink.java
  • hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/CompactionPlanOperator.java
  • hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/handler/CleanHandler.java
  • hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/handler/CompactHandler.java
  • hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/handler/CompactionCommitHandler.java
  • hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/handler/CompactionPlanHandler.java
  • hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/handler/CompositeCleanHandler.java
  • hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/handler/CompositeCompactHandler.java
  • hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/handler/CompositeCompactionCommitHandler.java
  • hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/handler/CompositeCompactionPlanHandler.java
  • hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/handler/CompositeTableServiceHandler.java
  • hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/handler/DataTableCompactHandler.java
  • hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/handler/DataTableCompactionCommitHandler.java
  • hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/handler/DataTableCompactionPlanHandler.java
  • hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/handler/DefaultCleanHandler.java
  • hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/handler/MetadataTableCompactHandler.java
  • hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/handler/MetadataTableCompactionCommitHandler.java
  • hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/handler/MetadataTableCompactionPlanHandler.java
  • hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/handler/TableServiceHandlerFactory.java
  • hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/event/WriteMetadataEvent.java
  • hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/partitioner/BucketAssignFunction.java
  • hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/partitioner/index/IndexBackend.java
  • hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/partitioner/index/RocksDBIndexBackend.java
  • hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/HoodieTableFactory.java
  • hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/catalog/HoodieHiveCatalog.java
  • hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/format/FlinkRowDataReaderContext.java
  • hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/metrics/TestFlinkBucketAssignMetrics.java
  • hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/metrics/TestFlinkCompactionMetrics.java
  • hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/sink/compact/handler/TestCompositeHandlers.java
  • hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/sink/event/TestWriteMetadataEvent.java
  • hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/sink/partitioner/index/TestRocksDBIndexBackend.java
  • hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/table/ITTestHoodieDataSource.java
  • hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/table/TestHoodieTableFactory.java
  • hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/table/catalog/TestHoodieHiveCatalog.java
  • hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/table/format/TestFlinkRowDataReaderContext.java
  • hudi-gcp/src/main/java/org/apache/hudi/gcp/bigquery/BigQuerySchemaResolver.java
  • hudi-gcp/src/test/java/org/apache/hudi/gcp/bigquery/TestBigQuerySchemaResolver.java
  • hudi-hadoop-common/src/test/java/org/apache/hudi/common/table/timeline/TestArchivedTimelineV1.java
  • hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/HiveHoodieReaderContext.java
  • hudi-io/src/main/java/org/apache/hudi/io/hfile/HFileMetaIndexBlock.java
  • hudi-io/src/main/java/org/apache/hudi/io/hfile/HFileRootIndexBlock.java
  • hudi-io/src/main/java/org/apache/hudi/io/util/IOUtils.java
  • hudi-io/src/test/java/org/apache/hudi/io/hfile/TestHFileWriter.java
  • hudi-io/src/test/java/org/apache/hudi/io/util/TestIOUtils.java
  • hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/functional/TestColumnStatsIndex.scala
  • hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/hudi/functional/TestPartitionStatsPruning.scala
  • hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/hudi/common/TestSqlConf.scala
  • hudi-sync/hudi-hive-sync/src/main/java/org/apache/hudi/hive/util/HiveSchemaUtil.java
  • hudi-sync/hudi-hive-sync/src/test/java/org/apache/hudi/hive/TestSparkSchemaUtils.java
  • hudi-sync/hudi-hive-sync/src/test/java/org/apache/hudi/hive/util/TestHiveSchemaUtil.java
  • hudi-sync/hudi-sync-common/src/main/java/org/apache/hudi/sync/common/util/SparkSchemaUtils.java
  • pom.xml
✅ Files skipped from review due to trivial changes (3)
  • hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/handler/CompactionCommitHandler.java
  • hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/metrics/TestFlinkCompactionMetrics.java
  • hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/sink/event/TestWriteMetadataEvent.java
🚧 Files skipped from review as they are similar to previous changes (7)
  • hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/partitioner/index/IndexBackend.java
  • hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/configuration/FlinkOptions.java
  • hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/sink/partitioner/index/TestRocksDBIndexBackend.java
  • hudi-common/src/main/java/org/apache/hudi/common/util/collection/RocksDBDAO.java
  • hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/partitioner/index/RocksDBIndexBackend.java
  • hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/table/TestHoodieTableFactory.java
  • hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/HoodieTableFactory.java

Comment on lines +800 to +801
StoragePath rollbackLogPath = new StoragePath(partitionStoragePath,
FileCreateUtils.logFileName(ctx.baseInstantTimeOfLogFiles, ctx.logFileId, ctx.logVersionCount));
⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify the rollback log version behavior by checking RollbackHelperV1 implementation
ast-grep --pattern $'class RollbackHelperV1 {
  $$$
}'

# Also check the log file version computation logic
rg -n "logVersion|LOGFILE_BASE_VERSION|nextVersion" --type java -C3 -g '*RollbackHelper*'

Repository: yihua/hudi

Length of output: 23240


🏁 Script executed:

# Get the full test method around line 800-801 to see what assertion is made
sed -n '750,820p' hudi-client/hudi-client-common/src/test/java/org/apache/hudi/table/action/rollback/TestRollbackHelper.java

Repository: yihua/hudi

Length of output: 3349


🏁 Script executed:

# Check the test at line 737 (testV1MaybeDeleteAndCollectStatsWithSingleRequestPerFileGroup)
sed -n '720,750p' hudi-client/hudi-client-common/src/test/java/org/apache/hudi/table/action/rollback/TestRollbackHelper.java

Repository: yihua/hudi

Length of output: 1705


🏁 Script executed:

# Check the test at line 702 (testV1MaybeDeleteAndCollectStatsWithMultipleRequestsPerFileGroup)
sed -n '685,715p' hudi-client/hudi-client-common/src/test/java/org/apache/hudi/table/action/rollback/TestRollbackHelper.java

Repository: yihua/hudi

Length of output: 1899


🏁 Script executed:

# Find the ROLLBACK_LOG_VERSION constant definition
rg -n "ROLLBACK_LOG_VERSION" hudi-client/hudi-client-common/src/test/java/org/apache/hudi/table/action/rollback/TestRollbackHelper.java

Repository: yihua/hudi

Length of output: 500


🏁 Script executed:

# Check the SingleRequestTestContext and MultiRequestTestContext to see what log files they create
sed -n '400,470p' hudi-client/hudi-client-common/src/test/java/org/apache/hudi/table/action/rollback/TestRollbackHelper.java

Repository: yihua/hudi

Length of output: 3798


🏁 Script executed:

# Find SingleRequestTestContext and MultiRequestTestContext builders
sed -n '80,200p' hudi-client/hudi-client-common/src/test/java/org/apache/hudi/table/action/rollback/TestRollbackHelper.java

Repository: yihua/hudi

Length of output: 6332


🏁 Script executed:

# Get the full assertion at lines 802-811
sed -n '802,811p' hudi-client/hudi-client-common/src/test/java/org/apache/hudi/table/action/rollback/TestRollbackHelper.java

Repository: yihua/hudi

Length of output: 554


🏁 Script executed:

# Let's also check if there's any actual file system check on the rollbackLogPath
sed -n '788,815p' hudi-client/hudi-client-common/src/test/java/org/apache/hudi/table/action/rollback/TestRollbackHelper.java

Repository: yihua/hudi

Length of output: 1474


🏁 Script executed:

# Check if the test verifies the rollback log exists and what version it expects
rg -n "storage.exists.*rollbackLogPath\|storage.getPathInfo.*rollbackLogPath" hudi-client/hudi-client-common/src/test/java/org/apache/hudi/table/action/rollback/TestRollbackHelper.java -A2 -B2

Repository: yihua/hudi

Length of output: 36


🏁 Script executed:

# Let's understand what maybeDeleteAndCollectStats actually does - check if it creates the rollback log
rg -n "getOutputStream\|nextVersion\|rolloverVersion" hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/rollback/RollbackHelperV1.java -A2 -B2

Repository: yihua/hudi

Length of output: 36


🏁 Script executed:

# Search for where rollback logs are written in RollbackHelperV1
rg -n "getOutputStream\|HoodieLogFile\|logFileName" hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/rollback/RollbackHelperV1.java -A1 -B1 | head -80

Repository: yihua/hudi

Length of output: 36


🏁 Script executed:

# Check what storage.getPathInfo does and if it requires the file to exist
rg -n "getPathInfo" hudi-client/hudi-client-common/src/test/java/org/apache/hudi/table/action/rollback/TestRollbackHelper.java -B2 -A2

Repository: yihua/hudi

Length of output: 1405


🏁 Script executed:

# Search for HoodieLogFile instantiation and where logs are written
rg -n "new HoodieLogFile\|\.write\|getOutputStream" hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/rollback/ -A2 -B2 | head -100

Repository: yihua/hudi

Length of output: 36


🏁 Script executed:

# Check the HoodieLogFormat or HoodieLogFile class for how it determines next version
fd "HoodieLogFile.java" hudi-client/hudi-client-common/src/main/java/ -exec cat {} \; | head -150

Repository: yihua/hudi

Length of output: 200


🏁 Script executed:

# Find HoodieLogFile.java
fd "HoodieLogFile.java" hudi-client/hudi-client-common/src/main/java/

Repository: yihua/hudi

Length of output: 36


🏁 Script executed:

# Look for where rollback logs are actually created and at what version
rg -n "HoodieLogFile\.Builder\|withLogVersion\|logVersion.*=" hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/rollback/RollbackHelperV1.java -A2 -B2

Repository: yihua/hudi

Length of output: 36


🏁 Script executed:

# Look at how the rollback log version is computed
sed -n '290,350p' hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/rollback/RollbackHelperV1.java

Repository: yihua/hudi

Length of output: 4147


🏁 Script executed:

# Check if there's a comment or logic about version rollover
rg -n "rollover\|nextVersion\|logVersion\s*=" hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/rollback/RollbackHelperV1.java -B3 -A3 | head -60

Repository: yihua/hudi

Length of output: 36


🏁 Script executed:

# Check what getPathInfo does - does it require the file to exist?
rg -n "getPathInfo" hudi-client/hudi-client-common/src/main/java/org/apache/hudi/storage/ -A3 | head -50

Repository: yihua/hudi

Length of output: 163


🏁 Script executed:

# Let's trace through the actual test to understand what files should exist after rollback
# Check if there are any assertions that verify the file exists at line 801
sed -n '794,815p' hudi-client/hudi-client-common/src/test/java/org/apache/hudi/table/action/rollback/TestRollbackHelper.java

Repository: yihua/hudi

Length of output: 1096


🏁 Script executed:

# Look at lines around 135-136 where max version is stored
sed -n '130,150p' hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/rollback/RollbackHelperV1.java

Repository: yihua/hudi

Length of output: 1156


🏁 Script executed:

# Check the comment about rollover again and surrounding context
sed -n '325,340p' hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/rollback/RollbackHelperV1.java

Repository: yihua/hudi

Length of output: 991


🏁 Script executed:

# Search for getPathInfo implementation to see if it requires the file to exist
fd -type f -name "*.java" hudi-client/hudi-client-common/src/main/java | xargs rg -l "getPathInfo" | head -5

Repository: yihua/hudi

Length of output: 784


🏁 Script executed:

# Check the test assertion to see if it validates file existence
rg -n "assertRollbackStatsEquals" hudi-client/hudi-client-common/src/test/java/org/apache/hudi/table/action/rollback/TestRollbackHelper.java -A10 | head -30

Repository: yihua/hudi

Length of output: 1515


🏁 Script executed:

# Let me verify the exact pattern. In the MultiRequestTestContext:
# logFileId1: IntStream.of(1) creates version 1
# Expected rollback at line 704: version 2
# That's 1 file at version 1 → rollback at version 2 (1+1)

# In SingleRequestTestContext:
# IntStream.range(1, ROLLBACK_LOG_VERSION) where ROLLBACK_LOG_VERSION=20
# That's versions 1-19
# Expected rollback at line 737: version 20 (19+1)

# In V1LogBlockTestContext:
# IntStream.range(1, ctx.logVersionCount + 1) where logVersionCount=4
# That's versions 1-4
# Expected rollback at line 801: version 4 (which should be 4+1=5)

# Confirm ROLLBACK_LOG_VERSION value
rg -n "ROLLBACK_LOG_VERSION\s*=" hudi-client/hudi-client-common/src/test/java/org/apache/hudi/table/action/rollback/TestRollbackHelper.java

Repository: yihua/hudi

Length of output: 110


🏁 Script executed:

# Verify the exact log versions created in MultiRequestTestContext
sed -n '118,122p' hudi-client/hudi-client-common/src/test/java/org/apache/hudi/table/action/rollback/TestRollbackHelper.java

Repository: yihua/hudi

Length of output: 511


Fix off-by-one error in expected rollback log version.

Versions 1–4 are created via IntStream.range(1, ctx.logVersionCount + 1). Following the pattern established elsewhere in the file, where the rollback log is written at version maxExistingVersion + 1:

  • Line 737: versions 1–19 exist → rollback at version 20
  • Line 704: version 1 exists → rollback at version 2

The expected rollback log here should therefore be at version 5, not version 4.

Proposed fix
 StoragePath rollbackLogPath = new StoragePath(partitionStoragePath,
-    FileCreateUtils.logFileName(ctx.baseInstantTimeOfLogFiles, ctx.logFileId, ctx.logVersionCount));
+    FileCreateUtils.logFileName(ctx.baseInstantTimeOfLogFiles, ctx.logFileId, ctx.logVersionCount + 1));
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@hudi-client/hudi-client-common/src/test/java/org/apache/hudi/table/action/rollback/TestRollbackHelper.java`
around lines 800 - 801, The expected rollback log version is off-by-one: when
versions 1..ctx.logVersionCount are created, the rollback should be at
maxExistingVersion + 1. Update the StoragePath/filename construction used for
the expected rollback log in TestRollbackHelper (the line creating
rollbackLogPath using partitionStoragePath and FileCreateUtils.logFileName) to
use ctx.logVersionCount + 1 (instead of ctx.logVersionCount) when computing the
log version so the expected rollback file reflects version 5 for
logVersionCount==4.
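
The version arithmetic the review relies on can be sketched in isolation. This is a hypothetical helper (not a Hudi API): given the versions of the existing log files in a file group, the rollback block goes into a brand-new log file at max(existing) + 1.

```java
import java.util.stream.IntStream;

public class RollbackLogVersion {
  // Rollback logs are written to a new log file one past the highest
  // existing version in the file group.
  static int nextRollbackVersion(int[] existingVersions) {
    return IntStream.of(existingVersions).max().orElse(0) + 1;
  }

  public static void main(String[] args) {
    // Mirrors the three cases cited in the review:
    System.out.println(nextRollbackVersion(IntStream.rangeClosed(1, 19).toArray())); // versions 1-19 -> 20
    System.out.println(nextRollbackVersion(new int[] {1}));                          // version 1 -> 2
    System.out.println(nextRollbackVersion(IntStream.rangeClosed(1, 4).toArray()));  // versions 1-4 -> 5, not 4
  }
}
```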

Comment on lines +34 to +37
private long lookupCount;
private long indexLookupDurationMs;
private long insertCount;
private long updateCount;

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Read-only verification: confirm current implementation uses unsynchronized mutable longs
# and where they are read/written.
rg -n "private long (lookupCount|indexLookupDurationMs|insertCount|updateCount)|gauge\\(|\\+= 1|indexLookupDurationMs\\s*=" \
  hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/metrics/FlinkBucketAssignMetrics.java

Repository: yihua/hudi

Length of output: 557


🏁 Script executed:

cat -n hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/metrics/FlinkBucketAssignMetrics.java

Repository: yihua/hudi

Length of output: 2763


🏁 Script executed:

fd -type f -name "HoodieFlinkMetrics.java" | head -5

Repository: yihua/hudi

Length of output: 224


🏁 Script executed:

fd --type f -name "HoodieFlinkMetrics.java"

Repository: yihua/hudi

Length of output: 284


🏁 Script executed:

find . -name "HoodieFlinkMetrics.java"

Repository: yihua/hudi

Length of output: 150


🏁 Script executed:

cat -n ./hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/metrics/HoodieFlinkMetrics.java

Repository: yihua/hudi

Length of output: 2337


🏁 Script executed:

rg "FlinkBucketAssignMetrics" --type java -B 3 -A 3

Repository: yihua/hudi

Length of output: 11020


🏁 Script executed:

rg "AtomicLong|volatile long" --type java hudi-flink-datasource/ | head -20

Repository: yihua/hudi

Length of output: 1171


🏁 Script executed:

cat -n hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/source/reader/RecordLimiter.java | head -60

Repository: yihua/hudi

Length of output: 2709


Make gauge backing fields thread-safe.

These counters and duration fields are mutated by operator methods and read by Gauge callbacks registered with Flink's MetricGroup, which runs on a separate metrics reporting thread. This creates a data race that can yield torn or stale metric values. Switch to AtomicLong for thread-safe access.

🔧 Suggested fix
 import org.apache.flink.metrics.Gauge;
 import org.apache.flink.metrics.MetricGroup;
+import java.util.concurrent.atomic.AtomicLong;
@@
-  private long lookupCount;
-  private long indexLookupDurationMs;
-  private long insertCount;
-  private long updateCount;
+  private final AtomicLong lookupCount = new AtomicLong();
+  private final AtomicLong indexLookupDurationMs = new AtomicLong();
+  private final AtomicLong insertCount = new AtomicLong();
+  private final AtomicLong updateCount = new AtomicLong();
@@
-    metricGroup.gauge(INDEX_LOOKUP_DURATION_MS, (Gauge<Long>) () -> indexLookupDurationMs);
-    metricGroup.gauge(INSERT_COUNT, (Gauge<Long>) () -> insertCount);
-    metricGroup.gauge(UPDATE_COUNT, (Gauge<Long>) () -> updateCount);
+    metricGroup.gauge(INDEX_LOOKUP_DURATION_MS, (Gauge<Long>) indexLookupDurationMs::get);
+    metricGroup.gauge(INSERT_COUNT, (Gauge<Long>) insertCount::get);
+    metricGroup.gauge(UPDATE_COUNT, (Gauge<Long>) updateCount::get);
@@
-    lookupCount += 1;
-    return lookupCount % indexLookupSamplingRate == 0;
+    return lookupCount.incrementAndGet() % indexLookupSamplingRate == 0;
@@
-    this.indexLookupDurationMs = lookupDurationMs;
+    this.indexLookupDurationMs.set(lookupDurationMs);
@@
-    insertCount += 1;
+    insertCount.incrementAndGet();
@@
-    updateCount += 1;
+    updateCount.incrementAndGet();
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/metrics/FlinkBucketAssignMetrics.java`
around lines 34 - 37, The primitive long fields lookupCount,
indexLookupDurationMs, insertCount, and updateCount in FlinkBucketAssignMetrics
are accessed from operator code and read by Gauge callbacks on Flink's metrics
thread; replace each with java.util.concurrent.atomic.AtomicLong, initialize
them in the class, update all mutations to use incrementAndGet/addAndGet/set as
appropriate, and update gauge suppliers (registered with the MetricGroup) to
call get() on the corresponding AtomicLong so reads are thread-safe and
race-free.
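
The pattern in the suggested fix can be shown without any Flink dependency. This sketch uses a plain `LongSupplier` as a stand-in for Flink's `Gauge<Long>`; the class and method names here are illustrative, not the Hudi ones.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

public class AtomicGaugeSketch {
  // AtomicLong gives the metrics thread a race-free view of a counter
  // that the operator thread mutates.
  private final AtomicLong insertCount = new AtomicLong();

  // Stand-in for registering a Gauge<Long>: the reporter thread polls this supplier.
  LongSupplier insertCountGauge() {
    return insertCount::get;
  }

  void onInsert() {
    insertCount.incrementAndGet();
  }

  public static void main(String[] args) {
    AtomicGaugeSketch metrics = new AtomicGaugeSketch();
    for (int i = 0; i < 3; i++) {
      metrics.onInsert();
    }
    System.out.println(metrics.insertCountGauge().getAsLong());
  }
}
```

The same `AtomicLong::get` method reference slots directly into `metricGroup.gauge(name, (Gauge<Long>) counter::get)` in the real operator.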

Comment on lines +237 to +250
StreamSupport.stream(this.writeMetadataState.get().spliterator(), false)
.reduce(WriteMetadataEvent::mergeWithRescale)
.ifPresent(metadataEvent -> {
// Must filter out the completed instants in case it is a partial failover,
// the write status should not be accumulated in such case.
if (pendingTimeline.containsInstant(metadataEvent.getInstantTime())) {
// Reset taskID for event
metadataEvent.setTaskID(taskID);
// The checkpoint succeed but the meta does not commit,
// re-commit the inflight instant
sendWriteMetadataEvent(metadataEvent);
log.info("Send uncommitted write metadata event to coordinator, task[{}].", taskID);
}
});

⚠️ Potential issue | 🔴 Critical

Filter pending instants before merge to avoid cross-instant status accumulation.

At Line 237-239, events are merged first; at Line 240-243, filtering happens after. That can accumulate completed-instant writeStatuses into a pending instant during restore/rescale.

🐛 Proposed fix
+import java.util.stream.Collectors;
 import java.util.stream.StreamSupport;
@@
-        StreamSupport.stream(this.writeMetadataState.get().spliterator(), false)
-            .reduce(WriteMetadataEvent::mergeWithRescale)
-            .ifPresent(metadataEvent -> {
-              // Must filter out the completed instants in case it is a partial failover,
-              // the write status should not be accumulated in such case.
-              if (pendingTimeline.containsInstant(metadataEvent.getInstantTime())) {
-                // Reset taskID for event
-                metadataEvent.setTaskID(taskID);
-                // The checkpoint succeed but the meta does not commit,
-                // re-commit the inflight instant
-                sendWriteMetadataEvent(metadataEvent);
-                log.info("Send uncommitted write metadata event to coordinator, task[{}].", taskID);
-              }
-            });
+        StreamSupport.stream(this.writeMetadataState.get().spliterator(), false)
+            .filter(event -> pendingTimeline.containsInstant(event.getInstantTime()))
+            .collect(Collectors.groupingBy(WriteMetadataEvent::getInstantTime))
+            .values()
+            .forEach(events -> events.stream()
+                .reduce(WriteMetadataEvent::mergeWithRescale)
+                .ifPresent(metadataEvent -> {
+                  // Reset taskID for event
+                  metadataEvent.setTaskID(taskID);
+                  // The checkpoint succeed but the meta does not commit,
+                  // re-commit the inflight instant
+                  sendWriteMetadataEvent(metadataEvent);
+                  log.info("Send uncommitted write metadata event to coordinator, task[{}].", taskID);
+                }));
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/common/AbstractStreamWriteFunction.java`
around lines 237 - 250, The code merges all WriteMetadataEvent entries from
writeMetadataState before checking pendingTimeline, which can allow
completed-instant statuses to be merged into a pending instant; change the
stream to filter entries against pendingTimeline first (e.g.,
stream(writeMetadataState.get().spliterator()).filter(e ->
pendingTimeline.containsInstant(e.getInstantTime()))), then reduce using
WriteMetadataEvent::mergeWithRescale, and keep the existing logic to
metadataEvent.setTaskID(taskID), sendWriteMetadataEvent(metadataEvent) and
log.info; this ensures only events for pending instants are merged and sent.

Comment on lines +37 to +54
public void rollbackCompaction() {
forEachHandler(CompactionPlanHandler::rollbackCompaction);
}

@Override
public void registerMetrics(MetricGroup metricGroup) {
forEachHandler(compactionPlanHandler -> compactionPlanHandler.registerMetrics(metricGroup));
}

@Override
public void collectCompactionOperations(long checkpointId,
Output<StreamRecord<CompactionPlanEvent>> output) {
forEachHandler(compactionPlanHandler -> compactionPlanHandler.collectCompactionOperations(checkpointId, output));
}

@Override
public void close() {
forEachHandler(CompactionPlanHandler::close);

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

FILE=$(fd 'CompositeTableServiceHandler.java' hudi-flink-datasource/hudi-flink/src/main/java | head -n1)
sed -n '1,220p' "$FILE"

echo
echo "---- usages ----"
rg -n -C2 'forEachHandler\s*\(' hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/handler

Repository: yihua/hudi

Length of output: 11386


🏁 Script executed:

rg -n 'forEachHandler' hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/handler --type java -A5 -B5 | head -100

Repository: yihua/hudi

Length of output: 13741


🏁 Script executed:

# Check if there are any try-catch blocks wrapping the handler invocations
rg -n 'try\s*\{' hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/handler/CompositeCompactionPlanHandler.java -A10

Repository: yihua/hudi

Length of output: 36


Add exception handling to forEachHandler() to ensure both handlers execute in lifecycle methods.

forEachHandler() stops at the first exception, preventing the metadata handler from running if the data handler fails. For critical operations like rollbackCompaction() and close(), this leaves the second write client or metadata table in an inconsistent state. Wrap each handler invocation in forEachHandler() in its own try-catch so both handlers always run, then rethrow the first exception with any later failures attached via addSuppressed.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/handler/CompositeCompactionPlanHandler.java`
around lines 37 - 54, forEachHandler currently stops on the first exception, so
modify it to invoke each handler invocation (used by rollbackCompaction, close,
registerMetrics, collectCompactionOperations) inside its own try-catch to ensure
both handlers always run; collect any thrown exceptions (first as primary) and
add subsequent ones via addSuppressed, then after iterating rethrow the primary
exception (or return normally if none). Update forEachHandler's signature/usage
accordingly so lifecycle methods (rollbackCompaction, close, etc.) will execute
both handlers unconditionally and propagate a combined exception if any handler
failed.
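
The run-all-then-rethrow pattern the fix describes looks like this in plain Java (the class and method names are illustrative, not the Hudi handler types):

```java
import java.util.ArrayList;
import java.util.List;

public class RunAllHandlers {
  // Runs every handler even if earlier ones throw; the first failure is
  // rethrown with later failures attached via addSuppressed.
  static void forEachHandler(List<Runnable> handlers) {
    RuntimeException primary = null;
    for (Runnable handler : handlers) {
      try {
        handler.run();
      } catch (RuntimeException e) {
        if (primary == null) {
          primary = e;
        } else {
          primary.addSuppressed(e);
        }
      }
    }
    if (primary != null) {
      throw primary;
    }
  }

  public static void main(String[] args) {
    List<String> ran = new ArrayList<>();
    try {
      forEachHandler(List.of(
          () -> { ran.add("data"); throw new RuntimeException("data failed"); },
          () -> ran.add("metadata")));   // still runs despite the earlier failure
    } catch (RuntimeException e) {
      System.out.println(ran + " " + e.getMessage());
    }
  }
}
```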

Comment on lines +107 to +110
if (executor != null) {
executor.execute(
() -> doCompaction(event, collector, needReloadMetaClient),
(errMsg, t) -> collector.collect(createFailedCommitEvent(event)),

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🌐 Web query:

In Apache Flink, is org.apache.flink.util.Collector safe to call from a background thread outside the operator or mailbox thread, and what is the recommended pattern for emitting results from asynchronous operator work?

💡 Result:

No, org.apache.flink.util.Collector is not safe to call directly from a background thread outside the operator or mailbox thread. Flink enforces a single-threaded execution model for user code (processElement, callbacks, onTimer, etc.) within the task thread, as per FLIP-425 and async state documentation. Direct calls from external threads risk concurrency issues, data races, or violations of this model. The recommended pattern for emitting results from asynchronous operator work is to use Flink's Async I/O API (AsyncDataStream.unorderedWait or orderedWait). Implement an AsyncFunction where asyncInvoke returns quickly and completes a provided ResultFuture.complete(collectionOfResults) from the async callback. The operator handles buffering, ordering (if ordered), timeouts, fault tolerance, and emission via its internal emitter on the task thread. This ensures thread-safety and exactly-once guarantees. For State V2 async access, callbacks like StateFuture.thenAccept execute on the single task thread, allowing safe collector use there, but not from arbitrary background threads.

Citations:


🏁 Script executed:

find . -type f -name "DataTableCompactHandler.java" | head -5

Repository: yihua/hudi

Length of output: 168


🏁 Script executed:

find . -type f -name "DataTableCompactHandler.java" -exec wc -l {} \;

Repository: yihua/hudi

Length of output: 172


🏁 Script executed:

fd DataTableCompactHandler.java -x wc -l

Repository: yihua/hudi

Length of output: 172


🏁 Script executed:

fd DataTableCompactHandler.java

Repository: yihua/hudi

Length of output: 166


🏁 Script executed:

git ls-files | grep -i datatablecompacthandler

Repository: yihua/hudi

Length of output: 283


🏁 Script executed:

cat -n hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/handler/DataTableCompactHandler.java

Repository: yihua/hudi

Length of output: 10329


🏁 Script executed:

sed -n '100,160p' hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/handler/DataTableCompactHandler.java

Repository: yihua/hudi

Length of output: 2790


🏁 Script executed:

find . -type f -name "NonThrownExecutor.java"

Repository: yihua/hudi

Length of output: 152


🏁 Script executed:

cat -n ./hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/utils/NonThrownExecutor.java

Repository: yihua/hudi

Length of output: 7888


Fix the threading violation: collector.collect() must not be called from the executor thread.

The executor.execute() calls at lines 107-110 submit work to a background thread (NonThrownExecutor uses a single-thread executor pool). Both the success path in doCompaction() (line 147) and the failure callback (line 110) invoke collector.collect() from this background thread. Flink's Collector is not thread-safe and must only be called from the operator's task thread. Calling it from background threads violates Flink's single-threaded execution model and risks data races with checkpointing and output emission.

Use Flink's Async I/O API (AsyncDataStream.unorderedWait/orderedWait) or emit results back to the task thread via MailboxExecutor instead.

Also applies to: 127-147

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/handler/DataTableCompactHandler.java`
around lines 107 - 110, The code submits work via executor.execute(...) which
calls doCompaction(...) and the failure callback that both invoke
collector.collect(...) from a background thread; change this so
collector.collect(...) is only invoked on the operator task thread: move
compaction work to the executor but send results back to the operator thread
(e.g., post a Runnable to the MailboxExecutor or use Flink Async I/O via
AsyncDataStream.unorderedWait/orderedWait) instead of calling
collector.collect(...) directly in doCompaction or the executor failure
callback; update references around executor.execute, doCompaction, and the
failure lambda that currently calls createFailedCommitEvent and
collector.collect to instead queue the result to MailboxExecutor (or return a
Future for AsyncDataStream) so the actual collector.collect(...) runs on the
task thread.
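
The hand-back-to-the-task-thread idea can be sketched without Flink: the worker computes off-thread but only enqueues its result, and a single designated thread is the only one that "emits". This is a plain-Java stand-in for Flink's mailbox model, not the actual MailboxExecutor API.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

public class EmitOnTaskThread {
  static String runOnce() throws Exception {
    // Stand-in for the mailbox: only the task thread drains this queue.
    BlockingQueue<String> mailbox = new LinkedBlockingQueue<>();
    ExecutorService worker = Executors.newSingleThreadExecutor();

    // Background "compaction": enqueue the commit event instead of calling collect().
    worker.submit(() -> mailbox.add("commit-event"));

    // Task thread: the only place collector.collect(...) would actually run.
    String event = mailbox.poll(5, TimeUnit.SECONDS);
    worker.shutdown();
    return event;
  }

  public static void main(String[] args) throws Exception {
    System.out.println(runOnce());
  }
}
```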

Comment on lines +127 to +147
protected void doCompaction(CompactionPlanEvent event,
Collector<CompactionCommitEvent> collector,
boolean needReloadMetaClient) throws Exception {
compactionMetrics.startCompaction();
HoodieFlinkMergeOnReadTableCompactor<?> compactor = new HoodieFlinkMergeOnReadTableCompactor<>();
HoodieTableMetaClient metaClient = table.getMetaClient();
if (needReloadMetaClient) {
// reload the timeline
metaClient.reload();
}
// schema evolution
CompactionUtil.setAvroSchema(writeClient.getConfig(), metaClient);
List<WriteStatus> writeStatuses = compactor.compact(
writeClient.getConfig(),
event.getOperation(),
event.getCompactionInstantTime(),
table.getTaskContextSupplier(),
createReaderContext(needReloadMetaClient),
table);
compactionMetrics.endCompaction();
collector.collect(createCommitEvent(event, writeStatuses));

⚠️ Potential issue | 🟠 Major

Always finish the compaction metric in a finally block.

startCompaction() runs for every attempt, but endCompaction() only runs after the happy path completes. Any exception from schema setup, the compactor, or even collector.collect(...) leaves the metric stuck in-progress and will skew the new compaction metrics after the first failure.

Proposed fix
   protected void doCompaction(CompactionPlanEvent event,
                               Collector<CompactionCommitEvent> collector,
                               boolean needReloadMetaClient) throws Exception {
     compactionMetrics.startCompaction();
-    HoodieFlinkMergeOnReadTableCompactor<?> compactor = new HoodieFlinkMergeOnReadTableCompactor<>();
-    HoodieTableMetaClient metaClient = table.getMetaClient();
-    if (needReloadMetaClient) {
-      // reload the timeline
-      metaClient.reload();
-    }
-    // schema evolution
-    CompactionUtil.setAvroSchema(writeClient.getConfig(), metaClient);
-    List<WriteStatus> writeStatuses = compactor.compact(
-        writeClient.getConfig(),
-        event.getOperation(),
-        event.getCompactionInstantTime(),
-        table.getTaskContextSupplier(),
-        createReaderContext(needReloadMetaClient),
-        table);
-    compactionMetrics.endCompaction();
-    collector.collect(createCommitEvent(event, writeStatuses));
+    try {
+      HoodieFlinkMergeOnReadTableCompactor<?> compactor = new HoodieFlinkMergeOnReadTableCompactor<>();
+      HoodieTableMetaClient metaClient = table.getMetaClient();
+      if (needReloadMetaClient) {
+        metaClient.reload();
+      }
+      CompactionUtil.setAvroSchema(writeClient.getConfig(), metaClient);
+      List<WriteStatus> writeStatuses = compactor.compact(
+          writeClient.getConfig(),
+          event.getOperation(),
+          event.getCompactionInstantTime(),
+          table.getTaskContextSupplier(),
+          createReaderContext(needReloadMetaClient),
+          table);
+      collector.collect(createCommitEvent(event, writeStatuses));
+    } finally {
+      compactionMetrics.endCompaction();
+    }
   }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/handler/DataTableCompactHandler.java`
around lines 127 - 147, The compaction metric is started in doCompaction
(compactionMetrics.startCompaction()) but endCompaction() is only called on the
happy path; wrap the main body of doCompaction (including
CompactionUtil.setAvroSchema(...), compactor.compact(...), and
collector.collect(createCommitEvent(...))) in a try block and invoke
compactionMetrics.endCompaction() in a finally block so the metric is always
finished even if exceptions are thrown; rethrow the exception after finally to
preserve behavior.
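
The start/finally pairing generalizes to any paired metric. A minimal sketch (the counters here are illustrative stand-ins for the compaction metrics object):

```java
public class MetricFinally {
  static long inProgress = 0;
  static long completed = 0;

  static void runWithMetric(Runnable work) {
    inProgress++;           // startCompaction()
    try {
      work.run();
    } finally {
      inProgress--;         // endCompaction() always runs, even when work throws
      completed++;
    }
  }

  public static void main(String[] args) {
    try {
      runWithMetric(() -> { throw new RuntimeException("compaction failed"); });
    } catch (RuntimeException ignored) {
      // the exception still propagates; only the metric bookkeeping is guaranteed
    }
    System.out.println(inProgress + " " + completed);
  }
}
```

Without the finally block, a single failed attempt would leave `inProgress` stuck at 1 forever.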

Comment on lines +53 to +61
public void clean() {
executor.execute(() -> {
this.isCleaning = true;
try {
this.writeClient.clean();
} finally {
this.isCleaning = false;
}
}, "wait for cleaning finish");

⚠️ Potential issue | 🟠 Major

Reserve the cleaning slot before dispatching work.

clean() only flips isCleaning after the executor thread starts running, so the first checkpoint can still enter startAsyncCleaning() and kick off a second clean while the initial one is merely queued. That makes overlapping clean operations possible against the same HoodieFlinkWriteClient.

Suggested fix
+import java.util.concurrent.atomic.AtomicBoolean;
+
 public class DefaultCleanHandler implements CleanHandler {
   private final HoodieFlinkWriteClient writeClient;
-  private volatile boolean isCleaning;
+  private final AtomicBoolean isCleaning = new AtomicBoolean(false);
   private final NonThrownExecutor executor;
@@
   public void clean() {
+    if (!isCleaning.compareAndSet(false, true)) {
+      return;
+    }
     executor.execute(() -> {
-      this.isCleaning = true;
       try {
         this.writeClient.clean();
       } finally {
-        this.isCleaning = false;
+        isCleaning.set(false);
       }
-    }, "wait for cleaning finish");
+    }, "clean");
   }
@@
   public void startAsyncCleaning() {
-    if (!isCleaning) {
+    if (isCleaning.compareAndSet(false, true)) {
       try {
         this.writeClient.startAsyncCleaning();
-        this.isCleaning = true;
       } catch (Throwable throwable) {
+        isCleaning.set(false);
         // catch the exception to not affect the normal checkpointing
         log.warn("Failed to start async cleaning", throwable);
       }
     }
   }

Also applies to: 90-99
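
The compareAndSet reservation the suggested fix relies on can be sketched outside Hudi with a plain executor. The class and method names below are illustrative stand-ins, not Hudi code; the point is only that the flag is claimed before the task is queued.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicBoolean;

// Stand-in for the handler: the slot is reserved with compareAndSet BEFORE the
// task is dispatched, so a second clean() cannot slip in while the first is
// still waiting for an executor thread.
public class CleanSlotDemo {
  private final AtomicBoolean isCleaning = new AtomicBoolean(false);
  private final ExecutorService executor = Executors.newSingleThreadExecutor();

  /** Returns true if this call won the slot and queued a clean. */
  public boolean clean(CountDownLatch release) {
    if (!isCleaning.compareAndSet(false, true)) {
      return false; // a clean is already queued or running
    }
    try {
      executor.execute(() -> {
        try {
          release.await(); // simulate the actual cleaning work
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        } finally {
          isCleaning.set(false);
        }
      });
    } catch (RuntimeException e) {
      isCleaning.set(false); // roll back the reservation if dispatch fails
      throw e;
    }
    return true;
  }

  public void shutdown() {
    executor.shutdown();
  }
}
```

Because the flag flips before dispatch, the rejection of a concurrent caller is deterministic even if the worker thread has not started yet.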

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/handler/DefaultCleanHandler.java`
around lines 53 - 61, Reserve the cleaning slot by setting isCleaning = true
before calling executor.execute so the slot is held while the task is queued;
specifically, in DefaultCleanHandler.clean() (and the analogous block around
lines 90-99/startAsyncCleaning()) move the isCleaning assignment to just before
executor.execute and guard against executor rejection by catching exceptions
from executor.execute and resetting isCleaning = false if dispatch fails,
ensuring the existing finally in the worker thread still clears isCleaning after
HoodieFlinkWriteClient.clean() completes.

Comment on lines +105 to +115
  @Override
  public void close() {
    if (executor != null) {
      try {
        executor.close();
      } catch (Exception e) {
        throw new HoodieException("Failed to close executor of clean handler.", e);
      }
    }
    this.writeClient.clean();
  }

⚠️ Potential issue | 🟠 Major

Wait for in-flight async cleaning before the shutdown clean.

startAsyncCleaning() bypasses executor, so executor.close() does not wait for that work. Line 114 can therefore race a new writeClient.clean() against an already running async clean. Either block on writeClient.waitForCleaningFinish() when cleaning is in progress, or skip the extra shutdown clean entirely.
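
The ordering close() should enforce can be modeled in isolation: any async clean started outside the executor is awaited before the final shutdown clean. All names here are hypothetical, and the join() stands in for the `writeClient.waitForCleaningFinish()` call the review mentions.

```java
import java.util.ArrayList;
import java.util.List;

// Minimal model: close() must not race the shutdown clean against an async
// clean that bypassed the executor.
public class ShutdownOrderDemo {
  private final List<String> events = new ArrayList<>();
  private Thread asyncCleaner;

  public synchronized void record(String event) {
    events.add(event);
  }

  public void startAsyncCleaning() {
    // Runs outside any executor, mirroring the bypass described in the review.
    asyncCleaner = new Thread(() -> record("async-clean-finished"));
    asyncCleaner.start();
  }

  public void close() {
    // Await the in-flight async clean (the analogue of waitForCleaningFinish())
    // before issuing the final shutdown clean.
    if (asyncCleaner != null) {
      try {
        asyncCleaner.join();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    }
    record("shutdown-clean");
  }

  public List<String> events() {
    return events;
  }
}
```

With the join in place, the shutdown clean is guaranteed to observe the async clean as finished rather than running concurrently with it.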

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/handler/DefaultCleanHandler.java`
around lines 105 - 115, The close() method currently closes the executor then
immediately calls writeClient.clean(), which can race with async work started by
startAsyncCleaning() because that async path bypasses the executor; modify
close() to detect in-flight async cleaning and either await its completion via
writeClient.waitForCleaningFinish() before invoking writeClient.clean() or skip
the shutdown clean entirely when async cleaning is active. Specifically, in
DefaultCleanHandler.close() after executor.close() (and before
writeClient.clean()), call the writeClient API that indicates or waits for
ongoing cleaning (e.g., writeClient.waitForCleaningFinish()) and only call
writeClient.clean() if no async cleaning is in progress, or return/skip the
extra clean when waitForCleaningFinish() completes indicating the async cleanup
already handled it.

Comment on lines +36 to +79
  public static CompactionPlanHandler createCompactionPlanHandler(Configuration conf, RuntimeContext runtimeContext) {
    HoodieFlinkWriteClient writeClient = FlinkWriteClients.createWriteClient(conf, runtimeContext);
    boolean needsDataCompaction = OptionsResolver.needsAsyncCompaction(conf);
    boolean needsMetadataCompaction = OptionsResolver.needsAsyncMetadataCompaction(conf);
    if (needsDataCompaction && needsMetadataCompaction) {
      return new CompositeCompactionPlanHandler(
          new DataTableCompactionPlanHandler(writeClient),
          new MetadataTableCompactionPlanHandler(StreamerUtil.createMetadataWriteClient(writeClient)));
    }
    if (needsMetadataCompaction) {
      return new MetadataTableCompactionPlanHandler(StreamerUtil.createMetadataWriteClient(writeClient));
    }
    return new DataTableCompactionPlanHandler(writeClient);
  }

  public static CompactHandler createCompactHandler(Configuration conf, RuntimeContext runtimeContext, int taskId) {
    HoodieFlinkWriteClient writeClient = FlinkWriteClients.createWriteClient(conf, runtimeContext);
    boolean needsDataCompaction = OptionsResolver.needsAsyncCompaction(conf);
    boolean needsMetadataCompaction = OptionsResolver.needsAsyncMetadataCompaction(conf);
    if (needsDataCompaction && needsMetadataCompaction) {
      return new CompositeCompactHandler(
          new DataTableCompactHandler(writeClient, taskId),
          new MetadataTableCompactHandler(StreamerUtil.createMetadataWriteClient(writeClient), taskId));
    }
    if (needsMetadataCompaction) {
      return new MetadataTableCompactHandler(StreamerUtil.createMetadataWriteClient(writeClient), taskId);
    }
    return new DataTableCompactHandler(writeClient, taskId);
  }

  public static CompactionCommitHandler createCompactionCommitHandler(Configuration conf, RuntimeContext runtimeContext) {
    HoodieFlinkWriteClient writeClient = FlinkWriteClients.createWriteClient(conf, runtimeContext);
    boolean needsDataCompaction = OptionsResolver.needsAsyncCompaction(conf);
    boolean needsMetadataCompaction = OptionsResolver.needsAsyncMetadataCompaction(conf);
    if (needsDataCompaction && needsMetadataCompaction) {
      return new CompositeCompactionCommitHandler(
          new DataTableCompactionCommitHandler(conf, writeClient),
          new MetadataTableCompactionCommitHandler(conf, StreamerUtil.createMetadataWriteClient(writeClient)));
    }
    if (needsMetadataCompaction) {
      return new MetadataTableCompactionCommitHandler(conf, StreamerUtil.createMetadataWriteClient(writeClient));
    }
    return new DataTableCompactionCommitHandler(conf, writeClient);
  }

⚠️ Potential issue | 🟠 Major

Do not default to data-table handlers when both compaction modes are off.

All three factory methods fall through to a DataTable*Handler when needsAsyncCompaction(conf) and needsAsyncMetadataCompaction(conf) are both false. After this refactor the callers use the returned handler unconditionally, so the disabled case stops being a no-op and starts doing rollback/planning/commit work on the data table instead.

Suggested fix
+  private static void validateCompactionEnabled(boolean needsDataCompaction, boolean needsMetadataCompaction) {
+    if (!needsDataCompaction && !needsMetadataCompaction) {
+      throw new IllegalStateException("No compaction mode is enabled");
+    }
+  }
+
   public static CompactionPlanHandler createCompactionPlanHandler(Configuration conf, RuntimeContext runtimeContext) {
     HoodieFlinkWriteClient writeClient = FlinkWriteClients.createWriteClient(conf, runtimeContext);
     boolean needsDataCompaction = OptionsResolver.needsAsyncCompaction(conf);
     boolean needsMetadataCompaction = OptionsResolver.needsAsyncMetadataCompaction(conf);
+    validateCompactionEnabled(needsDataCompaction, needsMetadataCompaction);
     if (needsDataCompaction && needsMetadataCompaction) {
       return new CompositeCompactionPlanHandler(
           new DataTableCompactionPlanHandler(writeClient),
@@
   public static CompactHandler createCompactHandler(Configuration conf, RuntimeContext runtimeContext, int taskId) {
     HoodieFlinkWriteClient writeClient = FlinkWriteClients.createWriteClient(conf, runtimeContext);
     boolean needsDataCompaction = OptionsResolver.needsAsyncCompaction(conf);
     boolean needsMetadataCompaction = OptionsResolver.needsAsyncMetadataCompaction(conf);
+    validateCompactionEnabled(needsDataCompaction, needsMetadataCompaction);
     if (needsDataCompaction && needsMetadataCompaction) {
       return new CompositeCompactHandler(
           new DataTableCompactHandler(writeClient, taskId),
@@
   public static CompactionCommitHandler createCompactionCommitHandler(Configuration conf, RuntimeContext runtimeContext) {
     HoodieFlinkWriteClient writeClient = FlinkWriteClients.createWriteClient(conf, runtimeContext);
     boolean needsDataCompaction = OptionsResolver.needsAsyncCompaction(conf);
     boolean needsMetadataCompaction = OptionsResolver.needsAsyncMetadataCompaction(conf);
+    validateCompactionEnabled(needsDataCompaction, needsMetadataCompaction);
     if (needsDataCompaction && needsMetadataCompaction) {
       return new CompositeCompactionCommitHandler(
           new DataTableCompactionCommitHandler(conf, writeClient),
@@
-    return new DataTableCompactionCommitHandler(conf, writeClient);
+    return new DataTableCompactionCommitHandler(conf, writeClient);
   }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/compact/handler/TableServiceHandlerFactory.java`
around lines 36 - 79, The factory methods createCompactionPlanHandler,
createCompactHandler, and createCompactionCommitHandler currently fall back to
DataTable*Handler when both OptionsResolver.needsAsyncCompaction(conf) and
OptionsResolver.needsAsyncMetadataCompaction(conf) are false; change each method
to detect the "both disabled" case first and return a no-op handler instead of a
DataTable handler (e.g. return a new NoOpCompactionPlanHandler(),
NoOpCompactHandler(), NoOpCompactionCommitHandler() respectively) so callers
that use the handler unconditionally won't perform unintended work; add minimal
NoOp* handler implementations if they don't exist and wire them into the factory
before the existing composite/metadata/data branching logic.
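
A minimal shape for the no-op fallback this prompt suggests can be sketched as follows. All names are hypothetical, not from the Hudi codebase; the point is that a disabled configuration yields a handler that genuinely does nothing instead of defaulting to the data-table path.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of a factory that returns a no-op handler when neither compaction
// mode is enabled, so callers can invoke the handler unconditionally.
public class NoOpHandlerDemo {
  interface CompactHandler {
    void compact(List<String> log);
  }

  static class DataTableCompactHandler implements CompactHandler {
    public void compact(List<String> log) {
      log.add("data-table-compaction");
    }
  }

  static class NoOpCompactHandler implements CompactHandler {
    public void compact(List<String> log) {
      // Intentionally empty: compaction is disabled.
    }
  }

  static CompactHandler create(boolean needsDataCompaction, boolean needsMetadataCompaction) {
    if (!needsDataCompaction && !needsMetadataCompaction) {
      return new NoOpCompactHandler();
    }
    // The real factory would branch further into composite/metadata/data handlers.
    return new DataTableCompactHandler();
  }
}
```

Compared with throwing (the other fix proposed above), a no-op keeps the pipeline wiring unconditional while making the disabled case side-effect free.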

Comment on lines 157 to 160
  after {
    unsetEnv(DFSPropertiesConfiguration.CONF_FILE_DIR_ENV_NAME)
    DFSPropertiesConfiguration.clearGlobalProps()
  }

⚠️ Potential issue | 🟡 Minor

Make teardown cleanup failure-safe with try/finally.

If unsetEnv fails on Line 158, DFSPropertiesConfiguration.clearGlobalProps() on Line 159 is skipped, which can leak state into later tests.

Proposed fix
   after {
-    unsetEnv(DFSPropertiesConfiguration.CONF_FILE_DIR_ENV_NAME)
-    DFSPropertiesConfiguration.clearGlobalProps()
+    try {
+      unsetEnv(DFSPropertiesConfiguration.CONF_FILE_DIR_ENV_NAME)
+    } finally {
+      DFSPropertiesConfiguration.clearGlobalProps()
+    }
   }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/hudi/common/TestSqlConf.scala`
around lines 157 - 160, The teardown currently calls unsetEnv(...) then
DFSPropertiesConfiguration.clearGlobalProps(), but if unsetEnv throws the
clearGlobalProps() call can be skipped; wrap the calls in a try/finally inside
the after block so unsetEnv is attempted and
DFSPropertiesConfiguration.clearGlobalProps() is always executed in the finally
clause (referencing unsetEnv and DFSPropertiesConfiguration.clearGlobalProps to
locate the lines).

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (3)
hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/sink/partitioner/index/TestRocksDBIndexBackend.java (1)

69-76: Consider extracting shared gauge-capture setup helper.

The same Mockito gauge-capture block appears three times; a small helper would reduce duplication and make these tests easier to maintain.

Also applies to: 107-113, 164-170

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/sink/partitioner/index/TestRocksDBIndexBackend.java`
around lines 69 - 76, The test repeats the Mockito gauge-capture block (using
MetricGroup, gauges map, and doAnswer(...).when(metricGroup).gauge(...)) in
TestRocksDBIndexBackend; extract that into a small reusable helper method (e.g.,
a private static method like captureGauges(MetricGroup metricGroup,
Map<String,Gauge<?>> gauges) or a helper that returns the populated map) and
replace the three inline blocks (around the gauge setup at lines shown) with
calls to this helper to remove duplication and centralize the doAnswer(...)
configuration for metricGroup.gauge.
hudi-common/src/main/java/org/apache/hudi/common/table/log/BaseHoodieLogRecordReader.java (1)

356-357: Move these instant-list dumps back to DEBUG.

scanInternal() sits on the hot path for MOR reads/compaction. Emitting full instant sets at INFO on every scan will generate a lot of executor log volume, especially for long-lived file groups.

Proposed change
-      LOG.info("Ordered instant times seen {}", orderedInstantsList);
-      LOG.info("Targeted instants that are rolled back are {}", targetRollbackInstants);
+      LOG.debug("Ordered instant times seen {}", orderedInstantsList);
+      LOG.debug("Targeted instants that are rolled back are {}", targetRollbackInstants);

-      LOG.info("Total valid instants found are {}. Instants are {}", validBlockInstants.size(), validBlockInstants);
+      LOG.debug("Total valid instants found are {}. Instants are {}", validBlockInstants.size(), validBlockInstants);
       if (ignoredBlockCount > 0) {
-        LOG.info("Ignored {} log blocks from {} instants not in the range: {}", ignoredBlockCount, ignoredInstants.size(), ignoredInstants);
+        LOG.debug("Ignored {} log blocks from {} instants not in the range: {}", ignoredBlockCount, ignoredInstants.size(), ignoredInstants);
       }

Also applies to: 414-417

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@hudi-common/src/main/java/org/apache/hudi/common/table/log/BaseHoodieLogRecordReader.java`
around lines 356 - 357, The INFO-level logging of the full instant lists in
BaseHoodieLogRecordReader should be lowered to DEBUG to avoid hot-path log
noise: inside scanInternal() replace the LOG.info calls that print
orderedInstantsList and targetRollbackInstants (and the similar INFO dumps
around the later block that logs those same collections) with LOG.debug so the
detailed instant sets are emitted only at debug level; update both occurrences
referencing orderedInstantsList and targetRollbackInstants in
BaseHoodieLogRecordReader.scanInternal().
hudi-common/src/main/java/org/apache/hudi/common/table/read/HoodieReadStats.java (1)

58-61: Keep totalLogBlocks and totalValidLogBlocks semantically separate.

totalLogBlocks is still fed from the raw scanned-block counter in BaseHoodieLogRecordReader, which is incremented before corrupt/inflight/range filtering. Describing it as "valid" here makes it overlap with totalValidLogBlocks and muddies what each metric means.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@hudi-common/src/main/java/org/apache/hudi/common/table/read/HoodieReadStats.java`
around lines 58 - 61, The field comments for totalLogBlocks and
totalValidLogBlocks are conflating semantics; update the comment for
totalLogBlocks in HoodieReadStats to reflect that it counts raw scanned log
blocks (incremented in BaseHoodieLogRecordReader before corrupt/inflight/range
filtering) and is NOT "valid"—leave totalValidLogBlocks described as
post-filtering valid blocks. Ensure the comment mentions
BaseHoodieLogRecordReader and the distinction between pre-filter raw scanned
blocks (totalLogBlocks) and post-filter valid blocks (totalValidLogBlocks) so
future readers don't confuse the two metrics.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@hudi-common/src/main/java/org/apache/hudi/common/table/read/buffer/LogScanningRecordBufferLoader.java`:
- Around line 62-65: The current setTotalLogSizeCompacted call in
LogScanningRecordBufferLoader uses getBlocksStats() and sums
BLOCK_SIZE_IN_BYTES, which undercounts delete-only blocks because
BaseHoodieLogRecordReader.processDataBlock does not populate BLOCK_SIZE_IN_BYTES
for delete blocks; change the computation to derive totalLogSizeCompacted from
the aggregated valid compacted log bytes/files produced by the reader (e.g., use
the reader's compacted file/bytes metadata or file-level stats that include
delete-only blocks) rather than re-summing data-block BLOCK_SIZE_IN_BYTES;
update the logic around readStats.setTotalLogSizeCompacted to sum the compacted
log file byte sizes (from the reader's file-level metrics or valid compacted
file list) so delete-heavy and mixed slices are counted correctly.

In
`@hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/sink/partitioner/index/TestRocksDBIndexBackend.java`:
- Around line 129-141: The test currently writes an excessively large payload in
TestRocksDBIndexBackend by looping 32K times with 4K-sized strings; instead,
limit the writes and stop early when SST metrics become positive: in the loop
that calls rocksDBIndexBackend.update(...) reduce the repeated payload size
(e.g., 256–512 chars) and cap iterations (e.g., 1K or less), and after each
update check FlinkRocksDBIndexMetrics.ROCKSDB_DISK_TOTAL_SST_FILES_SIZE and
ROCKSDB_DISK_LIVE_SST_FILES_SIZE gauges and break once either metric > 0; this
keeps the test bounded and still verifies SST flush behavior without huge
memory/CI cost.


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 7e02edcf-0aae-4ba3-915e-64fda4516988

📥 Commits

Reviewing files that changed from the base of the PR and between 12bfc6e and 23f73ff.

📒 Files selected for processing (13)
  • hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/FileGroupReaderBasedAppendHandle.java
  • hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/FileGroupReaderBasedMergeHandle.java
  • hudi-common/src/main/java/org/apache/hudi/common/table/log/BaseHoodieLogRecordReader.java
  • hudi-common/src/main/java/org/apache/hudi/common/table/read/HoodieReadStats.java
  • hudi-common/src/main/java/org/apache/hudi/common/table/read/buffer/LogScanningRecordBufferLoader.java
  • hudi-common/src/main/java/org/apache/hudi/common/util/collection/RocksDBDAO.java
  • hudi-common/src/test/java/org/apache/hudi/common/table/read/TestHoodieReadStats.java
  • hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/metrics/FlinkRocksDBIndexMetrics.java
  • hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/partitioner/BucketAssignFunction.java
  • hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/partitioner/index/IndexBackend.java
  • hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/partitioner/index/RocksDBIndexBackend.java
  • hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/sink/partitioner/index/TestRocksDBIndexBackend.java
  • hudi-spark-datasource/hudi-spark/src/test/java/org/apache/hudi/table/TestHoodieMergeOnReadTable.java
🚧 Files skipped from review as they are similar to previous changes (3)
  • hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/partitioner/BucketAssignFunction.java
  • hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/sink/partitioner/index/IndexBackend.java
  • hudi-common/src/main/java/org/apache/hudi/common/util/collection/RocksDBDAO.java

Comment on lines +62 to +65
readStats.setTotalLogSizeCompacted(logRecordReader.getBlocksStats().stream()
.filter(blockScanMetrics -> blockScanMetrics.containsKey(BLOCK_SIZE_IN_BYTES))
.mapToLong(blockScanMetrics -> ((Number) blockScanMetrics.get(BLOCK_SIZE_IN_BYTES)).longValue())
.sum());

⚠️ Potential issue | 🟠 Major

totalLogSizeCompacted now under-counts delete-heavy slices.

blocksStats is only populated from BaseHoodieLogRecordReader.processDataBlock(...); valid delete blocks never contribute BLOCK_SIZE_IN_BYTES. A delete-only slice therefore records 0 here, and mixed slices under-report bytes before this value is written into HoodieWriteStat. Please derive this from all valid compacted log bytes/files instead of re-summing data-block stats.
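
The under-count mechanism is easy to reproduce in isolation: summing a per-block stats key silently skips any block that never recorded it. The helper and the stats key below are simplified stand-ins for the Hudi types, with made-up sizes.

```java
import java.util.List;
import java.util.Map;

// Demonstrates how filter(containsKey) drops delete-only blocks, which never
// carry a size entry, so the sum under-reports the compacted bytes.
public class BlockSizeSumDemo {
  static final String BLOCK_SIZE_IN_BYTES = "blockSizeInBytes";

  static long sumDataBlockStats(List<Map<String, Object>> blocksStats) {
    return blocksStats.stream()
        .filter(stats -> stats.containsKey(BLOCK_SIZE_IN_BYTES))
        .mapToLong(stats -> ((Number) stats.get(BLOCK_SIZE_IN_BYTES)).longValue())
        .sum();
  }
}
```

A delete-only slice produces an empty stats map per block, so this sum is zero even though real log bytes were compacted, which is why the review asks for file-level sizes instead.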

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@hudi-common/src/main/java/org/apache/hudi/common/table/read/buffer/LogScanningRecordBufferLoader.java`
around lines 62 - 65, The current setTotalLogSizeCompacted call in
LogScanningRecordBufferLoader uses getBlocksStats() and sums
BLOCK_SIZE_IN_BYTES, which undercounts delete-only blocks because
BaseHoodieLogRecordReader.processDataBlock does not populate BLOCK_SIZE_IN_BYTES
for delete blocks; change the computation to derive totalLogSizeCompacted from
the aggregated valid compacted log bytes/files produced by the reader (e.g., use
the reader's compacted file/bytes metadata or file-level stats that include
delete-only blocks) rather than re-summing data-block BLOCK_SIZE_IN_BYTES;
update the logic around readStats.setTotalLogSizeCompacted to sum the compacted
log file byte sizes (from the reader's file-level metrics or valid compacted
file list) so delete-heavy and mixed slices are counted correctly.

Comment on lines +129 to +141
      // Keep writing larger records to trigger auto-flushes memtable into SST.
      int nextKey = 32;
      for (int i = 0; i < 32768; i++) {
        String par = "par-" + nextKey + "-" + "x".repeat(4096);
        rocksDBIndexBackend.update("id" + nextKey,
            new HoodieRecordGlobalLocation(par, "00" + nextKey, UUID.randomUUID().toString()));
        nextKey++;
      }
      long totalSstSize =
          ((Number) gauges.get(FlinkRocksDBIndexMetrics.ROCKSDB_DISK_TOTAL_SST_FILES_SIZE).getValue()).longValue();
      long liveSstSize =
          ((Number) gauges.get(FlinkRocksDBIndexMetrics.ROCKSDB_DISK_LIVE_SST_FILES_SIZE).getValue()).longValue();
      assertTrue(totalSstSize > 0 || liveSstSize > 0);

⚠️ Potential issue | 🟠 Major

Test payload volume is too large for a unit test.

Lines 129-141 allocate/process a very large dataset (~100MB+ string content plus object overhead), which can make CI slow/flaky. Prefer bounded writes with early exit once SST metrics become positive.

💡 Proposed bounded-write approach
-      // Keep writing larger records to trigger auto-flushes memtable into SST.
-      int nextKey = 32;
-      for (int i = 0; i < 32768; i++) {
-        String par = "par-" + nextKey + "-" + "x".repeat(4096);
-        rocksDBIndexBackend.update("id" + nextKey,
-            new HoodieRecordGlobalLocation(par, "00" + nextKey, UUID.randomUUID().toString()));
-        nextKey++;
-      }
-      long totalSstSize =
-          ((Number) gauges.get(FlinkRocksDBIndexMetrics.ROCKSDB_DISK_TOTAL_SST_FILES_SIZE).getValue()).longValue();
-      long liveSstSize =
-          ((Number) gauges.get(FlinkRocksDBIndexMetrics.ROCKSDB_DISK_LIVE_SST_FILES_SIZE).getValue()).longValue();
-      assertTrue(totalSstSize > 0 || liveSstSize > 0);
+      // Bounded writes with early stop once SST metrics move.
+      int nextKey = 32;
+      boolean sstObserved = false;
+      for (int i = 0; i < 4096; i++) {
+        String par = "par-" + nextKey + "-" + "x".repeat(2048);
+        rocksDBIndexBackend.update("id" + nextKey,
+            new HoodieRecordGlobalLocation(par, "00" + nextKey, UUID.randomUUID().toString()));
+        nextKey++;
+
+        if (i % 128 == 0) {
+          long totalSstSize =
+              ((Number) gauges.get(FlinkRocksDBIndexMetrics.ROCKSDB_DISK_TOTAL_SST_FILES_SIZE).getValue()).longValue();
+          long liveSstSize =
+              ((Number) gauges.get(FlinkRocksDBIndexMetrics.ROCKSDB_DISK_LIVE_SST_FILES_SIZE).getValue()).longValue();
+          if (totalSstSize > 0 || liveSstSize > 0) {
+            sstObserved = true;
+            break;
+          }
+        }
+      }
+      assertTrue(sstObserved, "Expected SST metrics to become positive after bounded writes");
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@hudi-flink-datasource/hudi-flink/src/test/java/org/apache/hudi/sink/partitioner/index/TestRocksDBIndexBackend.java`
around lines 129 - 141: the test currently writes an excessively large payload
in TestRocksDBIndexBackend by looping 32K times with 4K-sized strings. Instead,
limit the writes and stop early when SST metrics become positive: in the loop
that calls `rocksDBIndexBackend.update(...)`, reduce the repeated payload size
(e.g., 256–512 chars) and cap iterations (e.g., 1K or fewer), and after each
update check the FlinkRocksDBIndexMetrics.ROCKSDB_DISK_TOTAL_SST_FILES_SIZE and
ROCKSDB_DISK_LIVE_SST_FILES_SIZE gauges, breaking once either metric is > 0.
This keeps the test bounded while still verifying SST flush behavior without
large memory/CI cost.
