Core: Rebase replace transaction onto refreshed metadata on concurrent commit by Kurtiscwright · Pull Request #16975 · apache/iceberg

Kurtiscwright · 2026-06-26T17:21:12Z

Fixes #16232
Rebased on PR #16975

Replacing a table replaces its schema, partitioning, and data, but it keeps the
table's history instead of deleting the table and making a new one. The old code
didn't do this when another commit landed first. While a replace was being
prepared, a second commit could update the table. The replace then reloaded the
latest table state to pass the catalog's conflict check, but still wrote out the
changes it had prepared at the start. The second commit's snapshots disappeared
from history, and if that commit had changed the schema, the replace could end
up pointing columns at IDs that no longer matched the data files.

Now, when the table has changed, the replace is re-applied on top of the latest
table state: the new schema, partitioning, sort order, location, and properties
are kept, but they sit on top of the other commit's snapshots, and the replace's
own new data is added after it. If the table no longer exists when the replace
commits (for example, it was dropped in the meantime), there is nothing to build
on, so the replace just creates the table fresh.

Add TableMetadata.buildReplacementPreservingIds. It keeps the field IDs the
replace schema already has, instead of matching columns up by name and handing
out new IDs. Matching by name can move a column to a different ID than its data
files were written with, which makes that data unreadable.

Two design notes:

The replace gets its own re-apply path rather than reusing
applyUpdates, because a replace's schema and partitioning live in the prepared
metadata, not in the list of pending updates, so replaying only those updates
would drop the replacement and leave the old schema in place.
It uses buildReplacementPreservingIds rather than buildReplacement
so a concurrent schema change can't shift the replace's field IDs away
from its data files.

Add tests for the rebuilt replace, kept history, stable field IDs, and the
dropped-table case. Two REST-only tests used to expect a replace with an
unchanged schema or partitioning to fail on a spurious ID check; update them to
expect it to succeed.

…t commit When a concurrent commit lands after a replace transaction is staged, commitReplaceTransaction advanced base to the refreshed metadata to satisfy the optimistic lock but committed the stale staged metadata. This dropped the concurrent commit's snapshots from history and reused its sequence number, violating sequence-number monotonicity. A replace is last-writer-wins on the current schema, spec, and data, but it is not a drop-and-recreate: the table must keep its history. Rebuild the replacement on the refreshed base and replay the staged updates so the concurrent commit's history is preserved and the replacement snapshot's sequence number is re-derived from the refreshed base. Add TableMetadata.buildReplacementPreservingIds, which keeps the staged schema's field IDs instead of re-deriving them by name, so a concurrent schema change cannot shift the IDs the staged data files were written against. Add tests covering the rebase, history preservation, and field-id stability. Update the two server-side conflict tests in CatalogTests, which previously asserted a spurious last-assigned-id requirement failure, to assert the replace succeeds as last-writer-wins.

Co-authored-by: Sreesh Maheshwar <maheshwarsreesh@gmail.com>

Kurtis Wright and others added 2 commits June 26, 2026 17:10

Merge branch 'apache:main' into replace-seq-reuse-repro

98d47e7

github-actions Bot added core hive labels Jun 26, 2026

Fix Hive CI Tests

f19a18c

Co-authored-by: Sreesh Maheshwar <maheshwarsreesh@gmail.com>

Kurtiscwright force-pushed the replace-seq-reuse-repro branch from 44eb85c to f19a18c Compare June 26, 2026 20:00

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Core: Rebase replace transaction onto refreshed metadata on concurrent commit#16975

Core: Rebase replace transaction onto refreshed metadata on concurrent commit#16975
Kurtiscwright wants to merge 3 commits into
apache:mainfrom
Kurtiscwright:replace-seq-reuse-repro

Kurtiscwright commented Jun 26, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

Kurtiscwright commented Jun 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Two design notes:

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Kurtiscwright commented Jun 26, 2026 •

edited

Loading