Move-tables. Multi-table#1726
Conversation
…sh of ordered table names for consistent identifier
…r things like "GetTargetTableName" in move-tables mode
| // or a given table name | ||
| func (mctx *MigrationContext) GetGhostTableName() string { | ||
| if mctx.IsMoveTablesMode() { | ||
| panic("GetGhostTableName() must not be called in move-tables mode; there is no ghost table (the target keeps each migrated table's name)") |
There was a problem hiding this comment.
I chose to add panics to function calls that I felt were dangerous enough if they accidentally were called by a codepath from --move-tables. Happy to discuss alternatives, or ditching the panics altogether, but they gave me confidence while making the changes across this PR.
One thought - we could swap the raw panics for failpoint panics and only enable them for integration tests if we wanted to get fancier while removing intentional panics from the prod binary. But I don't know if that gains us that much.
| // It is used to name run-wide singular artifacts (checkpoint table, applier | ||
| // advisory lock, serve socket) so they never depend on any single migrated | ||
| // table name. Returns "" outside move-tables mode. | ||
| func (mctx *MigrationContext) MoveTablesRunToken() string { |
There was a problem hiding this comment.
This is the func that determines the unique identifier per run based on the table names. More details in the comment above but just wanted to flag it directly with a comment.
A non-start alternative could have been concating all the table names, but that would make for a really long identifier in most cases.
Another alternative could have been to require ForceTmpTableName in --move-tables mode, so that the operator chooses what the identifier for the table is. We'd potentially need to extend that a bit for other callsites like the server sock file, etc.
| @@ -809,12 +959,6 @@ func (apl *Applier) CreateCheckpointTable() error { | |||
| "`gh_ost_dml_applied` bigint", | |||
| "`gh_ost_is_cutover` tinyint(1) DEFAULT '0'", | |||
| } | |||
| if apl.migrationContext.IsMoveTablesMode() { | |||
| colDefs = append(colDefs, | |||
| "`gh_ost_move_tables_cutover_started` tinyint(1) DEFAULT '0'", | |||
| "`gh_ost_move_tables_drain_gtid` text charset ascii", | |||
| ) | |||
| } | |||
| for _, col := range apl.migrationContext.UniqueKey.Columns.Columns() { | |||
There was a problem hiding this comment.
we had originally made these conditional changes to CreateCheckpointTable in our feature branch for single move. These removals just get it back to baseline (main) since we have our own CreateCheckpointTable variant for move-tables now createMoveTablesCheckpointTable
There was a problem hiding this comment.
all the funky hex serialization stuff lives in this diff.
alternative could have been to create a checkpoint table per table, but there is shared state across the checkpoints that's valuable to keep in a single table. This way, on crash-resume when mid cutover, we can easily just query the last checkpoint row.
There was a problem hiding this comment.
no major changes here, just some guards against function calls we don't want called in move-tables mode (panic)
| var replicationLagQuery string | ||
| if !thlr.migrationContext.IsMoveTablesMode() { | ||
| replicationLagQuery = fmt.Sprintf(` | ||
| select value from %s.%s where hint = 'heartbeat' and id <= 255 | ||
| `, | ||
| sql.EscapeName(thlr.migrationContext.DatabaseName), | ||
| sql.EscapeName(thlr.migrationContext.GetChangelogTableName()), | ||
| ) | ||
| } |
There was a problem hiding this comment.
These lines just build a query that's only run in NOT move tables mode (below). However, building the query here calls GetChangelogTableName which is now guarded with the panic. So this just wraps the query building in the move tables check
There was a problem hiding this comment.
@chriskirkland and I talked about making some changes to the existing integration tests for the crash/resume so that they cover multi-table as well.
There was a problem hiding this comment.
Pull request overview
Extends --move-tables from single-table to true multi-table operation (one run handling N tables) by introducing per-table runtime containers, run-wide artifact naming via a stable run token, per-table interleaved row copy + DML routing, redesigned checkpointing, and an atomic multi-table cutover RENAME TABLE.
Changes:
- Introduces per-table
MoveTablecontainers and run-token-based naming for run-wide artifacts (checkpoint table, socket, lock identity). - Refactors migrator/applier/inspector to inspect/copy/apply/checkpoint multiple tables (interleaved copy, table-routed DML, multi-row checkpoint model, atomic multi-table cutover).
- Adds/updates localtest fixtures and the
script/move-tablesmanual harness to seed/reset/drive workload across 1–3 canonical tables, plus new atomic-cutover and resume validation scenarios.
Show a summary per file
| File | Description |
|---|---|
| script/move-tables/setup | Switches manual harness seeding to the canonical 3-table fixture. |
| script/move-tables/reset | Resets all canonical tables and drops run-token checkpoint tables on target. |
| script/move-tables/README.md | Updates manual workflow docs for 1/2/3-table runs, reset, and CI scenarios. |
| script/move-tables/insert-source-primary-loop | Drives DML across whichever canonical fixtures exist (1–3 tables). |
| localtests/move-tables/resume-panic-on-row-copy/test.sh | Finds run-token checkpoint table dynamically for resume validation. |
| localtests/move-tables/resume-panic-before-drain-complete/test.sh | Finds run-token checkpoint table dynamically for drain/cutover resume validation. |
| localtests/move-tables/multiple-two/tables.txt | Adds 2-table move-tables fixture table list. |
| localtests/move-tables/multiple-two/create.sql | Adds 2-table heterogeneous schema fixture. |
| localtests/move-tables/multiple-three/tables.txt | Adds 3-table move-tables fixture table list. |
| localtests/move-tables/multiple-three/create.sql | Adds canonical 3-table heterogeneous schema fixture. |
| localtests/move-tables/multiple-three-concurrent-writes/tables.txt | Adds 3-table concurrent-writes fixture table list. |
| localtests/move-tables/multiple-three-concurrent-writes/on_test.sh | Starts DML loop during 3-table copy to validate DML capture. |
| localtests/move-tables/multiple-three-concurrent-writes/create.sql | Adds canonical 3-table concurrent-writes SQL fixture. |
| localtests/move-tables/atomic-multi-table-cutover/test.sh | New deterministic atomic-cutover workload + invariant validation. |
| localtests/move-tables/atomic-multi-table-cutover/tables.txt | Adds 2-table list for atomic-cutover test fixture. |
| localtests/move-tables/atomic-multi-table-cutover/create.sql | Adds paired-schema fixture for atomic-cutover validation. |
| go/logic/throttler.go | Avoids changelog-heartbeat lag query in move-tables mode; uses replica status lag. |
| go/logic/server.go | Updates interactive command table-arg matching/messages for multi-table migrations. |
| go/logic/migrator.go | Implements multi-table inspection, interleaved copy loop, resume logic, atomic multi-table cutover rename, and cleanup. |
| go/logic/migrator_move_tables_cutover_test.go | Adapts cutover tests for multi-table token/table-list expectations and cleanup API rename. |
| go/logic/migrator_move_tables_cleanup_test.go | Updates cleanup tests for run-token checkpoint naming and multi-table drop method. |
| go/logic/inspect.go | Adds move-tables-safe per-table validation/inspection/counting and removes representative-table calls. |
| go/logic/hooks.go | Exposes multi-table context via hook env vars (comma-joined tables + per-table rollback handles). |
| go/logic/checkpoint.go | Adds per-table checkpoint row support and serialized heterogeneous key-range encoding helpers. |
| go/logic/checkpoint_test.go | Unit tests for range serialization/deserialization and empty-range detection. |
| go/logic/applier.go | Adds per-table query builders, move-tables checkpoint table schema + read/write, and per-table DML routing. |
| go/logic/applier_test.go | Updates/extends tests for move-tables checkpointing, copy, and DML routing via per-table containers. |
| go/cmd/gh-ost/main.go | Allows multi-table --move-tables, rejects empty/duplicate names, and uses run-token socket naming. |
| go/base/context.go | Adds MoveTable container model, deterministic ordered iteration, run-token generation, and move-tables-specific naming helpers. |
| go/base/context_test.go | Tests run-token determinism/order-independence and MoveTableDelName behavior. |
Copilot's findings
Tip
Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
- Files reviewed: 30/30 changed files
- Comments generated: 4
| now := time.Now().Unix() | ||
| for _, chk := range rows { | ||
| coordStr := "" | ||
| if chk.LastTrxCoords != nil { | ||
| coordStr = chk.LastTrxCoords.String() | ||
| } | ||
| args := sqlutils.Args( | ||
| now, | ||
| chk.TableName, | ||
| coordStr, | ||
| chk.Iteration, | ||
| chk.RowsCopied, | ||
| chk.DMLApplied, | ||
| chk.IsCutover, | ||
| chk.MoveTablesCutOverStarted, | ||
| apl.checkpointDrainGTIDString(chk), | ||
| serializeRangeValues(chk.IterationRangeMin), | ||
| serializeRangeValues(chk.IterationRangeMax), | ||
| ) | ||
| if _, err := apl.checkpointDB().Exec(query, args...); err != nil { | ||
| return err | ||
| } | ||
| } | ||
| return nil |
| if mgtr.sourcePrimaryDB == nil { | ||
| return errors.New("source primary connection not initialized; cannot drop source __del table") | ||
| } |
| func (isp *Inspector) selectUniqueKey(tableName string, candidateKeys []*sql.UniqueKey) *sql.UniqueKey { | ||
| for i, candidateKey := range candidateKeys { | ||
| isp.applyColumnTypes(isp.migrationContext.DatabaseName, isp.originalTableName(), &candidateKey.Columns) | ||
| isp.applyColumnTypes(isp.migrationContext.DatabaseName, tableName, &candidateKey.Columns) | ||
| uniqueKeyIsValid := true | ||
| for _, column := range candidateKey.Columns.Columns() { |
chriskirkland
left a comment
There was a problem hiding this comment.
Looks great! No major concerns (for anything introduced here). Just some questions/comments that are non-blocking
| // One advisory lock per run. In move-tables mode it is keyed on the | ||
| // set-derived run token (not any single table) so two processes moving the | ||
| // same set of tables collide, while a single-table run keeps its table-keyed |
There was a problem hiding this comment.
// One advisory lock per run. In move-tables mode it is keyed on the
// set-derived run token (not any single table) so two processes moving the
// same set of tables collide, while a single-table run keeps its table-keyed
On it's face, I think this implies that intersecting move-table sets could be migrated at the same time. I think the CREATEs for the overlapping tables on the target would break for all but the first run (assuming they use the same database/schema). Not sure if this is a realizable concern at present... is there anything else preventing that kind of conflict?
Entirely possible this is covered by one of the design docs.
There was a problem hiding this comment.
Great callout! I should have mentioned this as an implication to the choice here.
But yea at the very start it checks each table to ensure they aren't created on the target. I felt like that was enough to protect us.
Simple alt would be to create multiple locks if we want to be extra safe?
There was a problem hiding this comment.
Yeah, I think we don't have to overcomplicate. Mostly wanted to make sure I understand the protections here.
| mgtr.migrationContext.SharedColumns = mgtr.migrationContext.OriginalTableColumns | ||
| mgtr.migrationContext.MappedSharedColumns = mgtr.migrationContext.OriginalTableColumns | ||
| // Aggregate the row estimate across all tables for overall progress reporting. | ||
| atomic.StoreInt64(&mgtr.migrationContext.RowsEstimate, totalRowsEstimate) |
There was a problem hiding this comment.
I think there's an existing bug here. If a table has primary key of type UNSIGNED BIGINT, it would need uint64 to avoid overflow here. That's not super likely to happen with single table, but more likely with multi table.
I don't think we'll be in the business of moving > 2^64 rows in a single cohort, but we might want to file an issue or something 🤷🏼
cc @danieljoos
There was a problem hiding this comment.
I think we'll never reach more than 2^63 rows even across multiple tables.
Not even sure if that's possible at all in InnoDB.
There was a problem hiding this comment.
hmmm. I can change RowsEstimate to a uint64 if we are concerned enough. The usage of that field seems minimal so probably low risk / easy effort.
| // accessor that has no meaning in move-tables mode (every table is handled | ||
| // through its own MoveTable container), so calling it there is a programmer | ||
| // error and panics to fail fast rather than silently operate on the wrong table. | ||
| func (apl *Applier) originalTableName() string { |
There was a problem hiding this comment.
Same here, the originalTableName() func was introduced by our move-table changes. Maybe we can just get rid of it again.
There was a problem hiding this comment.
I don't see as much harm in leaving this one here. There's a lot of callsites to it from our original PR that introduces it for single-table move. And if i remove it, it means all those callsites switch to apl.migrationContext.OriginalTableName directly and we lose our guard against calling accidentally calling OriginalTableName in move-tables mode
| @@ -131,7 +137,7 @@ func (isp *Inspector) InspectOriginalTable() (err error) { | |||
|
|
|||
| func (isp *Inspector) originalTableName() string { | |||
There was a problem hiding this comment.
and another one here: the originalTableName() method has been introduced with move-tables.
Whole func could removed, imo.
| mgtr.migrationContext.SharedColumns = mgtr.migrationContext.OriginalTableColumns | ||
| mgtr.migrationContext.MappedSharedColumns = mgtr.migrationContext.OriginalTableColumns | ||
| // Aggregate the row estimate across all tables for overall progress reporting. | ||
| atomic.StoreInt64(&mgtr.migrationContext.RowsEstimate, totalRowsEstimate) |
There was a problem hiding this comment.
I think we'll never reach more than 2^63 rows even across multiple tables.
Not even sure if that's possible at all in InnoDB.
re: https://github.com/github/database-infrastructure/issues/8205
Multi-table
--move-tablesExtends
feature-move-tablesto move N tables in one run (--move-tables=a,b,c): one binlog stream, per-table interleaved row copy, a single per-run crash-safe checkpoint, and one atomic multi-table RENAME at cutover. A single table is just N=1 — no "representative" table, no index-0 special-casing.Key decisions: per-table state containers (
MoveTable) instead of a representative table; run-wide artifacts named from a stable hash of the table set (_gho_<token>_ghk); one checkpoint table with a hex-serialized per-table key range; GTID-based drain + atomicRENAME TABLE t1 TO …, t2 TO …. Single-table-only paths fail fast (panic/error) in move-tables mode.Read in this order:
go/base/context.go—MoveTable+InitMoveTableContainers,OrderedMoveTables,MoveTablesRunToken, checkpoint/_delnaming (start here)applier.go— per-table query builders +buildDMLEventQueryrouting; target-table create/validatemigrator.go: iterateChunksMoveTables— interleaved per-table copycheckpoint.go+ applierWrite/ReadMoveTableCheckpoints+ migrator resume — checkpoint redesignmigrator.go: moveTablesCutOver/drainMoveTablesCutOver/resume…FromCheckpoint— atomic cutoverlocaltests/move-tables/*,*_test.go) +script/move-tables/*Validated: unit (checkpoint serialize/run-token), integration (1/2/3-table idle + DML, resume-after-crash ×3, atomic-cutover), and manual e2e on 2 clusters / 3–4 tables with live writes.