Skip to content

dm: add safe-mode upsert DML strategy#12519

Open
GMHDBJD wants to merge 1 commit intopingcap:masterfrom
GMHDBJD:fixSafemode
Open

dm: add safe-mode upsert DML strategy#12519
GMHDBJD wants to merge 1 commit intopingcap:masterfrom
GMHDBJD:fixSafemode

Conversation

@GMHDBJD
Copy link
Contributor

@GMHDBJD GMHDBJD commented Feb 5, 2026

Add syncer.safe-mode-dml-strategy (replace|upsert). In upsert mode, safe-mode INSERT uses INSERT ON DUPLICATE KEY UPDATE and UPDATE stays UPDATE, and we keep foreign_key_checks enabled to preserve FK cascade semantics.

Also add an integration test case for ON UPDATE CASCADE under safe mode.

What problem does this PR solve?

Issue Number: close #xxx

What is changed and how it works?

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No code

Questions

Will it cause performance regression or break compatibility?
Do you need to update user documentation, design documentation or monitoring documentation?

Release note

Please refer to [Release Notes Language Style Guide](https://pingcap.github.io/tidb-dev-guide/contribute-to-tidb/release-notes-style-guide.html) to write a quality release note.

If you don't think this PR needs a release note then fill it with `None`.

Add syncer.safe-mode-dml-strategy (replace|upsert). In upsert mode, safe-mode INSERT uses INSERT ON DUPLICATE KEY UPDATE and UPDATE stays UPDATE, and we keep foreign_key_checks enabled to preserve FK cascade semantics.

Also add an integration test case for ON UPDATE CASCADE under safe mode.
@ti-chi-bot ti-chi-bot bot added do-not-merge/needs-linked-issue release-note Denotes a PR that will be considered when it comes time to generate release notes. labels Feb 5, 2026
@ti-chi-bot
Copy link
Contributor

ti-chi-bot bot commented Feb 5, 2026

[FORMAT CHECKER NOTIFICATION]

Notice: To remove the do-not-merge/needs-linked-issue label, please provide the linked issue number on one line in the PR body, for example: Issue Number: close #123 or Issue Number: ref #456.

📖 For more info, you can check the "Contribute Code" section in the development guide.

@ti-chi-bot
Copy link
Contributor

ti-chi-bot bot commented Feb 5, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign gmhdbjd, yudongusa for approval. For more information see the Code Review Process.
Please ensure that each of them provides their approval before proceeding.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot bot added area/dm Issues or PRs related to DM. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Feb 5, 2026
@gemini-code-assist
Copy link

Summary of Changes

Hello @GMHDBJD, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request enhances the data migration (DM) tool by providing a more flexible and robust DML strategy for safe mode operations. The new 'upsert' strategy addresses limitations of the existing 'replace' strategy, particularly concerning foreign key constraints and cascade actions. By allowing INSERT ON DUPLICATE KEY UPDATE and maintaining foreign_key_checks, it ensures better data integrity and compatibility with complex schema designs during data synchronization.

Highlights

  • New Safe Mode DML Strategy: Introduced a new configuration option syncer.safe-mode-dml-strategy which allows users to choose between replace (default) and upsert.
  • Upsert Strategy Behavior: When upsert strategy is enabled, INSERT statements in safe mode are transformed into INSERT ON DUPLICATE KEY UPDATE, and UPDATE statements remain standard UPDATEs.
  • Foreign Key Checks Preservation: In upsert safe mode, foreign_key_checks are kept enabled in the downstream database to correctly preserve foreign key cascade semantics, which was not possible with the replace strategy.
  • Integration Test for FK Cascade: Added a new integration test case specifically to verify the ON UPDATE CASCADE behavior under the new upsert safe mode strategy.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Changelog
  • dm/config/subtask.go
    • Added validation logic for the new SafeModeDMLStrategy field, ensuring it's either 'replace' or 'upsert', or falls back to default.
  • dm/config/task.go
    • Introduced SafeModeDMLStrategy field to the SyncerConfig struct, along with documentation explaining its two possible values: replace and upsert.
    • Added validation for the SafeModeDMLStrategy within the adjust method of TaskConfig.
  • dm/syncer/dml.go
    • Modified genDMLsWithSameTable and genDMLsWithSameOp functions to accept the safeModeDMLStrategy parameter.
    • Updated DML generation logic within genDMLsWithSameOp to conditionally use DMLInsertOnDuplicateUpdate for INSERT operations when safeModeDMLStrategy is 'upsert'.
  • dm/syncer/dml_test.go
    • Updated TestGenDMLWithSameOp to pass the safeModeDMLStrategy argument.
    • Added a new test TestGenSQLWithSafeModeUpsertStrategy to specifically verify the SQL generation for INSERT ON DUPLICATE KEY UPDATE and standard UPDATE when the upsert strategy is active.
  • dm/syncer/dml_worker.go
    • Added safeModeDMLStrategy field to the DMLWorker struct to store the configured strategy.
    • Modified shouldDisableForeignKeyChecksForJob to prevent disabling foreign key checks when the upsert strategy is in use, ensuring FK cascade semantics are preserved.
    • Updated dmlWorkerWrap to pass the SafeModeDMLStrategy from the syncer configuration to the DML worker.
    • Adjusted the genSQLs method to generate DML statements based on the safeModeDMLStrategy, differentiating between REPLACE and INSERT ON DUPLICATE KEY UPDATE for inserts, and REPLACE or UPDATE for updates.
  • dm/tests/run_group.sh
    • Added the new integration test group safe_mode_foreign_key_update_cascade to the test suite.
  • dm/tests/safe_mode_foreign_key_update_cascade/conf/diff_config.toml
    • Added a new configuration file for the diff tool for the safe_mode_foreign_key_update_cascade test.
  • dm/tests/safe_mode_foreign_key_update_cascade/conf/dm-master.toml
    • Added a new DM master configuration file for the safe_mode_foreign_key_update_cascade test.
  • dm/tests/safe_mode_foreign_key_update_cascade/conf/dm-task.yaml
    • Added a new DM task configuration file, explicitly setting safe-mode-dml-strategy: "upsert" and foreign_key_checks: 1 for the safe_mode_foreign_key_update_cascade test.
  • dm/tests/safe_mode_foreign_key_update_cascade/conf/dm-worker1.toml
    • Added a new DM worker configuration file for the safe_mode_foreign_key_update_cascade test.
  • dm/tests/safe_mode_foreign_key_update_cascade/conf/source1.yaml
    • Added a new source configuration file for the safe_mode_foreign_key_update_cascade test.
  • dm/tests/safe_mode_foreign_key_update_cascade/data/db1.increment.sql
    • Added an SQL script for incremental data changes, including an UPDATE on a parent table's primary key to trigger foreign key cascade.
  • dm/tests/safe_mode_foreign_key_update_cascade/data/db1.prepare.sql
    • Added an SQL script to prepare the database schema, including parent and child tables with a foreign key constraint using ON UPDATE CASCADE.
  • dm/tests/safe_mode_foreign_key_update_cascade/run.sh
    • Added the main shell script to execute the safe_mode_foreign_key_update_cascade integration test, setting up DM, running SQL, and verifying results.
Activity
  • The author GMHDBJD initiated this pull request to introduce a new DML strategy for safe mode.
  • The PR includes modifications to core DML generation logic within the syncer component.
  • A new integration test has been added to validate the foreign key cascade behavior with the new 'upsert' safe mode strategy.
  • Configuration files and test scripts were added to support the new integration test.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a new upsert strategy for safe-mode DMLs, which is a valuable addition for handling foreign key constraints with cascading operations. The implementation appears correct and is well-supported by a new integration test. I've included a few suggestions to enhance code readability and maintainability, particularly in the configuration validation and DML generation logic.

Comment on lines +397 to +402
if c.SyncerConfig.SafeModeDMLStrategy != "" && c.SyncerConfig.SafeModeDMLStrategy != "replace" && c.SyncerConfig.SafeModeDMLStrategy != "upsert" {
log.L().Warn("unknown safe-mode-dml-strategy, fallback to default",
zap.String("strategy", c.SyncerConfig.SafeModeDMLStrategy),
zap.String("default", "replace"))
c.SyncerConfig.SafeModeDMLStrategy = ""
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

For better readability and maintainability, you could use a switch statement to validate SafeModeDMLStrategy. This makes it easier to add more strategies in the future.

 	switch c.SyncerConfig.SafeModeDMLStrategy {
 	case "", "replace", "upsert":
 		// valid
 	default:
 		log.L().Warn("unknown safe-mode-dml-strategy, fallback to default",
 			zap.String("strategy", c.SyncerConfig.SafeModeDMLStrategy),
 			zap.String("default", "replace"))
 		c.SyncerConfig.SafeModeDMLStrategy = ""
 	}

Comment on lines +900 to +905
if inst.Syncer.SafeModeDMLStrategy != "" && inst.Syncer.SafeModeDMLStrategy != "replace" && inst.Syncer.SafeModeDMLStrategy != "upsert" {
log.L().Warn("unknown safe-mode-dml-strategy, fallback to default",
zap.String("strategy", inst.Syncer.SafeModeDMLStrategy),
zap.String("default", "replace"))
inst.Syncer.SafeModeDMLStrategy = ""
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

This validation logic is similar to the one in subtask.go. Using a switch statement here would also improve readability and make the code more extensible.

 		switch inst.Syncer.SafeModeDMLStrategy {
 		case "", "replace", "upsert":
 			// valid
 		default:
 			log.L().Warn("unknown safe-mode-dml-strategy, fallback to default",
 				zap.String("strategy", inst.Syncer.SafeModeDMLStrategy),
 				zap.String("default", "replace"))
 			inst.Syncer.SafeModeDMLStrategy = ""
 		}

Comment on lines 399 to 411
if j.safeMode {
if j.dml.IsPrimaryOrUniqueKeyUpdated() {
query, arg = j.dml.GenSQL(sqlmodel.DMLDelete)
appendQueryAndArg()
if w.safeModeDMLStrategy == "upsert" {
query, arg = j.dml.GenSQL(sqlmodel.DMLUpdate)
} else {
if j.dml.IsPrimaryOrUniqueKeyUpdated() {
query, arg = j.dml.GenSQL(sqlmodel.DMLDelete)
appendQueryAndArg()
}
query, arg = j.dml.GenSQL(sqlmodel.DMLReplace)
}
query, arg = j.dml.GenSQL(sqlmodel.DMLReplace)
} else {
query, arg = j.dml.GenSQL(sqlmodel.DMLUpdate)
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The logic for handling RowChangeUpdate in safe mode can be simplified by using a switch statement on w.safeModeDMLStrategy. This would make the code cleaner and easier to understand.

Suggested change
if j.safeMode {
if j.dml.IsPrimaryOrUniqueKeyUpdated() {
query, arg = j.dml.GenSQL(sqlmodel.DMLDelete)
appendQueryAndArg()
if w.safeModeDMLStrategy == "upsert" {
query, arg = j.dml.GenSQL(sqlmodel.DMLUpdate)
} else {
if j.dml.IsPrimaryOrUniqueKeyUpdated() {
query, arg = j.dml.GenSQL(sqlmodel.DMLDelete)
appendQueryAndArg()
}
query, arg = j.dml.GenSQL(sqlmodel.DMLReplace)
}
query, arg = j.dml.GenSQL(sqlmodel.DMLReplace)
} else {
query, arg = j.dml.GenSQL(sqlmodel.DMLUpdate)
}
if j.safeMode {
switch w.safeModeDMLStrategy {
case "upsert":
query, arg = j.dml.GenSQL(sqlmodel.DMLUpdate)
default: // "replace" or ""
if j.dml.IsPrimaryOrUniqueKeyUpdated() {
query, arg = j.dml.GenSQL(sqlmodel.DMLDelete)
appendQueryAndArg()
}
query, arg = j.dml.GenSQL(sqlmodel.DMLReplace)
}
} else {
query, arg = j.dml.GenSQL(sqlmodel.DMLUpdate)
}

@ti-chi-bot
Copy link
Contributor

ti-chi-bot bot commented Feb 5, 2026

@GMHDBJD: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
pull-verify 8380225 link true /test pull-verify
pull-dm-integration-test 8380225 link true /test pull-dm-integration-test

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/dm Issues or PRs related to DM. do-not-merge/needs-linked-issue release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant