ci: temporary go1.25.8 toolchain pin for tiflow/ticdc unit jobs#4366
Conversation
Force GOTOOLCHAIN=go1.25.8 and reset GOCACHE in ghpr_verify to avoid mixed go1.25.6/go1.25.8 compile errors in CI.\n\nThis is a temporary workaround and should be reverted once the ghpr_verify golang image/environment is consistently on go1.25.8.
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request introduces a temporary workaround to address build failures in the Highlights
Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
I have already done a preliminary review for you, and I hope to help you do a better job.
Summary
This PR introduces a temporary workaround in the ghpr_verify.groovy pipeline script to pin the Go toolchain to version go1.25.8 during test commands execution. It clears the Go build cache per test matrix entry to avoid stale or mixed artifacts from different Go patch versions, and prints Go environment details for debugging purposes. The approach is straightforward and well-isolated, with clear comments indicating that this is a temporary measure. Overall, the change is low-risk and improves CI stability under the current environment inconsistency.
Code Improvements
-
Avoid hardcoding toolchain version multiple times
- File:
pipelines/pingcap/tiflow/latest/ghpr_verify.groovy(lines ~50-60) - Issue: The
GOTOOLCHAINversiongo1.25.8is hardcoded directly in the script. If this version needs to be updated, it requires editing the script in multiple places (e.g., comment and export). - Suggestion: Extract the version string into a variable at the start of the pipeline or script block for easier updates and reduce duplication. For example:
def goToolchainVersion = "go1.25.8" ... export GOTOOLCHAIN=${goToolchainVersion}
- File:
-
Ensure that cache clearing is robust
- File: same as above
- Issue: The cache clearing step uses
rm -rf "$GOCACHE"but does not verify if$GOCACHEis set or non-empty, which could be risky if the environment changes. - Suggestion: Add a guard in the script to confirm
$GOCACHEis non-empty and points to an expected directory before removing it, e.g.:This avoids accidental deletion of unexpected directories.if [[ -n "$GOCACHE" && "$GOCACHE" == /tmp/go-build-* ]]; then rm -rf "$GOCACHE" fi
-
Use consistent quoting and escaping
- File: same
- Issue: The
rm -rf "\$GOCACHE"uses escaped quotes inside a Groovy triple-quoted string, which can be confusing and error-prone. - Suggestion: Since this is a triple double-quoted Groovy string, you can use single quotes inside the shell script to simplify:
rm -rf '$GOCACHE'
Best Practices
-
Add a TODO issue or link for reverting this temporary fix
- File:
ghpr_verify.groovy(lines ~45-65) - Issue: The comment notes this is temporary and should be reverted, but no tracking or issue reference is provided.
- Suggestion: Add a TODO with a GitHub issue or PR number to ensure this workaround is not forgotten, e.g.:
// TODO (pingcap/tiflow#XXXX): Remove this pin after ghpr_verify golang image is fully upgraded to go1.25.8.
- File:
-
Improve logging clarity
- File: same
- Issue: The
go versionandgo env ...output is printed but not explicitly labeled in logs, which might make it harder to identify in CI output. - Suggestion: Add echo statements or log prefixes to clarify what is being printed, e.g.:
echo "Using Go version:" go version echo "Go environment:" go env GOTOOLCHAIN GOROOT GOCACHE
-
Testing coverage
- Since this is a pipeline script change, ensure that the pipeline has been tested in the CI environment to verify that the mixed Go versions issue is resolved and that no new failures occur due to toolchain pinning.
Minor Style Suggestions
- Add a newline at the end of the script block for better readability.
- Keep consistent indentation inside the multiline shell script block.
Overall, the PR is concise and addresses the urgent problem effectively. Incorporating the above improvements will increase maintainability and safety of this temporary workaround.
There was a problem hiding this comment.
Code Review
This pull request introduces a temporary workaround to pin the Go toolchain version in the ghpr_verify pipeline, addressing build failures caused by mixed Go versions. The changes are clear and well-documented with comments explaining their temporary nature. The suggestion to use go clean -cache for clearing the Go cache is retained as it promotes a more robust and idiomatic approach.
Address review feedback in ghpr_verify workaround:\nreplace manual rm -rf of GOCACHE with canonical go clean -cache.
There was a problem hiding this comment.
I have already done a preliminary review for you, and I hope to help you do a better job.
Summary
This PR temporarily pins the Go toolchain version to 1.25.8 in the ghpr_verify.groovy Jenkins pipeline script to avoid compilation errors arising from mixed Go versions (1.25.6 and 1.25.8) artifacts. It also clears the Go build cache before make and prints Go version info for debugging. The approach is straightforward and limited in scope, mostly modifying the shell script that runs tests. Overall, the change is clear, purposeful, and well-documented within the script and PR description.
Code Improvements
-
Pipeline shell script robustness (
pipelines/pingcap/tiflow/latest/ghpr_verify.groovy, lines ~49-63)- The script uses multiline shell commands inside
shstep. To improve maintainability and error visibility, consider:- Adding explicit error messages on failure of key commands (e.g.,
go version,go clean,make). - Avoid exporting
GOCACHEto a static directory that could cause race conditions if tests run concurrently. Instead, use a unique temporary directory per run/command, e.g.:export GOCACHE=$(mktemp -d /tmp/go-build-${TEST_CMD}-XXXXXX)
- Adding
go clean -cacheis good, but also considergo clean -modcacheif module cache is suspected to cause stale artifacts.
- Adding explicit error messages on failure of key commands (e.g.,
- This will further reduce potential flaky failures caused by cache pollution.
- The script uses multiline shell commands inside
-
Use of
set -euo pipefail- It's a good practice to avoid silent failures. Ensure this is consistently used in all similar shell commands in the pipeline for uniformity.
Best Practices
-
Documentation (
pipelines/pingcap/tiflow/latest/ghpr_verify.groovy, around line 50)- The inline comments are clear and helpful.
- However, add a reminder or TODO comment about removing this workaround once the environment upgrade is complete, ideally linking to a tracking issue or PR to avoid forgetting temporary fixes.
Example:
# TODO: Remove this pin once ghpr_verify environment is fully on go1.25.8 (see PR #12560) -
Testing coverage
- Since this change affects the CI pipeline itself, ensure that the pipeline is tested/re-run fully after this change. The PR description links a replay test, which is good. Make sure the test covers multiple matrix configurations to validate no cross-version caching issues remain.
Minor Suggestions
-
Print the Go environment variables in a single command
Instead of separatego versionandgo env GOTOOLCHAIN GOROOT GOCACHE, consider:go version && go env GOTOOLCHAIN GOROOT GOCACHEfor slightly cleaner output and execution.
-
Avoid potential shell injection
If${TEST_CMD}can come from untrusted input in the future, consider quoting or sanitizing it to prevent shell injection. Currently, this is probably safe.
No critical issues found as this is a temporary, isolated fix.
Overall, the PR is well-constructed and addresses a specific CI pain point effectively.
Add a temporary workaround for pull-unit-test and pull-unit-test-next-gen to avoid mixed go1.25.5/go1.25.8 artifacts in Prow.\n\n- export GOTOOLCHAIN=go1.25.8\n- set job-specific GOCACHE\n- run go clean -cache before make\n\nTODO: revert after ticdc unit-test image/environment is consistently on go1.25.8.
There was a problem hiding this comment.
I have already done a preliminary review for you, and I hope to help you do a better job.
Summary:
This PR introduces a temporary workaround to pin the Go toolchain version to go1.25.8 in multiple CI-related scripts (ghpr_verify.groovy and two ticdc presubmit YAML files) to avoid version mismatch errors caused by mixed Go versions (go1.25.6 vs go1.25.8). It also clears per-matrix GOCACHE before running make and adds debugging output for Go environment details. The changes are clearly scoped as temporary and include TODO comments for later removal. Overall, the approach is practical and addresses the immediate CI instability issue effectively.
1. Critical Issues
- No critical bugs or security issues found.
2. Code Improvements
-
Consistency in
go cleancommand:- Files updated:
pipelines/pingcap/tiflow/latest/ghpr_verify.groovyline ~51prow-jobs/pingcap/ticdc/latest-presubmits.yamlline ~160prow-jobs/pingcap/ticdc/latest-presubmits-next-gen.yamlline ~112
- Issue:
go clean -cacheis used, but the correct flag is-cache(single dash). The current usage seems correct, but to clarify:go clean -cacheis valid, but usuallygo clean -cacheis equivalent togo clean -cache=true.
- However, calling
go clean -cacheis fine as pergo cleandocs.
- Suggestion:
- Add
-modcachecleanup if module cache could also cause stale artifacts, e.g.go clean -cache -modcache. - Or explicitly clear
GOCACHEdirectory by deleting it (e.g.,rm -rf $GOCACHE) before running make for stronger guarantee.
- Add
- Files updated:
-
Avoid redundant environment variable export:
- In
ghpr_verify.groovy,export GOCACHE=/tmp/go-build-${TEST_CMD}is set, butGOCACHEis also set in ticdc presubmit scripts to fixed paths. - Suggestion:
- Consider consistent naming or document why different paths are used per context.
- In
-
Shell script improvements in groovy:
- In
ghpr_verify.groovy(around line 51), the shell block usesset -euo pipefailbut noIFSsetting. - Suggestion:
- Add
IFS=$'\n\t'to make shell scripts more robust to whitespace issues.
- Add
- In
-
Print full Go environment variables for debugging:
- Currently only prints
GOTOOLCHAIN,GOROOT, andGOCACHE. - Suggestion:
- Consider printing
go envfully or key variables likeGOPATH,GO111MODULEto help debugging.
- Consider printing
- Currently only prints
3. Best Practices
-
Documentation & comments:
- Good TODO comments are present.
- Suggestion:
- Add a short comment on why
GOCACHEis set to a unique path per matrix (/tmp/go-build-${TEST_CMD}) to help future maintainers.
- Add a short comment on why
-
Testing coverage:
- Since this is a CI pipeline change, verify that the pipeline runs succeed and are stable.
- Suggestion:
- Add a note or link to test runs in PR description (which is done, good).
-
Styling:
- In YAML files, indentations appear consistent.
- In Groovy, multi-line shell script is well formatted.
-
Naming:
- Variables like
GOTOOLCHAIN,GOCACHEare standard and used appropriately.
- Variables like
Summary of suggestions for next steps:
# In shell scripts, consider stronger cache cleanup:
export GOCACHE=/tmp/go-build-${TEST_CMD}
rm -rf "$GOCACHE"
go clean -cache -modcache
# Add IFS setting for robustness:
set -euo pipefail
IFS=$'\n\t'
# Print more go env info:
go envAnd add clarifying comments about the per-matrix cache paths.
This PR addresses an urgent CI flakiness problem with a clean and revertible workaround. The above improvements can make the temporary changes more robust and easier to maintain.
dillon-zheng
left a comment
There was a problem hiding this comment.
lgtm, for Urgent code freese
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: dillon-zheng The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
[LGTM Timeline notifier]Timeline:
|
Summary
This PR adds a temporary CI-side workaround for mixed Go patch toolchain artifacts (
go1.25.8vs oldergo tool) that cause compile failures in PR jobs.Why
Some jobs are still running in environments/images that are not fully aligned to Go
1.25.8, while repo-side changes already pull1.25.8artifacts. This can trigger errors like:Changes
pipelines/pingcap/tiflow/latest/ghpr_verify.groovyGOTOOLCHAIN=go1.25.8GOCACHEgo clean -cachebeforemakeprow-jobs/pingcap/ticdc/latest-presubmits.yamlprow-jobs/pingcap/ticdc/latest-presubmits-next-gen.yamlpull-unit-testandpull-unit-test-next-genNotes
1.25.8.