Skip to content

fix: Prevent config option from reverting modified configurations#1694

Open
sunjq1 wants to merge 2 commits intoNVIDIA:mainfrom
sunjq1:fix-config
Open

fix: Prevent config option from reverting modified configurations#1694
sunjq1 wants to merge 2 commits intoNVIDIA:mainfrom
sunjq1:fix-config

Conversation

@sunjq1
Copy link

@sunjq1 sunjq1 commented Mar 4, 2026

Summary

This PR fixes an issue where uncommented default configuration options (such as nvidia-container-runtime.debug) were being automatically re-commented when running nvidia-ctk config(without the --set flag), even if the user had explicitly enabled them in the configuration file.

Problem

The current implementation of commentDefaults() relies on a valuesSet map to track which configuration keys have been explicitly modified. However, when loading an existing configuration file from disk, the valuesSet map remained empty.

As a result, if a user manually uncommented a default setting without changing its value(e.g., nvidia-container-runtime.debug = "/var/log/nvidia-container-runtime.log"), or uncommented with the --set flag, the toolkit would:

  • Fail to recognize the key as "user-set" during the loading phase.
  • Match the value against the internal hardcoded defaults.
  • Re-apply the # comment prefix during the Save() operation, effectively reverting the user's manual configuration.

Changes

  • Modified the configuration loading logic to differentiate between loading an existing file and generating a fresh default configuration.
  • Updated loadConfigToml() to traverse the TOML tree after successfully reading from a file, populating the valuesSet map with all keys present in the file.

Testing Performed

  • Verified that running nvidia-ctk config --in-place no longer re-comments uncommented lines in /etc/nvidia-container-runtime/config.toml.
  • Verified that nvidia-ctk config default still generates a standard template with the correct comments and default values.

Before Fix

$ ./nvidia-ctk config default -o /etc/nvidia-container-runtime/config.toml 
$ ./nvidia-ctk config |grep debug
#debug = "/var/log/nvidia-container-toolkit.log"
#debug = "/var/log/nvidia-container-runtime.log"
$ ./nvidia-ctk config --set nvidia-container-runtime.debug=/var/log/nvidia-container-runtime.log --in-place
$ ./nvidia-ctk config |grep debug
#debug = "/var/log/nvidia-container-toolkit.log"
#debug = "/var/log/nvidia-container-runtime.log"

After Fix

$ ./nvidia-ctk config default -o /etc/nvidia-container-runtime/config.toml 
$ ./nvidia-ctk config |grep debug
#debug = "/var/log/nvidia-container-toolkit.log"
#debug = "/var/log/nvidia-container-runtime.log"
$ ./nvidia-ctk config --set nvidia-container-runtime.debug=/var/log/nvidia-container-runtime.log --in-place
$ ./nvidia-ctk config |grep debug
#debug = "/var/log/nvidia-container-toolkit.log"
debug = "/var/log/nvidia-container-runtime.log"

Signed-off-by: Sun Jianqiang <sunjianqiangv@foxmail.com>
@copy-pr-bot
Copy link

copy-pr-bot bot commented Mar 4, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

Copy link
Member

@elezar elezar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a general comment, could we add unit tests to validate the original behaviour and then verify that the fix addresses this.

Signed-off-by: Sun Jianqiang <sunjianqiangv@foxmail.com>
@sunjq1
Copy link
Author

sunjq1 commented Mar 5, 2026

As a general comment, could we add unit tests to validate the original behaviour and then verify that the fix addresses this.

I have updated the PR with a unit test: TestLoadConfigToml in api/config/v1/toml_test.go.

Before Fix

$ go test
--- FAIL: TestLoadConfigToml (0.00s)
    --- FAIL: TestLoadConfigToml/save_uncommented_configuration_items_to_file (0.00s)
        toml_test.go:388: 
                Error Trace:    nvidia-container-toolkit/api/config/v1/toml_test.go:388
                Error:          Not equal: 
                                ...
                                Diff:
                                --- Expected
                                +++ Actual
                                @@ -7,3 +7,3 @@
                                 [nvidia-container-cli]
                                -debug = "/var/log/nvidia-container-toolkit.log"
                                +#debug = "/var/log/nvidia-container-toolkit.log"
                                 environment = []
                                @@ -18,3 +18,3 @@
                                 [nvidia-container-runtime]
                                -debug = "/var/log/nvidia-container-runtime.log"
                                +#debug = "/var/log/nvidia-container-runtime.log"
                                 log-level = "info"
                Test:           TestLoadConfigToml/save_uncommented_configuration_items_to_file
FAIL
exit status 1
FAIL    github.com/NVIDIA/nvidia-container-toolkit/api/config/v1        0.010s

After Fix

$ go test
PASS
ok      github.com/NVIDIA/nvidia-container-toolkit/api/config/v1        0.009s

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants