[featured]: Skip redundant CONFIG_DB writes in sync_feature_scope#368
Conversation
* Add read-before-write check in sync_feature_scope to only write scope fields when values actually differ from CONFIG_DB * Add unit tests covering unchanged, changed, and missing entry cases Signed-off-by: Bojun Feng <bojundf@gmail.com>
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
* Move tests from TestFeatureDaemon to TestFeatureHandler to avoid class-level mock.patch argument injection issues * Improve assertions and add partial-change test case Signed-off-by: Bojun Feng <bojundf@gmail.com>
e947a49 to
6d4557f
Compare
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
* Merge skip/write/missing-entry tests into one multi-scenario method following the test_feature_resync pattern * Remove redundant partial-change and empty-dict sub-cases Signed-off-by: Bojun Feng <bojundf@gmail.com>
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
There was a problem hiding this comment.
Pull request overview
This PR optimizes the featured host daemon by avoiding redundant FEATURE|<name> scope-field writes to CONFIG_DB, reducing unnecessary Redis churn during startup/config reload and on runtime feature updates.
Changes:
- Added a read-before-write check in
FeatureHandler.sync_feature_scope()to only update scope fields when values differ. - Applied the conditional write to both host
CONFIG_DBand per-namespaceCONFIG_DB(multi-ASIC). - Added unit tests for unchanged/changed/missing-entry behavior and namespace propagation.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
scripts/featured |
Adds conditional scope syncing to avoid redundant CONFIG_DB writes. |
tests/featured/featured_test.py |
Adds unit tests validating conditional scope writes and namespace propagation. |
|
Hi @Bojun-Feng, |
Signed-off-by: Bojun-Feng <bojundf@gmail.com>
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
tirupatihemanth
left a comment
There was a problem hiding this comment.
NIT: can you move the duplicated getting and update fields if things changed into a function as it is being implemented twice now?
Signed-off-by: Bojun-Feng <bojundf@gmail.com>
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
Hi @vmittal-msft @tirupatihemanth — just checking in on this PR. Seems like all checks are passing and it is approved for 202511 Branch. Please let me know if there's anything else needed from my side. |
Is this written the other way around? |
@saiarcot895 It is correct. The phrasing there could definitely be clearer though. What I meant is that scope write completes much faster without redundant operations, and config reload is expected to take roughly the same amount of time as before. The referenced issue in |
rookie-who
left a comment
There was a problem hiding this comment.
LGTM. Clean read-before-write optimization:
_conditional_update_scopehelper avoids redundant Redis writes- Host and namespace checks are independent (namespace can be stale even when host matches)
- Single
mod_entrycall with only changed fields - Tests cover: no-change, partial-change, missing entry, and namespace-stale-while-host-matches scenarios
No correctness concerns. The get_entry + mod_entry is not atomic, but the existing code was already non-atomic (unconditional mod_entry) and the race window is negligible for scope fields that rarely change.
|
Cherry-pick PR to 202511: #382 |
|
The cherry pick conflict has been handled manually. Removing cherry pick conflict label... ---Powered by SONiC BuildBot
|
Why I did it
Fix #369
Partially address sonic-net/sonic-buildimage#26350
sync_feature_scope()unconditionally writes to CONFIG_DB for every feature at startup and on feature state changes at runtime without checking whether the values have actually changed.We should avoid redundant writes that does not update CONFIG_DB.
How I did it
sync_feature_scope(): read the currentFEATURE|<name>entry from CONFIG_DB viaget_entry(), and only callmod_entry()whenhas_per_asic_scopeorhas_global_scopeactually differs from the intended value.get_entry()return withor {}to safely handleNonereturns, consistent with the defensive pattern already used inresync_feature_state()andsync_feature_delay_state()in the same file.How to verify it
I used the following script to test the update on a virtual switch in GNS3. We can iterate quickly since the modified Python file is at
/usr/local/bin/featuredand can be updated on the fly.The virtual switch image is downloaded from official pipeline build on 2026/03/30 on commit ab081ee.
Bash Script & Setup
I had three files on the switch:
old.pyis copied from/usr/local/bin/featuredon the switch. It is almost identical to the latest previous commit that modified the script, except due to compilation the first line changed from#!/usr/bin/env python3to#!/usr/bin/python3.new.pyis almost identical to the script in PR, except the first line also changed from#!/usr/bin/env python3to#!/usr/bin/python3.The bash script
test_featured.shuse eitherold.pyornew.pyto overwrite/usr/local/bin/featured, use systemctl to reload featured, then runconfig reloadand see how many feature writes there are & how long did it take until the last write.Full Logs Running the Script on VS
TLDR: The number of scope writes dropped from 84 to 38. The time from first CONFIG_DB write to last scope write dropped from ~30–32s to ~21s, so scope synchronization completes about 10s earlier, shortening the potential race-condition window.
Description for the changelog
Skip redundant CONFIG_DB writes in featured daemon's sync_feature_scope to speed up config reload.