fix: skip hard examples with no unique label match in incremental learning #415
dev-aditya-hub wants to merge 1 commit
…rning Signed-off-by: dev-aditya-hub <premjadhvar95@gmail.com>
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: dev-aditya-hub
Welcome @dev-aditya-hub! It looks like this is your first PR to kubeedge/ianvs 🎉
Code Review
The pull request updates the _get_train_dataset method in incremental_learning.py to skip hard examples that do not have a unique match in the dataset, while also removing unnecessary pylint disable comments. The reviewer suggests adding a warning log when multiple matches are encountered to help identify potential dataset inconsistencies.
    if len(index[0]) != 1:
        continue
The current logic skips hard examples if multiple matches are found in the label file (len(index[0]) != 1). While this aligns with the PR's goal of ensuring unique matches, it might be beneficial to log a warning when multiple matches are found, as this could indicate inconsistencies or duplicates in the dataset that the user should be aware of.
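The suggestion above could look roughly like the following. This is a minimal sketch, not the code in `incremental_learning.py`: the helper name `find_unique_label`, the `LOGGER` object, and the `paths`/`labels` arrays are all hypothetical and stand in for whatever the method reads from the label file.

```python
import logging

import numpy as np

# Hypothetical module-level logger; the project's own logger may differ.
LOGGER = logging.getLogger(__name__)


def find_unique_label(paths, labels, hard_example):
    """Return the label for hard_example, or None when the match is not unique.

    paths and labels are assumed to be parallel NumPy arrays.
    """
    index = np.where(paths == hard_example)
    n_matches = len(index[0])
    if n_matches == 0:
        # No entry for this sample: skip it quietly.
        return None
    if n_matches > 1:
        # Duplicates usually point at a dataset problem, so make them visible.
        LOGGER.warning(
            "skipping hard example %s: %d matching label entries",
            hard_example,
            n_matches,
        )
        return None
    return labels[index[0][0]]
```

Splitting the zero-match and multi-match cases keeps the skip behavior of `len(index[0]) != 1` while only warning on the case that suggests duplicated entries.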
/assign @MooreZheng
Summary
- `IncrementalLearning._get_train_dataset()` assigned `label` inside an `if len(index[0]) == 1:` block but called `file.write(f"{new} {label}\n")` unconditionally outside it
- When `np.where()` finds no unique path match for a hard example, `label` retains the value from the prior loop iteration, silently writing the wrong label to the training file; on the very first iteration it raises `UnboundLocalError`
- Adds `if len(index[0]) != 1: continue` so unmatched samples are skipped and `label` is only assigned on the valid path, matching the intent of the surrounding logic
Test plan
`UnboundLocalError` on the first unmatched hard example
Summary by CodeRabbit
Bug Fixes
Corrected label assignment in hard-example training dataset construction, preventing stale labels from prior loop iterations being silently written to the training file and corrupting incremental model training.
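To make the failure mode concrete, here is a small standalone reproduction. It is a sketch under assumptions: `write_buggy` and `write_fixed` are hypothetical functions that mimic the loop shape described in this PR, and `paths`/`labels` are assumed to be parallel NumPy arrays. They are not the actual ianvs method.

```python
import io

import numpy as np


def write_buggy(out, hard_examples, paths, labels):
    """Pre-fix shape: label is set conditionally but always written."""
    for new in hard_examples:
        index = np.where(paths == new)
        if len(index[0]) == 1:
            label = labels[index[0][0]]
        # Runs even when nothing matched, so label may be stale or unbound.
        out.write(f"{new} {label}\n")


def write_fixed(out, hard_examples, paths, labels):
    """Post-fix shape: samples without a unique match are skipped."""
    for new in hard_examples:
        index = np.where(paths == new)
        if len(index[0]) != 1:
            continue
        label = labels[index[0][0]]
        out.write(f"{new} {label}\n")


paths = np.array(["a.jpg", "b.jpg"])
labels = np.array(["cat", "dog"])

buggy_out = io.StringIO()
write_buggy(buggy_out, ["a.jpg", "x.jpg"], paths, labels)  # x.jpg inherits "cat"

fixed_out = io.StringIO()
write_fixed(fixed_out, ["a.jpg", "x.jpg"], paths, labels)  # x.jpg is dropped
```

With these inputs the buggy version writes a second line pairing `x.jpg` with the label of `a.jpg`, while the fixed version writes only the `a.jpg` line. Calling `write_buggy` with an unmatched sample first raises `UnboundLocalError`, since `label` has never been assigned.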
Tests
Added regression test validating that unmatched hard examples are skipped and matched ones are written with the correct label to the training dataset file.
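A regression test along these lines might be structured as below. This is an illustrative sketch only: `write_train_file` is a hypothetical stand-in for the fixed loop, and the test class is not the test added in this PR.

```python
import os
import tempfile
import unittest

import numpy as np


def write_train_file(path, hard_examples, known_paths, known_labels):
    """Hypothetical stand-in for the fixed loop, not the ianvs method."""
    with open(path, "w", encoding="utf-8") as file:
        for new in hard_examples:
            index = np.where(known_paths == new)
            if len(index[0]) != 1:
                continue  # no unique match: skip the sample
            file.write(f"{new} {known_labels[index[0][0]]}\n")


class SkipUnmatchedHardExamplesTest(unittest.TestCase):
    def test_only_unique_matches_are_written(self):
        known_paths = np.array(["a.jpg", "b.jpg"])
        known_labels = np.array(["cat", "dog"])
        with tempfile.TemporaryDirectory() as tmp:
            target = os.path.join(tmp, "train.txt")
            # The unmatched sample comes first, the case that used to raise.
            write_train_file(
                target, ["missing.jpg", "b.jpg"], known_paths, known_labels
            )
            with open(target, encoding="utf-8") as file:
                lines = file.read().splitlines()
        self.assertEqual(lines, ["b.jpg dog"])
```

Putting the unmatched sample first covers the `UnboundLocalError` path, and asserting on the exact file contents covers the stale-label path.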
Signed-off-by: dev-aditya-hub <premjadhvar95@gmail.com>