fix(core): enable fault-tolerant execution and incremental result persistence by Rishabh-git10 · Pull Request #424 · kubeedge/ianvs

Rishabh-git10 · 2026-05-10T18:24:29Z

What type of PR is this?
/kind bug

What this PR does / why we need it:
Enables fault-tolerant execution and incremental data persistence to prevent data loss during multi-configuration benchmarks.

Previously, a downstream configuration failure caused a fatal RuntimeError, terminating the execution loop and erasing all prior successful results from memory.

Changes:

Exception Isolation: Modified run_testcases in testcasecontroller.py to catch, log, and isolate individual test exceptions, allowing the loop to continue safely.
Incremental Persistence: Introduced an incremental_save_cb callback in benchmarkingjob.py connected to the Rank module's concatenation logic. Successful results are now written to disk immediately after each test completes.
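
For readers without the diff open, the two changes above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: only the names run_testcases and incremental_save_cb come from the description, while the (name, callable) test case shape, the make_jsonl_saver helper, and the JSON Lines output format are assumptions made for the example.

```python
import json
import logging

logger = logging.getLogger(__name__)


def run_testcases(testcases, incremental_save_cb=None):
    """Run every test case, isolating failures so the loop keeps going.

    `testcases` is assumed to be an iterable of (name, callable) pairs;
    the real test case objects in ianvs look different.
    """
    succeeded, failed = [], []
    for name, run in testcases:
        try:
            result = run()
        except Exception as err:  # isolate this test case only
            logger.error("test case %s failed: %s", name, err)
            failed.append((name, err))
            continue
        succeeded.append((name, result))
        if incremental_save_cb is not None:
            # Persist right away so a later failure cannot erase this result.
            incremental_save_cb(name, result)
    return succeeded, failed


def make_jsonl_saver(path):
    """Hypothetical callback factory: append one JSON line per result."""
    def save(name, result):
        with open(path, "a", encoding="utf-8") as fh:
            fh.write(json.dumps({"name": name, "result": result}) + "\n")
    return save
```

Passing incremental_save_cb=make_jsonl_saver("results.jsonl") would leave one line on disk per successful test case, even if a later configuration raises.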

Which issue(s) this PR fixes:
Fixes #423

kubeedge-bot · 2026-05-10T18:24:41Z

Welcome @Rishabh-git10! It looks like this is your first PR to kubeedge/ianvs 🎉

kubeedge-bot · 2026-05-10T18:24:45Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: Rishabh-git10
To complete the pull request process, please assign jaypume after the PR has been reviewed.
You can assign the PR to them by writing /assign @jaypume in a comment when ready.

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

gemini-code-assist

Code Review

This pull request introduces incremental saving of test results and enhances fault tolerance by allowing the benchmark to continue after individual test case failures. Feedback highlights a significant performance bottleneck and potential data loss issue caused by frequent I/O operations in the saving logic. Additionally, it is recommended to wrap the incremental save callback in a try-except block to ensure that persistence failures do not interrupt the overall execution.
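
The two review points (fewer disk writes, and a save failure that must not stop the run) could be combined along these lines. This is a hedged sketch only: BatchedSaver, write_fn, and batch_size are hypothetical names chosen for illustration, and the batched save actually adopted in the follow-up commit may be structured differently.

```python
import logging

logger = logging.getLogger(__name__)


class BatchedSaver:
    """Buffer results and write them in batches; never raise to the caller.

    `write_fn` is an assumed callable that persists a list of results.
    """

    def __init__(self, write_fn, batch_size=3):
        self.write_fn = write_fn
        self.batch_size = batch_size
        self.buffer = []

    def __call__(self, result):
        self.buffer.append(result)
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        """Write pending results; return False if the write failed."""
        if not self.buffer:
            return True
        try:
            self.write_fn(list(self.buffer))
        except Exception as err:
            # Keep the buffer so the next flush can retry the same results.
            logger.warning("saving %d results failed: %s", len(self.buffer), err)
            return False
        self.buffer.clear()
        return True
```

A caller would invoke flush() once more after the loop ends so that a final partial batch is not left in memory. The trade-off is that up to batch_size - 1 results can still be lost on a hard crash, which is the price of fewer writes.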

fix(core): enable fault-tolerant execution and incremental result per…

fc4ba7b

…sistence Signed-off-by: Rishabh Dewangan <107680241+Rishabh-git10@users.noreply.github.com>

kubeedge-bot added the kind/bug Categorizes issue or PR as related to a bug. label May 10, 2026

kubeedge-bot requested review from MooreZheng and hsj576 May 10, 2026 18:24

kubeedge-bot added the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label May 10, 2026

gemini-code-assist Bot reviewed May 10, 2026

Comment thread core/cmd/obj/benchmarkingjob.py

Comment thread core/testcasecontroller/testcasecontroller.py Outdated

refactor(core): address PR review, optimize persistence by using batc…

7a44f37

…hed save with fault-tolerant execution Signed-off-by: Rishabh Dewangan <107680241+Rishabh-git10@users.noreply.github.com>

Rishabh-git10 mentioned this pull request May 20, 2026

Issue summarization for example of cloud-edge-collaborative-inference-for-llm #430

Open

Rishabh-git10 wants to merge 2 commits into kubeedge:main from Rishabh-git10:fix/intermediate-result-persistence
