Python: Stop swallowing skill script and resource errors so the model can self-correct#6755
Python: Stop swallowing skill script and resource errors so the model can self-correct#6755giles17 wants to merge 2 commits into
Conversation
Port the .NET fix from microsoft#6680. SkillsProvider previously swallowed exceptions from skill script execution and resource reading, returning a generic error string so the model could not self-correct. - Add an include_detailed_errors option to SkillsProvider.__init__ and from_paths. When True, script-execution failures return an error string with the exception message appended; when False (default), the exception is logged and re-raised, delegating to the function-invocation pipeline's own include_detailed_errors policy. - _read_skill_resource now logs and re-raises instead of returning a generic error string. Resources take no model arguments, so a swallowed generic error is not actionable by the model. - Update and add tests covering the new propagation and detailed-error behavior. Fixes microsoft#6681 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Python Test Coverage Report •
Python Unit Test Overview
|
||||||||||||||||||||||||||||||
There was a problem hiding this comment.
Pull request overview
This PR updates the Python SkillsProvider error-handling behavior for skill script execution and resource reading, adding an include_detailed_errors option to better surface actionable failure details to the model (or delegate to the function-invocation pipeline by default), aligning behavior with the related .NET change.
Changes:
- Added
include_detailed_errorstoSkillsProvider.__init__andSkillsProvider.from_paths, and updated_run_skill_scriptto either re-raise (default) or return an error string that includes the exception message when enabled. - Updated
_read_skill_resourceto log and re-raise exceptions instead of returning a generic error string. - Updated/added unit tests to cover propagation-by-default and the detailed-error string path.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| python/packages/core/agent_framework/_skills.py | Adds include_detailed_errors option and changes script/resource failure behavior to re-raise by default (or include exception details when enabled). |
| python/packages/core/tests/core/test_skills.py | Updates tests to assert exception propagation by default and validates the detailed-error string behavior. |
There was a problem hiding this comment.
Automated Code Review
Reviewers: 5 | Confidence: 91%
✓ Correctness
The implementation is correct and well-structured. The
include_detailed_errorsoption is properly threaded through both constructors, stored as an instance attribute, and used in the exception handler. The_run_skill_scriptmethod correctly either returns a detailed error string (when enabled) or re-raises (when disabled), and_read_skill_resourcealways re-raises as designed. No correctness issues found.
✓ Security Reliability
This PR is well-implemented from a security and reliability standpoint. The
include_detailed_errorsoption defaults to False (secure-by-default), the documentation explicitly warns about prompt-injection risks from untrusted scripts, and the re-raise path correctly preserves exception context for the outer pipeline. Input validation (skill/script/resource name checks) occurs before the execution try-block, ensuring error-format strings only interpolate registry-validated names. No significant security or reliability issues found.
✓ Test Coverage
Test coverage for this PR is solid. The two critical behaviors are well-tested: exception propagation by default (tested across sync code scripts, file scripts without runner, path scripts without runner, and inline scripts with list args) and the include_detailed_errors=True path returning error strings with exception messages. Resource exception propagation is tested for both sync and async callable resources. Validation failures (empty/unknown names) continue to return error strings via existing unmodified tests. The only minor gap is the lack of a test documenting that _read_skill_resource always re-raises regardless of include_detailed_errors (confirming the intentional asymmetry between scripts and resources), but this is a low-severity suggestion since the resource path simply doesn't consult the flag.
✓ Failure Modes
The PR cleanly replaces exception-swallowing with a raise-by-default policy for script execution and resource reading, with an opt-in
include_detailed_errorsmode for scripts. The error-handling logic is straightforward, well-documented, and thoroughly tested. No silent failures, lost errors, partial writes, or cancellation races are introduced. The only minor observation is thatlogger.exception()in_run_skill_scriptwill produce a traceback log entry that may be duplicated when the outer pipeline catches and logs the re-raised exception, but this is not a correctness issue.
✗ Design Approach
The new
include_detailed_errors=Truepath inSkillsProviderintroduces a real design regression: script failures are converted into ordinary successful return values instead of staying classified as tool errors by the shared function-invocation layer. I found one blocking issue on that path and no other design problems that met the evidence bar.
Flagged Issues
-
python/packages/core/agent_framework/_skills.py:2379turns script exceptions into plain string returns, which bypasses the shared tool-error contract used elsewhere in the repo (python/packages/core/agent_framework/_tools.py:1516-1520,182,2215-2224) and means these failures are no longer marked withfunction_result.exceptionor counted toward consecutive-error handling.
Automated review by giles17's agents
|
Flagged issue
Source: automated DevFlow PR review |
…tion Address PR review: returning a plain error string from the skill provider bypassed the shared tool-error contract (no exception metadata, not counted toward consecutive-error limits), risking infinite retries. Instead of porting the .NET provider-level IncludeDetailedErrors option, _run_skill_script and _read_skill_resource now always log and re-raise on failure. This delegates error handling to the function-invocation pipeline, whose existing include_detailed_errors policy is the Python equivalent of .NET's FunctionInvokingChatClient.IncludeDetailedErrors and correctly preserves exception metadata and consecutive-error counting. Validation failures (empty/unknown skill, script, or resource names) still return user-facing error strings. Tests updated accordingly. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Motivation & Context
The Python
SkillsProviderswallowed exceptions raised during skill script execution and resource reading, returning a generic, identical error string every time ("Error: Failed to run script..."). Because the underlying exception never reached the model, the model could not learn why a call failed and could not self-correct by retrying. This addresses #6681, which asks to port the .NET fix from #6680.Description & Review Guide
What are the major changes?
_run_skill_scriptand_read_skill_resourceno longer swallow execution/read failures. They now log and re-raise, delegating error handling to the function-invocation pipeline.Divergence from the .NET port (please confirm): The .NET PR added a provider-level
IncludeDetailedErrorsoption. This Python PR intentionally does not add that option, for two reasons:exceptionmetadata on the function result — that is set exclusively by the pipeline'sexceptblock (_tools.py:1516-1536). Returning a detailed error string therefore goes through the success path, leavingexception=None. The pipeline detects errors viafunction_result.exception is not None(_tools.py:1822) to enforce the consecutive-error limit, so a perpetually-failing script would be treated as success and never counted toward the retry cap — a potential infinite loop.FunctionInvokingChatClient.IncludeDetailedErrors: the pipeline'sinclude_detailed_errorsconfiguration (FunctionInvocationConfiguration,_tools.py:1311/1365). By re-raising, skill scripts and resources now flow through that existing policy, which surfaces the real exception message to the model and preserves exception metadata and consecutive-error counting. Users enable it the same way as for any other tool:client.function_invocation_configuration["include_detailed_errors"] = True.What is the impact of these changes?
What do you want reviewers to focus on?
include_detailed_errors) is the right Python-idiomatic resolution of Python: Add include_detailed_errors option for skill script execution #6681, given the tool-error-contract constraint above.Related Issue
Fixes #6681
Contribution Checklist
breaking changelabel (or add "[BREAKING]" to the title prefix, before or after any language prefix) — a workflow keeps the label and title prefix in sync automatically.