Python: Stop swallowing skill script and resource errors so the model can self-correct by giles17 · Pull Request #6755 · microsoft/agent-framework

giles17 · 2026-06-26T01:56:46Z

Motivation & Context

The Python SkillsProvider swallowed exceptions raised during skill script execution and resource reading, returning a generic, identical error string every time ("Error: Failed to run script..."). Because the underlying exception never reached the model, the model could not learn why a call failed and could not self-correct by retrying. This addresses #6681, which asks to port the .NET fix from #6680.

Description & Review Guide

What are the major changes?
- _run_skill_script and _read_skill_resource no longer swallow execution/read failures. They now log and re-raise, delegating error handling to the function-invocation pipeline.
- Validation failures (empty or unknown skill/script/resource names) still return user-facing error strings, unchanged.
- Tests updated to assert propagation for scripts (code, file-without-runner, list-args, conflicting-args) and resources (sync/async callables).
Divergence from the .NET port (please confirm): The .NET PR added a provider-level IncludeDetailedErrors option. This Python PR intentionally does not add that option, for two reasons:
1. It cannot be implemented correctly at the provider level in Python. A tool function can only return a value; it cannot set the exception metadata on the function result — that is set exclusively by the pipeline's except block (_tools.py:1516-1536). Returning a detailed error string therefore goes through the success path, leaving exception=None. The pipeline detects errors via function_result.exception is not None (_tools.py:1822) to enforce the consecutive-error limit, so a perpetually-failing script would be treated as success and never counted toward the retry cap — a potential infinite loop.
2. It is redundant. Python already has the exact equivalent of .NET's FunctionInvokingChatClient.IncludeDetailedErrors: the pipeline's include_detailed_errors configuration (FunctionInvocationConfiguration, _tools.py:1311/1365). By re-raising, skill scripts and resources now flow through that existing policy, which surfaces the real exception message to the model and preserves exception metadata and consecutive-error counting. Users enable it the same way as for any other tool: client.function_invocation_configuration["include_detailed_errors"] = True.
What is the impact of these changes?
- Skill script/resource failures now surface actionable detail to the model via the framework-wide detailed-errors policy, and are correctly marked as errors. The default behavior changes from returning a generic string to re-raising.
What do you want reviewers to focus on?
- Whether dropping the provider-level option (in favor of the existing pipeline-level include_detailed_errors) is the right Python-idiomatic resolution of Python: Add include_detailed_errors option for skill script execution #6681, given the tool-error-contract constraint above.

Related Issue

Fixes #6681

Contribution Checklist

The code builds clean without any errors or warnings
All unit tests pass, and I have added new tests where possible
The PR follows the Contribution Guidelines
This PR is linked to an issue and there is no other open PR for this issue (see Related Issue above).
This is not a breaking change. If it is a breaking change, add the breaking change label (or add "[BREAKING]" to the title prefix, before or after any language prefix) — a workflow keeps the label and title prefix in sync automatically.

Port the .NET fix from microsoft#6680. SkillsProvider previously swallowed exceptions from skill script execution and resource reading, returning a generic error string so the model could not self-correct. - Add an include_detailed_errors option to SkillsProvider.__init__ and from_paths. When True, script-execution failures return an error string with the exception message appended; when False (default), the exception is logged and re-raised, delegating to the function-invocation pipeline's own include_detailed_errors policy. - _read_skill_resource now logs and re-raises instead of returning a generic error string. Resources take no model arguments, so a swallowed generic error is not actionable by the model. - Update and add tests covering the new propagation and detailed-error behavior. Fixes microsoft#6681 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

github-actions · 2026-06-26T02:00:04Z

Python Test Coverage Report •

File	Stmts	Miss	Cover	Missing
packages/core/agent_framework
_skills.py	1019	40	96%	294, 541, 1054, 1069, 1071–1072, 1432–1433, 1445–1446, 1677, 1706, 2174, 2635–2636, 2733, 2741, 2746, 2749, 2754, 2774, 2786, 2791, 2887, 2895, 2900, 2903, 2908, 2928, 2937, 2942, 3203–3204, 3553, 3780–3781, 3808–3809, 3816–3817
TOTAL	42563	5087	88%

Python Unit Test Overview

Tests	Skipped	Failures	Errors	Time
8339	37 💤	0 ❌	0 🔥	2m 17s ⏱️

Copilot

Pull request overview

This PR updates the Python SkillsProvider error-handling behavior for skill script execution and resource reading, adding an include_detailed_errors option to better surface actionable failure details to the model (or delegate to the function-invocation pipeline by default), aligning behavior with the related .NET change.

Changes:

Added include_detailed_errors to SkillsProvider.__init__ and SkillsProvider.from_paths, and updated _run_skill_script to either re-raise (default) or return an error string that includes the exception message when enabled.
Updated _read_skill_resource to log and re-raise exceptions instead of returning a generic error string.
Updated/added unit tests to cover propagation-by-default and the detailed-error string path.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File	Description
python/packages/core/agent_framework/_skills.py	Adds `include_detailed_errors` option and changes script/resource failure behavior to re-raise by default (or include exception details when enabled).
python/packages/core/tests/core/test_skills.py	Updates tests to assert exception propagation by default and validates the detailed-error string behavior.

github-actions

Automated Code Review

Reviewers: 5 | Confidence: 91%

✓ Correctness

The implementation is correct and well-structured. The include_detailed_errors option is properly threaded through both constructors, stored as an instance attribute, and used in the exception handler. The _run_skill_script method correctly either returns a detailed error string (when enabled) or re-raises (when disabled), and _read_skill_resource always re-raises as designed. No correctness issues found.

✓ Security Reliability

This PR is well-implemented from a security and reliability standpoint. The include_detailed_errors option defaults to False (secure-by-default), the documentation explicitly warns about prompt-injection risks from untrusted scripts, and the re-raise path correctly preserves exception context for the outer pipeline. Input validation (skill/script/resource name checks) occurs before the execution try-block, ensuring error-format strings only interpolate registry-validated names. No significant security or reliability issues found.

✓ Test Coverage

Test coverage for this PR is solid. The two critical behaviors are well-tested: exception propagation by default (tested across sync code scripts, file scripts without runner, path scripts without runner, and inline scripts with list args) and the include_detailed_errors=True path returning error strings with exception messages. Resource exception propagation is tested for both sync and async callable resources. Validation failures (empty/unknown names) continue to return error strings via existing unmodified tests. The only minor gap is the lack of a test documenting that _read_skill_resource always re-raises regardless of include_detailed_errors (confirming the intentional asymmetry between scripts and resources), but this is a low-severity suggestion since the resource path simply doesn't consult the flag.

✓ Failure Modes

The PR cleanly replaces exception-swallowing with a raise-by-default policy for script execution and resource reading, with an opt-in include_detailed_errors mode for scripts. The error-handling logic is straightforward, well-documented, and thoroughly tested. No silent failures, lost errors, partial writes, or cancellation races are introduced. The only minor observation is that logger.exception() in _run_skill_script will produce a traceback log entry that may be duplicated when the outer pipeline catches and logs the re-raised exception, but this is not a correctness issue.

✗ Design Approach

The new include_detailed_errors=True path in SkillsProvider introduces a real design regression: script failures are converted into ordinary successful return values instead of staying classified as tool errors by the shared function-invocation layer. I found one blocking issue on that path and no other design problems that met the evidence bar.

Flagged Issues

python/packages/core/agent_framework/_skills.py:2379 turns script exceptions into plain string returns, which bypasses the shared tool-error contract used elsewhere in the repo (python/packages/core/agent_framework/_tools.py:1516-1520, 182, 2215-2224) and means these failures are no longer marked with function_result.exception or counted toward consecutive-error handling.

Automated review by giles17's agents

github-actions · 2026-06-26T02:01:44Z

Flagged issue

python/packages/core/agent_framework/_skills.py:2379 turns script exceptions into plain string returns, which bypasses the shared tool-error contract used elsewhere in the repo (python/packages/core/agent_framework/_tools.py:1516-1520, 182, 2215-2224) and means these failures are no longer marked with function_result.exception or counted toward consecutive-error handling.

Source: automated DevFlow PR review

…tion Address PR review: returning a plain error string from the skill provider bypassed the shared tool-error contract (no exception metadata, not counted toward consecutive-error limits), risking infinite retries. Instead of porting the .NET provider-level IncludeDetailedErrors option, _run_skill_script and _read_skill_resource now always log and re-raise on failure. This delegates error handling to the function-invocation pipeline, whose existing include_detailed_errors policy is the Python equivalent of .NET's FunctionInvokingChatClient.IncludeDetailedErrors and correctly preserves exception metadata and consecutive-error counting. Validation failures (empty/unknown skill, script, or resource names) still return user-facing error strings. Tests updated accordingly. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

github-actions

Automated Code Review

Reviewers: 5 | Confidence: 89% | Result: All clear

Reviewed: Correctness, Security Reliability, Test Coverage, Failure Modes, Design Approach

Automated review by giles17's agents

Copilot AI review requested due to automatic review settings June 26, 2026 01:56

moonbox3 added the python Usage: [Issues, PRs], Target: Python label Jun 26, 2026

giles17 marked this pull request as ready for review June 26, 2026 01:57

Copilot started reviewing on behalf of giles17 June 26, 2026 01:57 View session

Copilot AI reviewed Jun 26, 2026

View reviewed changes

Comment thread python/packages/core/agent_framework/_skills.py Outdated

Comment thread python/packages/core/agent_framework/_skills.py

github-actions Bot reviewed Jun 26, 2026

View reviewed changes

Comment thread python/packages/core/agent_framework/_skills.py Outdated

giles17 marked this pull request as draft June 26, 2026 02:08

giles17 changed the title ~~Python: Add include_detailed_errors option for skill script execution~~ Python: Stop swallowing skill script and resource errors so the model can self-correct Jun 26, 2026

giles17 marked this pull request as ready for review June 26, 2026 21:14

github-actions Bot reviewed Jun 26, 2026

View reviewed changes

giles17 enabled auto-merge June 26, 2026 22:49

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Python: Stop swallowing skill script and resource errors so the model can self-correct#6755

Python: Stop swallowing skill script and resource errors so the model can self-correct#6755
giles17 wants to merge 2 commits into
microsoft:mainfrom
giles17:skill-detailed-errors

giles17 commented Jun 26, 2026 •

edited

Loading

Uh oh!

github-actions Bot commented Jun 26, 2026 •

edited

Loading

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

github-actions Bot left a comment

Uh oh!

Uh oh!

github-actions Bot commented Jun 26, 2026

Uh oh!

github-actions Bot left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Uh oh!

Conversation

giles17 commented Jun 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Motivation & Context

Description & Review Guide

Related Issue

Contribution Checklist

Uh oh!

github-actions Bot commented Jun 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Python Unit Test Overview

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

github-actions Bot left a comment

Choose a reason for hiding this comment

Automated Code Review

✓ Correctness

✓ Security Reliability

✓ Test Coverage

✓ Failure Modes

✗ Design Approach

Flagged Issues

Uh oh!

Uh oh!

github-actions Bot commented Jun 26, 2026

Uh oh!

github-actions Bot left a comment

Choose a reason for hiding this comment

Automated Code Review

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

giles17 commented Jun 26, 2026 •

edited

Loading

github-actions Bot commented Jun 26, 2026 •

edited

Loading