Skip to content

Python: Stop swallowing skill script and resource errors so the model can self-correct#6755

Open
giles17 wants to merge 2 commits into
microsoft:mainfrom
giles17:skill-detailed-errors
Open

Python: Stop swallowing skill script and resource errors so the model can self-correct#6755
giles17 wants to merge 2 commits into
microsoft:mainfrom
giles17:skill-detailed-errors

Conversation

@giles17

@giles17 giles17 commented Jun 26, 2026

Copy link
Copy Markdown
Contributor

Motivation & Context

The Python SkillsProvider swallowed exceptions raised during skill script execution and resource reading, returning a generic, identical error string every time ("Error: Failed to run script..."). Because the underlying exception never reached the model, the model could not learn why a call failed and could not self-correct by retrying. This addresses #6681, which asks to port the .NET fix from #6680.

Description & Review Guide

  • What are the major changes?

    • _run_skill_script and _read_skill_resource no longer swallow execution/read failures. They now log and re-raise, delegating error handling to the function-invocation pipeline.
    • Validation failures (empty or unknown skill/script/resource names) still return user-facing error strings, unchanged.
    • Tests updated to assert propagation for scripts (code, file-without-runner, list-args, conflicting-args) and resources (sync/async callables).
  • Divergence from the .NET port (please confirm): The .NET PR added a provider-level IncludeDetailedErrors option. This Python PR intentionally does not add that option, for two reasons:

    1. It cannot be implemented correctly at the provider level in Python. A tool function can only return a value; it cannot set the exception metadata on the function result — that is set exclusively by the pipeline's except block (_tools.py:1516-1536). Returning a detailed error string therefore goes through the success path, leaving exception=None. The pipeline detects errors via function_result.exception is not None (_tools.py:1822) to enforce the consecutive-error limit, so a perpetually-failing script would be treated as success and never counted toward the retry cap — a potential infinite loop.
    2. It is redundant. Python already has the exact equivalent of .NET's FunctionInvokingChatClient.IncludeDetailedErrors: the pipeline's include_detailed_errors configuration (FunctionInvocationConfiguration, _tools.py:1311/1365). By re-raising, skill scripts and resources now flow through that existing policy, which surfaces the real exception message to the model and preserves exception metadata and consecutive-error counting. Users enable it the same way as for any other tool: client.function_invocation_configuration["include_detailed_errors"] = True.
  • What is the impact of these changes?

    • Skill script/resource failures now surface actionable detail to the model via the framework-wide detailed-errors policy, and are correctly marked as errors. The default behavior changes from returning a generic string to re-raising.
  • What do you want reviewers to focus on?

Related Issue

Fixes #6681

Contribution Checklist

  • The code builds clean without any errors or warnings
  • All unit tests pass, and I have added new tests where possible
  • The PR follows the Contribution Guidelines
  • This PR is linked to an issue and there is no other open PR for this issue (see Related Issue above).
  • This is not a breaking change. If it is a breaking change, add the breaking change label (or add "[BREAKING]" to the title prefix, before or after any language prefix) — a workflow keeps the label and title prefix in sync automatically.

Port the .NET fix from microsoft#6680. SkillsProvider previously swallowed
exceptions from skill script execution and resource reading, returning a
generic error string so the model could not self-correct.

- Add an include_detailed_errors option to SkillsProvider.__init__ and
  from_paths. When True, script-execution failures return an error string
  with the exception message appended; when False (default), the exception
  is logged and re-raised, delegating to the function-invocation pipeline's
  own include_detailed_errors policy.
- _read_skill_resource now logs and re-raises instead of returning a
  generic error string. Resources take no model arguments, so a swallowed
  generic error is not actionable by the model.
- Update and add tests covering the new propagation and detailed-error
  behavior.

Fixes microsoft#6681

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings June 26, 2026 01:56
@moonbox3 moonbox3 added the python Usage: [Issues, PRs], Target: Python label Jun 26, 2026
@giles17 giles17 marked this pull request as ready for review June 26, 2026 01:57
@github-actions

github-actions Bot commented Jun 26, 2026

Copy link
Copy Markdown
Contributor

Python Test Coverage

Python Test Coverage Report •
FileStmtsMissCoverMissing
packages/core/agent_framework
   _skills.py10194096%294, 541, 1054, 1069, 1071–1072, 1432–1433, 1445–1446, 1677, 1706, 2174, 2635–2636, 2733, 2741, 2746, 2749, 2754, 2774, 2786, 2791, 2887, 2895, 2900, 2903, 2908, 2928, 2937, 2942, 3203–3204, 3553, 3780–3781, 3808–3809, 3816–3817
TOTAL42563508788% 

Python Unit Test Overview

Tests Skipped Failures Errors Time
8339 37 💤 0 ❌ 0 🔥 2m 17s ⏱️

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR updates the Python SkillsProvider error-handling behavior for skill script execution and resource reading, adding an include_detailed_errors option to better surface actionable failure details to the model (or delegate to the function-invocation pipeline by default), aligning behavior with the related .NET change.

Changes:

  • Added include_detailed_errors to SkillsProvider.__init__ and SkillsProvider.from_paths, and updated _run_skill_script to either re-raise (default) or return an error string that includes the exception message when enabled.
  • Updated _read_skill_resource to log and re-raise exceptions instead of returning a generic error string.
  • Updated/added unit tests to cover propagation-by-default and the detailed-error string path.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File Description
python/packages/core/agent_framework/_skills.py Adds include_detailed_errors option and changes script/resource failure behavior to re-raise by default (or include exception details when enabled).
python/packages/core/tests/core/test_skills.py Updates tests to assert exception propagation by default and validates the detailed-error string behavior.

Comment thread python/packages/core/agent_framework/_skills.py Outdated
Comment thread python/packages/core/agent_framework/_skills.py

@github-actions github-actions Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Automated Code Review

Reviewers: 5 | Confidence: 91%

✓ Correctness

The implementation is correct and well-structured. The include_detailed_errors option is properly threaded through both constructors, stored as an instance attribute, and used in the exception handler. The _run_skill_script method correctly either returns a detailed error string (when enabled) or re-raises (when disabled), and _read_skill_resource always re-raises as designed. No correctness issues found.

✓ Security Reliability

This PR is well-implemented from a security and reliability standpoint. The include_detailed_errors option defaults to False (secure-by-default), the documentation explicitly warns about prompt-injection risks from untrusted scripts, and the re-raise path correctly preserves exception context for the outer pipeline. Input validation (skill/script/resource name checks) occurs before the execution try-block, ensuring error-format strings only interpolate registry-validated names. No significant security or reliability issues found.

✓ Test Coverage

Test coverage for this PR is solid. The two critical behaviors are well-tested: exception propagation by default (tested across sync code scripts, file scripts without runner, path scripts without runner, and inline scripts with list args) and the include_detailed_errors=True path returning error strings with exception messages. Resource exception propagation is tested for both sync and async callable resources. Validation failures (empty/unknown names) continue to return error strings via existing unmodified tests. The only minor gap is the lack of a test documenting that _read_skill_resource always re-raises regardless of include_detailed_errors (confirming the intentional asymmetry between scripts and resources), but this is a low-severity suggestion since the resource path simply doesn't consult the flag.

✓ Failure Modes

The PR cleanly replaces exception-swallowing with a raise-by-default policy for script execution and resource reading, with an opt-in include_detailed_errors mode for scripts. The error-handling logic is straightforward, well-documented, and thoroughly tested. No silent failures, lost errors, partial writes, or cancellation races are introduced. The only minor observation is that logger.exception() in _run_skill_script will produce a traceback log entry that may be duplicated when the outer pipeline catches and logs the re-raised exception, but this is not a correctness issue.

✗ Design Approach

The new include_detailed_errors=True path in SkillsProvider introduces a real design regression: script failures are converted into ordinary successful return values instead of staying classified as tool errors by the shared function-invocation layer. I found one blocking issue on that path and no other design problems that met the evidence bar.

Flagged Issues

  • python/packages/core/agent_framework/_skills.py:2379 turns script exceptions into plain string returns, which bypasses the shared tool-error contract used elsewhere in the repo (python/packages/core/agent_framework/_tools.py:1516-1520, 182, 2215-2224) and means these failures are no longer marked with function_result.exception or counted toward consecutive-error handling.

Automated review by giles17's agents

Comment thread python/packages/core/agent_framework/_skills.py Outdated
@github-actions

Copy link
Copy Markdown
Contributor

Flagged issue

python/packages/core/agent_framework/_skills.py:2379 turns script exceptions into plain string returns, which bypasses the shared tool-error contract used elsewhere in the repo (python/packages/core/agent_framework/_tools.py:1516-1520, 182, 2215-2224) and means these failures are no longer marked with function_result.exception or counted toward consecutive-error handling.


Source: automated DevFlow PR review

@giles17 giles17 marked this pull request as draft June 26, 2026 02:08
…tion

Address PR review: returning a plain error string from the skill provider
bypassed the shared tool-error contract (no exception metadata, not counted
toward consecutive-error limits), risking infinite retries.

Instead of porting the .NET provider-level IncludeDetailedErrors option,
_run_skill_script and _read_skill_resource now always log and re-raise on
failure. This delegates error handling to the function-invocation pipeline,
whose existing include_detailed_errors policy is the Python equivalent of
.NET's FunctionInvokingChatClient.IncludeDetailedErrors and correctly
preserves exception metadata and consecutive-error counting.

Validation failures (empty/unknown skill, script, or resource names) still
return user-facing error strings. Tests updated accordingly.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@giles17 giles17 changed the title Python: Add include_detailed_errors option for skill script execution Python: Stop swallowing skill script and resource errors so the model can self-correct Jun 26, 2026
@giles17 giles17 marked this pull request as ready for review June 26, 2026 21:14

@github-actions github-actions Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Automated Code Review

Reviewers: 5 | Confidence: 89% | Result: All clear

Reviewed: Correctness, Security Reliability, Test Coverage, Failure Modes, Design Approach


Automated review by giles17's agents

@giles17 giles17 enabled auto-merge June 26, 2026 22:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

python Usage: [Issues, PRs], Target: Python

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Python: Add include_detailed_errors option for skill script execution

3 participants