Python: Fix FunctionShellTool throw and empty streaming shell command by westey-m · Pull Request #6763 · microsoft/agent-framework

westey-m · 2026-06-26T11:41:36Z

Motivation & Context

The OpenAI Responses shell tool was unusable in two ways:

Hard failure when a shell tool was attached. Invoking an agent that had a
shell tool raised ChatClientException: 'FunctionShellTool' object is not subscriptable. The client constructed the OpenAI pydantic model
FunctionShellTool for the request tools payload, but the Responses request
path expects the corresponding TypedDict param (FunctionShellToolParam)
and indexes it like a dict, so a model instance blew up. This made every shell
tool call fail (e.g. "value MSFT for me" in the build_your_own_claw sample).
Empty command on streaming shell calls. When streaming, a shell call
surfaced with an empty command — the run failed and the UI printed the
tool/parameter name with no value. The streaming parser read shell items off
the response.output_item.added event, whose action is an in-progress
skeleton with no commands yet. The command (and output) are only populated on
the completed response.output_item.done item, and there are no shell-specific
streaming delta events to fill the gap.

Description & Review Guide

What are the major changes?
- Use the TypedDict param type FunctionShellToolParam (instead of the pydantic
  model FunctionShellTool) when building the Responses tools payload — two
  call sites and the import in _chat_client.py. This resolves the
  not subscriptable throw.
- Move streaming shell-item parsing from response.output_item.added to
  response.output_item.done, where the item carries the fully-populated
  action/output. The .added case is now a no-op (pass) with an
  explanatory comment. This mirrors how the client already defers mcp_call
  results to .done.
- Extract a shared _shell_item_to_contents() helper that converts
  shell_call / local_shell_call / shell_call_output items into framework
  Content. Both the non-streaming parser and the new streaming .done handler
  call it, removing three near-duplicate parsing blocks.
- Tests: added test_prepared_local_shell_tool_survives_make_tools,
  test_parse_chunk_from_openai_shell_call_added_defers_command, and
  test_parse_chunk_from_openai_shell_call_done_emits_command; updated the two
  get_shell_tool tests for dict (param) access.
What is the impact of these changes?
- Shell tools no longer throw when attached, and streaming shell calls carry the
  actual command and output. Hosted shell, local shell, and shell output items
  are all handled. Non-streaming behavior is unchanged (it now goes through the
  shared helper). No public API changes.
What do you want reviewers to focus on?
- That deferring shell parsing to response.output_item.done is correct for
  both hosted (shell_call/shell_call_output) and local (local_shell_call)
  items, and that the shared helper preserves the previous non-streaming output.

Related Issue

Fixes #6761

Contribution Checklist

The code builds clean without any errors or warnings
All unit tests pass, and I have added new tests where possible
The PR follows the Contribution Guidelines
This PR is linked to an issue and there is no other open PR for this issue (see Related Issue above).
This is not a breaking change. If it is a breaking change, add the breaking change label (or add "[BREAKING]" to the title prefix, before or after any language prefix) — a workflow keeps the label and title prefix in sync automatically.

github-actions

Automated Code Review

Reviewers: 5 | Confidence: 88%

✓ Correctness

The PR correctly fixes two real bugs: (1) replacing the pydantic model FunctionShellTool with the TypedDict FunctionShellToolParam to fix the 'not subscriptable' error, and (2) deferring streaming shell-item parsing from response.output_item.added to response.output_item.done where the command/output is actually populated. The extracted _shell_item_to_contents helper faithfully reproduces the logic of the old inline code in both the non-streaming and streaming paths, with a minor improvement: using getattr(item, "call_id", None) or " instead of item.call_id if hasattr(item, "call_id") else ", which also handles None values. Both call sites for the helper correctly receive local_shell_tool_name from the enclosing method scope. No correctness issues found.

✓ Security Reliability

This PR correctly fixes two bugs: (1) switching from FunctionShellTool (pydantic model) to FunctionShellToolParam (TypedDict) to avoid 'not subscriptable' errors in the OpenAI SDK's _make_tools path, and (2) deferring streaming shell-item parsing from response.output_item.added to response.output_item.done where commands are actually populated. The shared _shell_item_to_contents helper faithfully consolidates the previous duplicate parsing logic. The local_shell_tool_name variable is correctly in scope at the .done handler (resolved at line 2590, used at line 3086). No security, reliability, or resource leak issues were identified.

✓ Test Coverage

The PR adds 4 new tests and updates 2 existing tests to cover the FunctionShellTool→FunctionShellToolParam fix and the streaming shell deferral. The non-streaming path has solid coverage for all three shell item types (shell_call, local_shell_call, shell_call_output) via pre-existing tests that now go through the new _shell_item_to_contents helper. However, the streaming path's new response.output_item.done handler only has a test for shell_call — there are no streaming tests for local_shell_call done events or shell_call_output done events. Additionally, the streaming done test doesn't verify additional_properties metadata.

✓ Failure Modes

The PR correctly fixes two shell tool bugs: (1) replacing the pydantic model FunctionShellTool with the TypedDict FunctionShellToolParam to avoid 'not subscriptable' errors, and (2) deferring streaming shell-item parsing from response.output_item.added to response.output_item.done where commands are actually populated. The shared _shell_item_to_contents helper is a faithful extraction of the existing logic. The local_shell_tool_name variable is properly in scope at the new call site in the done handler. No silent failure modes, lost errors, or missing cleanup were identified in the changed code.

✓ Design Approach

The shell-tool fix itself looks consistent with the surrounding parsing and tests. The only design concern I found is that this PR also carries a broad unrelated uv.lock refresh/downgrade, which does not match the stated shell-focused scope and goes against the repo’s documented guidance to keep runtime dependency lock updates targeted.

Suggestions

Split the unrelated python/uv.lock dependency refresh out of this shell-tool fix, or regenerate only the specific dependency update that is actually required. The repo guidance says targeted runtime dependency changes should use uv lock --upgrade-package <dependency-name> and that full lock refreshes are for repo-wide dev tooling updates (python/DEV_SETUP.md:236-238,395-399; python/.github/skills/python-package-management/SKILL.md:42-43,70-72).

Automated review by westey-m's agents

github-actions · 2026-06-26T11:50:20Z

Python Test Coverage Report •

File	Stmts	Miss	Cover	Missing
packages/openai/agent_framework_openai
_chat_client.py	1235	127	89%	279, 292, 642–646, 654–657, 663–667, 717–724, 726–728, 735–737, 795, 803, 826, 944, 1043, 1102, 1104, 1106, 1108, 1174, 1188, 1268, 1278, 1283, 1326, 1437–1438, 1453, 1680, 1685, 1689–1691, 1695–1696, 1779, 1789, 1816, 1822, 1832, 1838, 1843, 1849, 1854–1855, 1935, 1979, 1982–1985, 1999, 2001, 2009–2010, 2022, 2064, 2129, 2146, 2149, 2176–2178, 2217, 2234, 2237, 2298–2299, 2334, 2372–2373, 2391–2392, 2435, 2550–2551, 2569, 2652–2660, 2690, 2800, 2835, 2850, 2899–2902, 2912–2914, 2925–2927, 2941–2943, 2953–2954, 2960, 2975
TOTAL	42539	5056	88%

Python Unit Test Overview

Tests	Skipped	Failures	Errors	Time
8345	37 💤	0 ❌	0 🔥	2m 12s ⏱️

Copilot

Pull request overview

Fixes the OpenAI Responses shell tool integration in the Python OpenAI chat client by ensuring shell tools are built using the SDK’s dict/param types (avoiding “not subscriptable” failures) and by deferring streaming shell parsing until the completed response.output_item.done events so commands aren’t emitted empty.

Changes:

Switch shell tool construction from the Pydantic model to FunctionShellToolParam when preparing the OpenAI Responses tools payload.
Refactor shell output-item parsing into a shared helper and move streaming shell parsing from response.output_item.added to response.output_item.done.
Add/adjust unit tests for shell tool preparation and streaming shell-call parsing behavior.

Reviewed changes

Copilot reviewed 2 out of 3 changed files in this pull request and generated 1 comment.

File	Description
python/uv.lock	Updates the Python lockfile, including dependency specifier alignment and resolved-version changes.
python/packages/openai/agent_framework_openai/_chat_client.py	Fixes shell tool payload typing and corrects streaming parsing to emit shell command/output only once items are complete.
python/packages/openai/tests/openai/test_openai_chat_client.py	Updates shell tool tests for dict access and adds regressions for the streaming `.added` vs `.done` shell-call behavior.

github-actions

Automated Code Review

Reviewers: 5 | Confidence: 92% | Result: All clear

Reviewed: Correctness, Security Reliability, Test Coverage, Failure Modes, Design Approach

Automated review by westey-m's agents

Fix shell tool bug

739c4e3

Copilot AI review requested due to automatic review settings June 26, 2026 11:41

moonbox3 added the python Usage: [Issues, PRs], Target: Python label Jun 26, 2026

Copilot started reviewing on behalf of westey-m June 26, 2026 11:45 View session

github-actions Bot reviewed Jun 26, 2026

View reviewed changes

Copilot AI reviewed Jun 26, 2026

View reviewed changes

Comment thread python/packages/openai/tests/openai/test_openai_chat_client.py

Address PR feedback

1b896f4

westey-m marked this pull request as ready for review June 26, 2026 13:11

github-actions Bot reviewed Jun 26, 2026

View reviewed changes

eavanvalkenburg approved these changes Jun 26, 2026

View reviewed changes

westey-m enabled auto-merge June 26, 2026 14:15

alliscode approved these changes Jun 26, 2026

View reviewed changes

westey-m added this pull request to the merge queue Jun 26, 2026

github-merge-queue Bot removed this pull request from the merge queue due to failed status checks Jun 26, 2026

westey-m added this pull request to the merge queue Jun 26, 2026

westey-m removed this pull request from the merge queue due to a manual request Jun 26, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Python: Fix FunctionShellTool throw and empty streaming shell command#6763

Python: Fix FunctionShellTool throw and empty streaming shell command#6763
westey-m wants to merge 2 commits into
microsoft:mainfrom
westey-m:python-shelltool-bugfix

westey-m commented Jun 26, 2026

Uh oh!

github-actions Bot left a comment

Uh oh!

github-actions Bot commented Jun 26, 2026 •

edited

Loading

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

github-actions Bot left a comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

Uh oh!

Conversation

westey-m commented Jun 26, 2026

Motivation & Context

Description & Review Guide

Related Issue

Contribution Checklist

Uh oh!

github-actions Bot left a comment

Choose a reason for hiding this comment

Automated Code Review

✓ Correctness

✓ Security Reliability

✓ Test Coverage

✓ Failure Modes

✓ Design Approach

Suggestions

Uh oh!

github-actions Bot commented Jun 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Python Unit Test Overview

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

github-actions Bot left a comment

Choose a reason for hiding this comment

Automated Code Review

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

github-actions Bot commented Jun 26, 2026 •

edited

Loading