Skip to content

feat(qwen-agent): add instrumentation for Qwen-Agent#154

Open
sipercai wants to merge 2 commits intoalibaba:mainfrom
sipercai:liuyu/qwen-agent
Open

feat(qwen-agent): add instrumentation for Qwen-Agent#154
sipercai wants to merge 2 commits intoalibaba:mainfrom
sipercai:liuyu/qwen-agent

Conversation

@sipercai
Copy link
Copy Markdown
Collaborator

@sipercai sipercai commented Mar 24, 2026

Description

This PR adds loongsuite-instrumentation-qwen-agent: OpenTelemetry instrumentation for Qwen-Agent using ExtendedTelemetryHandler from opentelemetry-util-genai. It emits GenAI semantic-convention traces for agent runs, LLM calls, ReAct steps, and tool execution (invoke_agent, chat, react step, execute_tool), registers an opentelemetry_instrumentor entry point, and includes unit tests plus VCR-based integration tests.

Fixes #11

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

pip install -e ".[test]" under instrumentation-loongsuite/loongsuite-instrumentation-qwen-agent, then pytest on that package’s tests/
From repo root: tox -c tox-loongsuite.ini -e py312-test-loongsuite-instrumentation-qwen-agent-oldest, py312-test-loongsuite-instrumentation-qwen-agent-latest, lint-loongsuite-instrumentation-qwen-agent

  • Test A

Does This PR Require a Core Repo Change?

  • Yes. - Link to PR:
  • No.

Checklist:

See contributing.md for styleguide, changelog guidelines, and more.

  • Followed the style guidelines of this project
  • Changelogs have been updated
  • Unit tests have been added
  • Documentation has been updated

Change-Id: I5ab30c79fea5e6f2070ef182da505084092632b2
Co-developed-by: Cursor <noreply@cursor.com>
@sipercai sipercai added enhancement New feature or request instrumentaion The instrumentation label represents issues related to instrumentation. genai The genai label represents issues related to generative AI. labels Mar 24, 2026
Change-Id: I6e70c5771f05d4915546bf6c2d4b96dc50582ce7
Co-developed-by: Cursor <noreply@cursor.com>
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a new LoongSuite OpenTelemetry instrumentation package for the Qwen-Agent framework, integrating ExtendedTelemetryHandler to emit GenAI semantic-convention spans for agent runs, LLM chat calls, ReAct steps, and tool execution, along with CI/tox wiring and test coverage (unit + VCR-based integration).

Changes:

  • Introduces loongsuite-instrumentation-qwen-agent with wrappers around Agent.run, BaseChatModel.chat, Agent._call_llm, and Agent._call_tool.
  • Adds unit tests validating span creation/hierarchy and VCR-backed integration tests with recorded HTTP cassettes.
  • Wires the new package into tox-loongsuite.ini and GitHub Actions test/lint workflows.

Reviewed changes

Copilot reviewed 25 out of 25 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
tox-loongsuite.ini Adds tox envs/deps/commands for qwen-agent tests + lint.
.github/workflows/loongsuite_test_0.yml Adds CI test jobs for qwen-agent across py39–py313 (oldest/latest).
.github/workflows/loongsuite_lint_0.yml Adds CI lint job for qwen-agent package.
instrumentation-loongsuite/loongsuite-instrumentation-qwen-agent/pyproject.toml New package metadata, deps, and opentelemetry_instrumentor entry point.
instrumentation-loongsuite/loongsuite-instrumentation-qwen-agent/src/opentelemetry/instrumentation/qwen_agent/init.py Instrumentor implementation wiring wrapt wrappers + uninstrumentation.
instrumentation-loongsuite/loongsuite-instrumentation-qwen-agent/src/opentelemetry/instrumentation/qwen_agent/patch.py Core wrappers emitting spans for agent/LLM/tools/ReAct.
instrumentation-loongsuite/loongsuite-instrumentation-qwen-agent/src/opentelemetry/instrumentation/qwen_agent/utils.py Message/tool/provider conversion helpers for GenAI types.
instrumentation-loongsuite/loongsuite-instrumentation-qwen-agent/tests/* Unit tests, VCR integration tests, conftest, and VCR cassettes.
instrumentation-loongsuite/loongsuite-instrumentation-qwen-agent/README.md Usage/config docs for the new instrumentation.
instrumentation-loongsuite/loongsuite-instrumentation-qwen-agent/CHANGELOG.md Changelog entry for initial release.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

_close_active_react_step(handler)

handler.stop_invoke_agent(invocation)

Copy link

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wrap_agent_run starts the invoke_agent span but only calls stop_invoke_agent after the wrapped generator is fully exhausted. If a caller stops iterating early (breaks out, generator is GC’d, etc.), the span (and any open react_step) will remain open and context vars won’t be finalized correctly. Consider wrapping the iteration in a try/finally that always closes any active react_step and stops/fails the invoke_agent span when the generator is closed early (including GeneratorExit).

Suggested change
except GeneratorExit as e:
# Generator was closed early (e.g., consumer stopped iterating).
# Ensure any open react_step and the invoke_agent span are finalized.
_close_active_react_step(handler)
handler.fail_invoke_agent(
invocation, Error(message=str(e), type=type(e))
)
raise

Copilot uses AI. Check for mistakes.
Comment on lines +200 to +235
def _wrap_streaming_llm_response(
response_iter: Iterator, invocation: Any, handler: ExtendedTelemetryHandler
) -> Iterator:
"""Wrap a streaming LLM response iterator to capture output on completion."""
try:
last_response = None
first_token = True
for response in response_iter:
if first_token:
invocation.monotonic_first_token_s = timeit.default_timer()
first_token = False
last_response = response
yield response

if last_response:
invocation.output_messages = (
convert_qwen_messages_to_output_messages(last_response)
)
invocation.response_model_name = invocation.request_model
invocation.finish_reasons = ["stop"]

# Check for function calls
for msg in last_response:
fc = (
msg.function_call
if hasattr(msg, "function_call")
else msg.get("function_call")
if isinstance(msg, dict)
else None
)
if fc:
invocation.finish_reasons = ["tool_calls"]
break

handler.stop_llm(invocation)

Copy link

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_wrap_streaming_llm_response only calls handler.stop_llm(invocation) after the streaming iterator is fully consumed. If the user doesn’t exhaust the stream (common with streaming UIs) and the iterator is closed early, the LLM span will remain open and never be ended/failed. This should use a try/finally around the for loop (and handle GeneratorExit) to ensure stop_llm (or fail_llm) is always called.

Copilot uses AI. Check for mistakes.
Comment on lines +169 to +174
# Handle function role (tool response)
if role == "function" and content:
text = _extract_content_text(content)
parts.append(
ToolCallResponse(
id=name or "",
Copy link

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

convert_qwen_messages_to_input_messages treats tool results only when role == "function", but qwen-agent/DashScope tool messages use role == "tool" (see the VCR cassettes in this PR). As written, tool responses will be recorded as plain Text parts, losing GenAI tool response semantics. Update conversion to recognize role == "tool" (and/or both), and populate ToolCallResponse.id from the tool call id (e.g., msg.id or msg.extra.function_id) rather than the tool name.

Suggested change
# Handle function role (tool response)
if role == "function" and content:
text = _extract_content_text(content)
parts.append(
ToolCallResponse(
id=name or "",
# Handle function/tool role (tool response)
if role in ("function", "tool") and content:
text = _extract_content_text(content)
# Prefer a stable tool call id from the message when available.
tool_call_id: str = ""
# Try direct id attribute/key.
if hasattr(msg, "id"):
tool_call_id = getattr(msg, "id", "") or ""
elif isinstance(msg, dict):
tool_call_id = msg.get("id") or ""
# Fallback to extra.function_id if present.
if not tool_call_id:
extra = getattr(msg, "extra", None)
if extra is None and isinstance(msg, dict):
extra = msg.get("extra")
if extra is not None:
if isinstance(extra, dict):
tool_call_id = extra.get("function_id") or ""
else:
tool_call_id = getattr(extra, "function_id", "") or ""
# Final fallback: use the tool name (existing behavior) or empty.
if not tool_call_id:
tool_call_id = name or ""
parts.append(
ToolCallResponse(
id=tool_call_id,

Copilot uses AI. Check for mistakes.
Comment on lines +148 to +160
}
}
]
},
{
"role": "tool",
"content": "An error occurred when calling tool `get_current_weather_test`:\nAttributeError: 'str' object has no attribute 'get'\nTraceback:\n File \"/Users/sipercai/miniforge3/lib/python3.12/site-packages/qwen_agent/agent.py\", line 192, in _call_tool\n tool_result = tool.call(tool_args, **kwargs)\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n File \"/Users/sipercai/project/pyins/team-work/loongsuite-python-agent/instrumentation-loongsuite/loongsuite-instrumentation-qwen-agent/tests/test_real_api.py\", line 204, in call\n city = params.get(\"city\", \"unknown\")\n ^^^^^^^^^^\n",
"name": "get_current_weather_test",
"extra": {
"function_id": "call_355b7a9ccc024c71a743ab"
},
"id": "call_355b7a9ccc024c71a743ab"
}
Copy link

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The VCR cassette contains a full local traceback with absolute paths under a developer’s home directory (e.g., /Users/.../instrumentation-loongsuite/.../test_real_api.py) embedded in the recorded request body. This is sensitive/host-specific data and will also make the cassette brittle if the tool output changes. Re-record the cassette after making the tool return a deterministic success value (or scrub tool error content/tracebacks before recording) so no local paths/tracebacks are checked into the repo.

Copilot uses AI. Check for mistakes.
Comment on lines +553 to +567
def call(self, params, **kwargs):
if isinstance(params, str):
try:
params = json.loads(params)
except Exception:
params = {"city": params}
city = (
params.get("city", "unknown")
if isinstance(params, dict)
else "unknown"
)
return (
f"The weather in {city} is sunny and 22 degrees Celsius."
)

Copy link

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GetCurrentWeatherTool.call() now returns a normal success string for string/dict params, but the recorded VCR cassette test_qwen_agent_with_tool_call.yaml still contains a tool error payload + traceback (it was recorded when this tool raised AttributeError: 'str' object has no attribute 'get'). With VCR replay, the tool output is part of the next request body, so this change will cause cassette mismatch and flaky/failing tests. Re-record/update the cassette to match the current deterministic tool behavior (and avoid recording tracebacks/local paths).

Copilot uses AI. Check for mistakes.
pip install ./instrumentation-loongsuite/loongsuite-instrumentation-qwen-agent

# Required: GenAI utilities used by the instrumentation
pip install ./util/opentelemetry-util-genai
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be done before install loongsuite-instrumentation-qwen-agent.

license = "Apache-2.0"
requires-python = ">=3.9"
authors = [
{ name = "LoongSuite Python Agent Authors", email = "qp467389@alibaba-inc.com" },
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please fix the authors information, we need your information and opentelemetry authors' one here.

qwen_agent = "opentelemetry.instrumentation.qwen_agent:QwenAgentInstrumentor"

[project.urls]
Homepage = "https://github.com/alibaba/loongsuite-python-agent"
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Homepage = "https://github.com/alibaba/loongsuite-python-agent"
Homepage = "https://github.com/alibaba/loongsuite-python-agent/tree/main/instrumentation-loongsuite/loongsuite-instrumentation-qwen-agent"

[project]
name = "loongsuite-instrumentation-qwen-agent"
dynamic = ["version"]
description = "OpenTelemetry Qwen-Agent Instrumentation"
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
description = "OpenTelemetry Qwen-Agent Instrumentation"
description = "LoongSuite Qwen-Agent Instrumentation"


### Added

- Initial release of `loongsuite-instrumentation-qwen-agent`
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The format of the changelog file should follow the established conventions. Please refer to:
https://github.com/alibaba/loongsuite-python-agent/blob/main/instrumentation-loongsuite/loongsuite-instrumentation-dashscope/CHANGELOG.md

# See the License for the specific language governing permissions and
# limitations under the License.

__version__ = "0.1.0"
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
__version__ = "0.1.0"
__version__ = "0.4.0.dev"

return invocation


def create_agent_invocation(
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These functions could be defined as internal ones.

hasattr(agent_instance, "system_message")
and agent_instance.system_message
):
invocation.system_instruction = [
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you check whether the system message obtained here conforms to the definition of the gen_ai.system_instructions?



# Map qwen-agent model_type to provider name
_MODEL_TYPE_PROVIDER_MAP = {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please double-check the provider mapping. The provider name depends on the vendor providing model service, not the developer of the model.

try:
wrap_function_wrapper(
module=_LLM_MODULE,
name="BaseChatModel.chat",
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is an abstract class, it may lead to duplicate spans reporting in Proxy/Wrapper scenarios. Other wrapper methods should also pay attention to this issue.

You can refer to #149 (comment) for a similar comment.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request genai The genai label represents issues related to generative AI. instrumentaion The instrumentation label represents issues related to instrumentation.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[feat]Add support for qwen-agent

5 participants