feat: add support for openai agents#161
feat: add support for openai agents#161minimAluminiumalism wants to merge 5 commits intoalibaba:mainfrom
Conversation
There was a problem hiding this comment.
Pull request overview
Adds a new LoongSuite/OpenTelemetry instrumentation package for the OpenAI Agents SDK by registering a custom TracingProcessor that converts SDK trace/span callbacks into OTel spans using GenAI semantic conventions.
Changes:
- Introduces
OTelTracingProcessorto translate OpenAI Agents SDK spans/traces into OpenTelemetry spans. - Adds an
OpenAIAgentsInstrumentorthat registers/unregisters the processor via the SDK tracing API. - Adds packaging/docs/tests scaffolding for the new instrumentation distribution.
Reviewed changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| instrumentation-loongsuite/loongsuite-instrumentation-openai-agents/src/opentelemetry/instrumentation/openai_agents/_processor.py | Implements the SDK TracingProcessor bridge and span translation logic. |
| instrumentation-loongsuite/loongsuite-instrumentation-openai-agents/src/opentelemetry/instrumentation/openai_agents/init.py | Implements the instrumentor that registers/unregisters the tracing processor. |
| instrumentation-loongsuite/loongsuite-instrumentation-openai-agents/src/opentelemetry/instrumentation/openai_agents/package.py | Declares instrumented dependency metadata for the distribution. |
| instrumentation-loongsuite/loongsuite-instrumentation-openai-agents/src/opentelemetry/instrumentation/openai_agents/version.py | Adds package version for hatch dynamic versioning. |
| instrumentation-loongsuite/loongsuite-instrumentation-openai-agents/tests/test_processor.py | Unit tests validating span creation/attributes for supported SDK span types. |
| instrumentation-loongsuite/loongsuite-instrumentation-openai-agents/tests/test_instrumentor.py | Tests instrumentor lifecycle and processor registration. |
| instrumentation-loongsuite/loongsuite-instrumentation-openai-agents/tests/conftest.py | Pytest fixtures + env configuration for GenAI semconv/content capture. |
| instrumentation-loongsuite/loongsuite-instrumentation-openai-agents/tests/init.py | Marks tests as a package (license header only). |
| instrumentation-loongsuite/loongsuite-instrumentation-openai-agents/tests/requirements.oldest.txt | Pins oldest supported openai-agents version for test matrix. |
| instrumentation-loongsuite/loongsuite-instrumentation-openai-agents/tests/requirements.latest.txt | Uses latest openai-agents for test matrix. |
| instrumentation-loongsuite/loongsuite-instrumentation-openai-agents/README.rst | Adds end-user installation/usage documentation. |
| instrumentation-loongsuite/loongsuite-instrumentation-openai-agents/pyproject.toml | Adds build metadata, deps, and entry point registration. |
| instrumentation-loongsuite/loongsuite-instrumentation-openai-agents/CHANGELOG.md | Introduces initial changelog entry for the new package. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| otel_span.set_status(StatusCode.ERROR, span.error["message"]) | ||
| otel_span.set_attribute("error.type", span.error["message"]) |
There was a problem hiding this comment.
error.type is being set to the error message. In this repo, error.type is used for an exception/classification (e.g., type(e).__name__) and the message is stored separately (often error.message). Consider setting error.type to a stable type (or a default like Exception) and adding error.message (and ideally record_exception(...)) so error spans are queryable and consistent across instrumentations.
| otel_span.set_status(StatusCode.ERROR, span.error["message"]) | |
| otel_span.set_attribute("error.type", span.error["message"]) | |
| message = span.error.get("message") | |
| error_type = span.error.get("type") or "Exception" | |
| if message: | |
| otel_span.set_status(StatusCode.ERROR, message) | |
| otel_span.set_attribute("error.message", message) | |
| else: | |
| otel_span.set_status(StatusCode.ERROR) | |
| otel_span.set_attribute("error.type", error_type) |
| otel_span, ctx_token = entry | ||
| otel_span.end() | ||
| if ctx_token is not None: | ||
| otel_context.detach(ctx_token) | ||
|
|
There was a problem hiding this comment.
otel_context.detach(ctx_token) can raise (and also emits noisy ERROR logs) in cross-thread/async or out-of-order detach scenarios. The repo already provides opentelemetry.util.genai.handler._safe_detach() to avoid these issues; consider using that here so detach failures don’t leak context or spam logs.
| otel_span.end() | ||
| if ctx_token is not None: | ||
| otel_context.detach(ctx_token) | ||
|
|
There was a problem hiding this comment.
Same as trace detach: prefer opentelemetry.util.genai.handler._safe_detach(ctx_token) over otel_context.detach(ctx_token) to avoid noisy ERROR logs and context leaks if detach is called in an async/cross-thread/out-of-order scenario.
|
|
||
| @_dont_throw | ||
| def shutdown(self) -> None: |
There was a problem hiding this comment.
shutdown() currently clears _span_map/_trace_map without detaching any stored context tokens. If shutdown happens while spans/traces are still active, this can permanently leak the attached context (and you lose the tokens needed to detach). Consider iterating the maps and safely detaching tokens (and optionally ending any still-open spans) before clearing.
| @_dont_throw | |
| def shutdown(self) -> None: | |
| def _detach_all_context_tokens(self) -> None: | |
| """Detach all context tokens stored in internal maps. | |
| This is intended to be called during shutdown to avoid leaking | |
| contexts that were previously attached via otel_context.attach(). | |
| """ | |
| # Detach tokens associated with spans | |
| for token in list(self._span_map.values()): | |
| if token is None: | |
| continue | |
| try: | |
| otel_context.detach(token) | |
| except Exception: | |
| logger.debug( | |
| "Error detaching span context token during shutdown", | |
| exc_info=True, | |
| ) | |
| # Detach tokens associated with traces | |
| for token in list(self._trace_map.values()): | |
| if token is None: | |
| continue | |
| try: | |
| otel_context.detach(token) | |
| except Exception: | |
| logger.debug( | |
| "Error detaching trace context token during shutdown", | |
| exc_info=True, | |
| ) | |
| @_dont_throw | |
| def shutdown(self) -> None: | |
| # Safely detach any remaining context tokens before clearing maps | |
| self._detach_all_context_tokens() |
| try: | ||
| from agents.tracing.setup import ( # noqa: PLC0415 | ||
| get_trace_provider, | ||
| ) | ||
|
|
||
| provider = get_trace_provider() | ||
| if hasattr(provider, "_multi_processor"): | ||
| mp = provider._multi_processor | ||
| if hasattr(mp, "_processors"): | ||
| procs = mp._processors | ||
| if processor in procs: | ||
| procs.remove(processor) | ||
| except Exception as e: | ||
| logger.debug("Failed to remove processor: %s", e) | ||
|
|
There was a problem hiding this comment.
_uninstrument() removes the processor by reaching into private SDK internals (provider._multi_processor._processors). This is brittle across SDK versions and may silently fail (leaving the processor registered and causing duplicate spans on re-instrumentation). If the Agents SDK provides a public removal API, prefer that; otherwise consider making the failure mode explicit (e.g., warn once) and/or add defensive checks around processor duplication on _instrument().
| try: | |
| from agents.tracing.setup import ( # noqa: PLC0415 | |
| get_trace_provider, | |
| ) | |
| provider = get_trace_provider() | |
| if hasattr(provider, "_multi_processor"): | |
| mp = provider._multi_processor | |
| if hasattr(mp, "_processors"): | |
| procs = mp._processors | |
| if processor in procs: | |
| procs.remove(processor) | |
| except Exception as e: | |
| logger.debug("Failed to remove processor: %s", e) | |
| removed = False | |
| # Prefer a public removal API from the Agents SDK if available. | |
| try: | |
| from agents.tracing import ( # type: ignore[attr-defined] # noqa: PLC0415 | |
| remove_trace_processor, | |
| ) | |
| except Exception: # ImportError or absence of public API | |
| remove_trace_processor = None # type: ignore[assignment] | |
| if remove_trace_processor is not None: # type: ignore[truthy-function] | |
| try: | |
| remove_trace_processor(processor) # type: ignore[call-arg] | |
| removed = True | |
| except Exception as e: | |
| logger.warning( | |
| "Failed to remove OpenAI Agents tracing processor via " | |
| "public API: %s. Falling back to SDK internals.", | |
| e, | |
| ) | |
| # Fallback: best-effort removal via SDK internals, with explicit warnings | |
| if not removed: | |
| try: | |
| from agents.tracing.setup import ( # noqa: PLC0415 | |
| get_trace_provider, | |
| ) | |
| provider = get_trace_provider() | |
| if hasattr(provider, "_multi_processor"): | |
| mp = provider._multi_processor | |
| if hasattr(mp, "_processors"): | |
| procs = mp._processors | |
| if processor in procs: | |
| procs.remove(processor) | |
| removed = True | |
| if not removed: | |
| logger.warning( | |
| "OpenAI Agents tracing processor could not be " | |
| "located in the trace provider; it may still be " | |
| "registered, which can lead to duplicate spans on " | |
| "re-instrumentation." | |
| ) | |
| except Exception as e: | |
| logger.warning( | |
| "Failed to remove OpenAI Agents tracing processor via " | |
| "trace provider internals; the processor may still be " | |
| "registered, which can lead to duplicate spans on " | |
| "re-instrumentation: %s", | |
| e, | |
| ) |
There was a problem hiding this comment.
The OpenAI Agents SDK does not expose a public API to remove a trace processor. add_trace_processor exists but there is no corresponding remove_trace_processor.
The private attribute fallback with hasattr guards is the only viable approach.
| def _create_response_span( | ||
| self, | ||
| data: ResponseSpanData, | ||
| parent_ctx: Any | None, | ||
| ) -> OTelSpan: | ||
| model_name = "unknown" | ||
| if data.response and hasattr(data.response, "model"): | ||
| model_name = data.response.model or "unknown" | ||
| span_name = f"chat {model_name}" | ||
| span = self._handler._tracer.start_span( | ||
| name=span_name, | ||
| kind=SpanKind.CLIENT, | ||
| context=parent_ctx, | ||
| ) | ||
| span.set_attribute(GenAI.GEN_AI_OPERATION_NAME, "chat") | ||
| span.set_attribute(GenAI.GEN_AI_SYSTEM, "openai") | ||
| if model_name != "unknown": | ||
| span.set_attribute(GenAI.GEN_AI_REQUEST_MODEL, model_name) | ||
| return span | ||
|
|
There was a problem hiding this comment.
ResponseSpanData is handled in _create_span_for() / _create_response_span() / _apply_response_end(), but the test suite doesn’t currently exercise that branch (tests cover GenerationSpanData but not ResponseSpanData). Adding a focused unit test for ResponseSpanData would prevent regressions around model/usage/id extraction and message capture behavior.
| readme = "README.rst" | ||
| license = "Apache-2.0" | ||
| requires-python = ">=3.10" | ||
| authors = [ |
There was a problem hiding this comment.
Could you please add your information here?
| # See the License for the specific language governing permissions and | ||
| # limitations under the License. | ||
|
|
||
| __version__ = "0.3.0.dev" |
There was a problem hiding this comment.
| __version__ = "0.3.0.dev" | |
| __version__ = "0.4.0.dev" |
| # See the License for the specific language governing permissions and | ||
| # limitations under the License. | ||
|
|
||
| """Tests for the OpenAIAgentsInstrumentor lifecycle.""" |
There was a problem hiding this comment.
Please add these tests into tox-loongsuite.ini and run tox -e generate-workflows.
| @@ -0,0 +1 @@ | |||
| openai-agents | |||
There was a problem hiding this comment.
We need license header in this file. Just refer to
.| @@ -0,0 +1 @@ | |||
| openai-agents==0.0.7 | |||
There was a problem hiding this comment.
We need license header in this file. Just refer to
.| "OTEL_SEMCONV_STABILITY_OPT_IN", "gen_ai_latest_experimental" | ||
| ) | ||
| os.environ.setdefault( | ||
| "OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT", "true" |
There was a problem hiding this comment.
OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT should be set as SPAN_ONLY.
| @@ -0,0 +1,49 @@ | |||
| OpenTelemetry OpenAI Agents SDK Instrumentation | |||
There was a problem hiding this comment.
Please add loongsuite-instrumentation-openai-agents into README.md and README-zh.md at the root path of this project.
| @@ -0,0 +1,5 @@ | |||
| # Changelog | |||
There was a problem hiding this comment.
The format of the changelog file should follow the established conventions. Please refer to:
https://github.com/alibaba/loongsuite-python-agent/blob/main/instrumentation-loongsuite/loongsuite-instrumentation-dashscope/CHANGELOG.md
527e2cf to
d73a674
Compare
Description
Add OpenTelemetry instrumentation for the OpenAI Agents SDK, addressing #47.
The OpenAI Agents SDK ships with a built-in
TracingProcessorcallback interface that fires on every agent run, LLM call, and tool execution, etc. Instead of monkey-patching, this instrumentation registers a custom OTelTracingProcessor viaadd_trace_processor()that translates SDK spans into OTel spans following the GenAI semconv.Implementation ref https://github.com/traceloop/openllmetry/tree/main/packages/opentelemetry-instrumentation-openai-agents
Type of change
Please delete options that are not relevant.
How Has This Been Tested?
Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration
Does This PR Require a Core Repo Change?
Checklist:
See contributing.md for styleguide, changelog guidelines, and more.