Revert "feat: auto-detect and mutually exclude native vs third-party agentic instrumentors (#729)"#750
Conversation
…agentic instrumentors (aws-observability#729)" This reverts commit 02ada61.
There was a problem hiding this comment.
Code Review Summary
This PR reverts the auto-detection/mutual-exclusion logic from #729 and replaces it with a simpler two-env-var approach (AGENT_OBSERVABILITY_ENABLED for legacy v1, AWS_AGENTIC_OBSERVABILITY_OPT_IN for v2). The overall design is cleaner and easier to reason about.
Key findings (see inline comments):
-
Behavior change - missing disabled instrumentations:
http,urllib3, andrequestsare no longer in the disabled list, which will re-enable noisy HTTP spans for existing legacy users. -
Behavior change - CREWAI_DISABLE_TELEMETRY: Only set in the v2 path, not legacy. Existing users may get duplicate CrewAI traces.
-
Typo: "brokens" in a comment.
-
.gitignore:
.coverage.*removal seems unrelated to the feature revert.
The refactoring of _configure_common_agent_observability() to share common setup is a good pattern. Tests are comprehensive and cover the new env var combinations well.
| "openai_agents": "aws_openai_agents", | ||
| } | ||
| AGENT_OBSERVABILITY_DISABLED_INSTRUMENTATIONS = ( | ||
| "sqlalchemy,psycopg2,pymysql,sqlite3,aiopg,asyncpg,mysql_connector," |
There was a problem hiding this comment.
Bug/Behavior change: This list no longer includes http, urllib3, and requests, which were present in the old disabled list for the legacy path. Existing users on AGENT_OBSERVABILITY_ENABLED will suddenly get HTTP-level instrumentation re-enabled, producing potentially noisy low-level spans that were previously suppressed. If intentional, this should be documented as a breaking change. If not, consider adding http,urllib3,requests back. The same applies to AWS_AGENTIC_OBSERVABILITY_DISABLED_INSTRUMENTATIONS below.
| self._configure_common_agent_observability(AWS_AGENTIC_OBSERVABILITY_DISABLED_INSTRUMENTATIONS) | ||
| os.environ.setdefault(OTEL_METRICS_EXPORTER, "otlp") | ||
| os.environ.setdefault("CREWAI_DISABLE_TELEMETRY", "true") | ||
| elif is_agent_observability_enabled(): |
There was a problem hiding this comment.
Behavior change: CREWAI_DISABLE_TELEMETRY is only set in the is_aws_agentic_observability_opt_in() branch but NOT in the legacy is_agent_observability_enabled() branch below. The old code set this unconditionally when agent observability was enabled. Existing legacy-path users will now get CrewAI's built-in telemetry re-enabled, which could cause duplicate/conflicting traces. Consider adding it to _configure_common_agent_observability() or to the legacy branch explicitly.
| # Some third-party SDKs register the same entry point name as the upstream | ||
| # OTel packages that we depend on. For Agentic Observability legacy mode, skip our bundled | ||
| # OTel instrumentation so that existing third-party setups are not brokens. | ||
| if ( |
There was a problem hiding this comment.
Typo: "are not brokens" should be "are not broken".
| coverage.xml | ||
| .coverage | ||
| .coverage.* | ||
| .nox |
There was a problem hiding this comment.
Removing .coverage.* from .gitignore means coverage data files from parallel test runs (e.g., .coverage.hostname.12345) will now appear as untracked files. This was likely added in the same commit being reverted but appears unrelated to the agentic instrumentation feature itself. Consider keeping this entry.
18b741f to
fcd39d1
Compare
| from amazon.opentelemetry.distro.aws_opentelemetry_distro import ( | ||
| AGENT_OBSERVABILITY_DISABLED_INSTRUMENTATIONS, | ||
| AwsOpenTelemetryDistro, | ||
| ) |
There was a problem hiding this comment.
Bug (test isolation): CREWAI_DISABLE_TELEMETRY was removed from the env_vars_to_check list, but the production code still sets it in the is_aws_agentic_observability_opt_in() branch. If test_configure_ai_observability_opt_in runs, this env var will be set but never cleaned up in tearDown, potentially polluting subsequent tests. Suggested fix: add CREWAI_DISABLE_TELEMETRY back to the cleanup list.
| _logger.info("AWS Agentic Observability enabled.") | ||
| self._configure_common_agent_observability(AWS_AGENTIC_OBSERVABILITY_DISABLED_INSTRUMENTATIONS) | ||
| os.environ.setdefault(OTEL_METRICS_EXPORTER, "otlp") | ||
| os.environ.setdefault("CREWAI_DISABLE_TELEMETRY", "true") |
There was a problem hiding this comment.
Behavioral subtlety: When both AWS_AGENTIC_OBSERVABILITY_OPT_IN=true AND AGENT_OBSERVABILITY_ENABLED=true are set simultaneously, this elif means only the new path executes -- no region-based endpoint auto-configuration, no awsemf metrics exporter. Meanwhile, is_agentic_observability_enabled() (used in the configurator) returns True for both, enabling LLO handling, unsampled span export, etc. This may be intentional precedence, but the behavior is surprising for a user migrating incrementally (setting the new var while keeping the old). Consider documenting this precedence explicitly, or logging a warning when both are set.
| mock_is_installed.return_value = False | ||
|
|
||
| AwsOpenTelemetryDistro()._configure() | ||
|
|
There was a problem hiding this comment.
Missing test assertion: This test verifies the new AWS_AGENTIC_OBSERVABILITY_OPT_IN path but does not assert that CREWAI_DISABLE_TELEMETRY is set to true, even though the production code explicitly sets it (line 193 in the source). Adding this assertion would catch regressions if the line is accidentally removed.
| """ | ||
| if is_agent_observability_enabled() and self._should_skip_instrumentor(entry_point): | ||
| if self._should_skip_instrumentor(entry_point): | ||
| return |
There was a problem hiding this comment.
Nit (inconsistency): The method is called as self._configure_common_agent_observability(...) but is a @staticmethod. While valid in Python, consider making this a regular method (drop @staticmethod) for consistency with how it is invoked, or call it as AwsOpenTelemetryDistro._configure_common_agent_observability(...).
| @@ -300,11 +305,58 @@ def test_configure_agent_observability_v1( | |||
| self.assertEqual(os.environ.get(OTEL_PYTHON_LOGGING_AUTO_INSTRUMENTATION_ENABLED), "true") | |||
| self.assertEqual(os.environ.get(APPLICATION_SIGNALS_ENABLED_CONFIG), "false") | |||
There was a problem hiding this comment.
Missing test case: The _should_skip_instrumentor logic explicitly checks is_agent_observability_enabled() and not is_aws_agentic_observability_opt_in(). There is no test case for the scenario where BOTH env vars are true simultaneously. In that case, the skip is bypassed (because of the not is_aws_agentic_observability_opt_in() guard), and the opentelemetry-instrumentation-openai-agents-v2 package will be loaded. Adding a case like {"env": {"AGENT_OBSERVABILITY_ENABLED": "true", "AWS_AGENTIC_OBSERVABILITY_OPT_IN": "true"}, "entry_name": "openai_agents", "dist_name": "opentelemetry-instrumentation-openai-agents-v2", "should_load": True} would verify this intended behavior.
This reverts commit 02ada61 (#729)
Issue #, if available:
Description of changes:
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.