-
Notifications
You must be signed in to change notification settings - Fork 33
fix(otlp-aws-exporter): avoid RecursionError with pip_system_certs on… #749
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from all commits
6dc6b07
20e2462
492e834
cccf8de
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -2,12 +2,15 @@ | |
| # SPDX-License-Identifier: Apache-2.0 | ||
|
|
||
| import logging | ||
| from threading import Lock | ||
|
|
||
| import requests | ||
| from botocore.auth import SigV4Auth | ||
| from botocore.awsrequest import AWSRequest | ||
| from botocore.session import Session | ||
|
|
||
| from amazon.opentelemetry.distro.patches._pip_system_certs_patches import apply_pip_system_certs_compatibility_patch | ||
|
|
||
| _logger = logging.getLogger(__name__) | ||
|
|
||
|
|
||
|
|
@@ -39,13 +42,70 @@ def __init__(self, aws_region: str, service: str, session: Session): | |
| self._service: str = service | ||
| self._session: Session = session | ||
|
|
||
| # Cached credentials are resolved on the first ``request()`` call. The returned | ||
| # ``Credentials`` / ``RefreshableCredentials`` object handles its own expiry and | ||
| # rotation when its attributes are accessed, so caching the reference does not | ||
| # cache the underlying credential values. | ||
| self._credentials = None | ||
| self._credentials_resolved = False | ||
| self._credentials_lock = Lock() | ||
|
|
||
| super().__init__() | ||
|
|
||
| def _ensure_initialized(self) -> None: | ||
| """Apply one-time, deferred initialization on the first ``request()`` call. | ||
|
|
||
| This runs after sitecustomize has fully completed (i.e., after any ``.pth`` | ||
| based ``ssl.SSLContext`` injection from packages such as ``pip_system_certs``), | ||
| which is the only point at which we can safely re-align stale ``SSLContext`` | ||
| references captured by ``botocore`` / ``urllib3`` during ADOT startup. | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Nit: Double-checked locking and free-threaded Python The double-checked locking pattern is correct for CPython due to the GIL, but self._credentials_resolved is read without the lock on the fast path (two lines above). In CPython, reading/writing a bool is atomic due to the GIL, so this is safe in practice. However, if this code ever runs on a Python implementation without a GIL (e.g., the experimental free-threaded Python 3.13+), this would need a memory barrier or atomic. Fine for now, but worth a brief comment noting the GIL dependency for future maintainers. |
||
|
|
||
| Credentials are also resolved once here. ``RefreshableCredentials`` handles | ||
| rotation internally on attribute access, so caching the reference is safe. | ||
|
|
||
| On a transient credential resolution failure (e.g., IMDS timeout), the | ||
| ``_credentials_resolved`` flag is left ``False`` so the next ``request()`` call | ||
| will retry. Only a successful resolution latches the flag, matching the | ||
| original "retry every request" behavior for the failure path while keeping | ||
| the SSL-context-construction cost amortized to once on the success path. | ||
|
|
||
| Note: the read of ``_credentials_resolved`` outside the lock is safe because | ||
| Python's GIL makes attribute reads/writes atomic. On free-threaded Python | ||
| builds (3.13t+) this would need a memory barrier; revisit if/when we | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Correctness concern: double-checked locking on free-threaded Python 3.13t+. The docstring acknowledges this, which is good. However Python 3.13 is already in pyproject.toml classifiers. If a user runs on a free-threaded 3.13t build, the unsynchronized read of self._credentials_resolved (here, outside the lock) and self._credentials in request() could produce torn reads or stale values. Consider always acquiring the lock (it is uncontended after the first successful call, so performance cost is negligible on CPython) or file a tracking issue for free-threaded build support. |
||
| support those. | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Bug (thread-safety of
This is a narrow window but possible when multiple threads share one |
||
| """ | ||
| if self._credentials_resolved: | ||
| return | ||
|
|
||
| with self._credentials_lock: | ||
| if self._credentials_resolved: | ||
| return | ||
|
|
||
| # Realign stale ssl.SSLContext references in botocore / urllib3 before | ||
| # the first credential resolution constructs an SSL context. This is a | ||
| # no-op when pip_system_certs is not installed. | ||
| try: | ||
| apply_pip_system_certs_compatibility_patch() | ||
| except Exception as patch_error: # pylint: disable=broad-except | ||
| _logger.warning("Failed to apply pip_system_certs compatibility patch: %s", patch_error) | ||
|
|
||
| try: | ||
| self._credentials = self._session.get_credentials() | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Note (double-checked locking ordering dependency): For correctness of the double-checked locking pattern, the ordering matters: However, if this code is ever refactored (e.g., the assignment order is swapped, or a future maintainer moves code around), the outer reader could see |
||
| except Exception as cred_error: # pylint: disable=broad-except | ||
| # Don't latch _credentials_resolved on failure - leave it False so | ||
| # the next request retries credential resolution. This preserves | ||
| # self-healing behavior on transient errors (e.g., IMDS timeouts). | ||
| _logger.error("Failed to load AWS Credentials: %s", cred_error) | ||
| self._credentials = None | ||
| return | ||
|
|
||
| self._credentials_resolved = True | ||
|
|
||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Bug: Permanent credential failure on transient errors If get_credentials() raises (e.g., IMDS timeout, transient network blip), self._credentials stays None and _credentials_resolved is set to True unconditionally here. All subsequent request() calls will skip signing forever for the lifetime of this session/exporter -- there is no retry path. This is a regression vs. the old behavior where get_credentials() was called on every request, meaning a transient failure on one export cycle would self-heal on the next. Consider either:
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @viw-test1 can you address this? |
||
| def request(self, method, url, *args, data=None, headers=None, **kwargs): | ||
| credentials = self._session.get_credentials() | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Bug: Double-logging on credential failure. When get_credentials() raises inside _ensure_initialized(), the error is already logged with the exception detail via _logger.error. Then here in request(), this else branch fires again and logs 'Failed to load AWS Credentials' without the exception. This produces duplicate error logs per failed request. Consider removing this else log or downgrading to _logger.debug. |
||
| self._ensure_initialized() | ||
|
|
||
| if credentials: | ||
| signer = SigV4Auth(credentials, self._service, self._aws_region) | ||
| if self._credentials: | ||
| signer = SigV4Auth(self._credentials, self._service, self._aws_region) | ||
| request = AWSRequest( | ||
| method="POST", | ||
| url=url, | ||
|
|
@@ -64,6 +124,6 @@ def request(self, method, url, *args, data=None, headers=None, **kwargs): | |
| except Exception as signing_error: # pylint: disable=broad-except | ||
| _logger.error("Failed to sign request: %s", signing_error) | ||
| else: | ||
| _logger.error("Failed to load AWS Credentials: %s") | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Bug: Double-logging on credential failure. When get_credentials() raises inside _ensure_initialized(), the error is already logged with the exception detail via _logger.error. Then here in request(), this else branch fires again and logs 'Failed to load AWS Credentials' without the exception. This produces duplicate error logs per failed request. Consider removing this else log or downgrading to _logger.debug. |
||
| _logger.error("Failed to load AWS Credentials") | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Good fix. The original _logger.error with a %s format specifier but no corresponding argument would produce a malformed log message (or raise in strict logging configurations). Removing the orphaned %s is correct. |
||
|
|
||
| return super().request(method=method, url=url, *args, data=data, headers=headers, **kwargs) | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,100 @@ | ||
| # Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved. | ||
| # SPDX-License-Identifier: Apache-2.0 | ||
|
|
||
| from importlib.metadata import PackageNotFoundError, version | ||
| from logging import Logger, getLogger | ||
|
|
||
| _logger: Logger = getLogger(__name__) | ||
|
|
||
| # Module-level guard so the patch is applied at most once per process. | ||
| # The plain bool is intentional: the patch body itself is idempotent | ||
| # (re-running it produces the same final state), so a benign race between two | ||
| # threads where both observe ``_patch_applied is False`` and both run the rebind | ||
| # costs an extra dict assignment and nothing more. We don't pay for a lock here. | ||
| _patch_applied = False | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thread-safety: module-level _patch_applied guard is not atomic The _patch_applied flag is a plain boolean checked and set without a lock. While AwsAuthSession._ensure_initialized() has its own lock, apply_pip_system_certs_compatibility_patch() is a public function that could be called from multiple threads concurrently (e.g., if multiple exporters are initialized in parallel). Two threads could both read _patch_applied = False before either sets it to True, causing the patch to run twice. This is low-severity since the patch itself is idempotent, but it contradicts the stated guarantee (applied at most once per process). Consider using a threading.Lock here as well, or document that idempotency makes the race benign.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Bug (race condition consideration for free-threaded Python): The comment in Even under the GIL, if two |
||
|
|
||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thread safety: module-level _patch_applied guard is racy. The _patch_applied boolean is read/written without synchronization. Two threads calling apply_pip_system_certs_compatibility_patch() concurrently could both see _patch_applied = False and both execute the patch body. The patch itself is idempotent (rebinding an attribute twice is harmless), so this is not a correctness bug, but it is inconsistent with the careful double-checked locking used in _ensure_initialized(). Since the caller (_ensure_initialized) already holds _credentials_lock before calling this, the function is effectively single-threaded today. Worth a brief comment documenting that caller-side serialization assumption so future callers outside the lock do not introduce a real race.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thread-safety note on the module-level guard: The comment correctly states a benign race is acceptable because the patch body is idempotent. Since the caller (_ensure_initialized) already serializes via its own per-instance lock, this race can only happen if apply_pip_system_certs_compatibility_patch is called from a different call site in the future. Worth documenting the assumption that the caller currently provides serialization. |
||
|
|
||
| def _is_pip_system_certs_installed() -> bool: | ||
| """Is the pip_system_certs package installed?""" | ||
| try: | ||
| dist_version = version("pip_system_certs") | ||
| _logger.debug("pip_system_certs is installed: %s", dist_version) | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Code quality (minor): |
||
| return True | ||
| except PackageNotFoundError as exc: | ||
| _logger.debug("pip_system_certs is not installed. %s", exc) | ||
| return False | ||
|
|
||
|
|
||
| def apply_pip_system_certs_compatibility_patch() -> None: | ||
| """Re-bind stale ``ssl.SSLContext`` references in botocore/urllib3. | ||
|
|
||
| When ``pip_system_certs`` is installed, it injects ``truststore.SSLContext`` as the | ||
| process-wide ``ssl.SSLContext`` via a ``.pth`` file. The injection runs in the | ||
| ``finally`` block of a ``site.execsitecustomize`` wrapper, i.e. *after* | ||
| ``sitecustomize.py`` returns. | ||
|
|
||
| OpenTelemetry's auto-instrumentation entry point (``opentelemetry-instrument``) | ||
| runs from ``sitecustomize.py``, which loads the ADOT distro and transitively imports | ||
| ``requests`` (via the upstream OTLP HTTP exporters) and ``botocore``. Both of those | ||
| modules capture a reference to ``ssl.SSLContext`` at import time. Because the import | ||
| happens before ``pip_system_certs``'s injection runs, the captured reference is the | ||
| original C-level ``ssl.SSLContext``, not the truststore-wrapped class. | ||
|
|
||
| On Python 3.12, ``ssl.SSLContext.options.__set__`` is implemented as | ||
| ``super(SSLContext, SSLContext).options.__set__(self, value)`` where ``SSLContext`` | ||
| is resolved from ``ssl``'s module globals at call time. After ``pip_system_certs`` | ||
| runs, that name resolves to ``truststore.SSLContext``, and the ``super()`` chain | ||
| bounces between the original and truststore classes until the recursion limit | ||
| (~978 frames) is exceeded. | ||
|
|
||
| This patch re-binds ``botocore.httpsession.SSLContext`` and | ||
| ``urllib3.util.ssl_.SSLContext`` to the *current* ``ssl.SSLContext`` | ||
| (i.e., truststore's wrapper). truststore's own ``SSLContext.options`` setter does | ||
| not use the recursive ``super()`` pattern, so subsequent SSL context creations | ||
| succeed. | ||
|
|
||
| The patch is idempotent: a module-level guard ensures it only runs once per | ||
| process. It is a no-op when ``pip_system_certs`` is not installed or when the | ||
| references already match ``ssl.SSLContext``. ``ImportError`` is the only | ||
| expected failure (e.g., ``botocore`` or ``urllib3`` not installed in some | ||
| minimal environment) and is silently skipped per library. | ||
| """ | ||
| global _patch_applied # pylint: disable=global-statement | ||
| if _patch_applied: | ||
| return | ||
|
|
||
| # Only apply the patch when pip_system_certs is installed in user application space. | ||
| if not _is_pip_system_certs_installed(): | ||
| _patch_applied = True | ||
| return | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Defensive suggestion: verify ssl.SSLContext is actually the truststore wrapper The patch reads ssl.SSLContext here and binds it to botocore/urllib3. This relies on the assumption that pip_system_certs has already completed its injection by the time this code runs (guaranteed when called from AwsAuthSession.request()). However, if apply_pip_system_certs_compatibility_patch() is ever called from a different call site (e.g., during module import or distro initialization), ssl.SSLContext would still be the original, and this code would be a no-op (since botocore.httpsession.SSLContext is ssl.SSLContext would be True). This is safe but silently ineffective. Consider adding a debug log that verifies ssl.SSLContext is actually the truststore wrapper (e.g., by checking ssl.SSLContext.module or ssl.SSLContext.name) to aid debugging if the call site assumptions change in the future.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Consideration: coverage of other modules that capture ssl.SSLContext at import time. This patch rebinds botocore.httpsession.SSLContext and urllib3.util.ssl_.SSLContext. If other dependencies in ADOT's transitive closure also capture ssl.SSLContext at import time during sitecustomize (e.g., aiohttp, httpx), they could hit the same recursion. Not necessarily something to fix now (YAGNI), but might be worth a brief note in the docstring that the patch covers the known affected modules and may need extension if other libraries are added to the auto-instrumentation. |
||
|
|
||
| # pylint: disable=import-outside-toplevel | ||
| import ssl | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Security consideration: This rebinds Worth noting: truststore delegates certificate verification to the OS trust store rather than certifi's bundled CA certificates. This is the intended behavior of |
||
|
|
||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Nit/future-proofing: Consider whether other libraries in the dependency tree might also hold stale ssl.SSLContext references (e.g., requests.adapters in some versions). Since requests delegates to urllib3, patching urllib3.util.ssl_.SSLContext should be sufficient for new SSL context creations in the current stack. But if this recursion is reported from a different call site in the future, this module is where to add additional rebindings. |
||
| try: | ||
| # pylint: disable=import-outside-toplevel | ||
| import botocore.httpsession | ||
|
|
||
| if botocore.httpsession.SSLContext is not ssl.SSLContext: | ||
| _logger.debug( | ||
| "Rebinding botocore.httpsession.SSLContext to current ssl.SSLContext (pip_system_certs detected)." | ||
| ) | ||
| botocore.httpsession.SSLContext = ssl.SSLContext | ||
| except ImportError: | ||
| # botocore not installed; nothing to rebind on the botocore side. | ||
| pass | ||
|
|
||
| try: | ||
| # pylint: disable=import-outside-toplevel | ||
| import urllib3.util.ssl_ | ||
|
|
||
| if urllib3.util.ssl_.SSLContext is not ssl.SSLContext: | ||
| _logger.debug( | ||
| "Rebinding urllib3.util.ssl_.SSLContext to current ssl.SSLContext (pip_system_certs detected)." | ||
| ) | ||
| urllib3.util.ssl_.SSLContext = ssl.SSLContext | ||
| except ImportError: | ||
| # urllib3 not installed; nothing to rebind. | ||
| pass | ||
|
|
||
| _patch_applied = True | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -47,3 +47,130 @@ def test_aws_auth_session(self, _, __): | |
| self.assertIn(AUTHORIZATION_HEADER, actual_headers) | ||
| self.assertIn(X_AMZ_DATE_HEADER, actual_headers) | ||
| self.assertIn(X_AMZ_SECURITY_TOKEN_HEADER, actual_headers) | ||
|
|
||
| @patch("requests.Session.request", return_value=requests.Response()) | ||
| @patch("botocore.session.Session.get_credentials", return_value=mock_credentials) | ||
| def test_credentials_are_resolved_once(self, mock_get_credentials, _): | ||
| """Credentials must be resolved only once across multiple ``request()`` calls. | ||
|
|
||
| This is the hot-path mitigation for the pip_system_certs RecursionError: each | ||
| ``get_credentials()`` call walks the credential resolver chain, which constructs | ||
| a urllib3 SSL context. Caching the returned object (``RefreshableCredentials`` | ||
| rotates internally on attribute access) ensures the SSL context is created at | ||
| most once per exporter, not once per export. | ||
| """ | ||
| session = AwsAuthSession("us-east-1", "xray", get_aws_session()) | ||
|
|
||
| for _ in range(5): | ||
| session.request("POST", AWS_OTLP_TRACES_ENDPOINT, data="", headers={}) | ||
|
|
||
| self.assertEqual(mock_get_credentials.call_count, 1) | ||
|
|
||
| @patch("requests.Session.request", return_value=requests.Response()) | ||
| def test_credentials_retry_after_transient_failure(self, _): | ||
| """A transient ``get_credentials()`` failure must NOT latch the resolved | ||
| flag. The next ``request()`` call must retry resolution. This preserves | ||
| self-healing behavior on transient errors (e.g., IMDS timeouts) and matches | ||
| the pre-fix behavior on the failure path. | ||
| """ | ||
| # First call raises, subsequent calls succeed. | ||
| get_credentials_mock = patch( | ||
| "botocore.session.Session.get_credentials", | ||
| side_effect=[RuntimeError("transient"), mock_credentials, mock_credentials], | ||
| ) | ||
| with get_credentials_mock as mock_get_credentials: | ||
| session = AwsAuthSession("us-east-1", "xray", get_aws_session()) | ||
|
|
||
| # 1st request: get_credentials raises, no auth headers added. | ||
| headers_first = {} | ||
| session.request("POST", AWS_OTLP_TRACES_ENDPOINT, data="", headers=headers_first) | ||
| self.assertNotIn(AUTHORIZATION_HEADER, headers_first) | ||
|
|
||
| # 2nd request: get_credentials succeeds, auth headers must appear. | ||
| headers_second = {} | ||
| session.request("POST", AWS_OTLP_TRACES_ENDPOINT, data="", headers=headers_second) | ||
| self.assertIn(AUTHORIZATION_HEADER, headers_second) | ||
|
|
||
| # 3rd request: cached credentials reused, no further get_credentials calls. | ||
| headers_third = {} | ||
| session.request("POST", AWS_OTLP_TRACES_ENDPOINT, data="", headers=headers_third) | ||
| self.assertIn(AUTHORIZATION_HEADER, headers_third) | ||
|
|
||
| # Two resolution attempts: one failed, one succeeded; third request reuses cache. | ||
| self.assertEqual(mock_get_credentials.call_count, 2) | ||
|
|
||
| @patch("requests.Session.request", return_value=requests.Response()) | ||
| @patch("botocore.session.Session.get_credentials", return_value=mock_credentials) | ||
| @patch( | ||
| "amazon.opentelemetry.distro.exporter.otlp.aws.common.aws_auth_session" | ||
| ".apply_pip_system_certs_compatibility_patch" | ||
| ) | ||
| def test_pip_system_certs_patch_invoked_on_first_request(self, mock_apply_patch, _, __): | ||
| """The ssl.SSLContext rebind helper is invoked on the first ``request()`` call | ||
| and not re-invoked on subsequent calls. | ||
|
|
||
| The patch itself is a no-op when pip_system_certs is not installed, so this | ||
| test only asserts the call site, not the patch behavior.""" | ||
| session = AwsAuthSession("us-east-1", "xray", get_aws_session()) | ||
|
|
||
| session.request("POST", AWS_OTLP_TRACES_ENDPOINT, data="", headers={}) | ||
| session.request("POST", AWS_OTLP_TRACES_ENDPOINT, data="", headers={}) | ||
| session.request("POST", AWS_OTLP_TRACES_ENDPOINT, data="", headers={}) | ||
|
|
||
| self.assertEqual(mock_apply_patch.call_count, 1) | ||
|
|
||
| @patch("requests.Session.request", return_value=requests.Response()) | ||
| @patch( | ||
| "amazon.opentelemetry.distro.exporter.otlp.aws.common.aws_auth_session" | ||
| ".apply_pip_system_certs_compatibility_patch", | ||
| side_effect=RuntimeError("simulated patch failure"), | ||
| ) | ||
| @patch("botocore.session.Session.get_credentials", return_value=mock_credentials) | ||
| def test_patch_failure_does_not_break_request(self, _, __, ___): | ||
| """If the SSL-context-rebind helper itself raises, the failure is logged | ||
| but ``request()`` still proceeds and signs successfully. The patch is | ||
| defensive infrastructure, not a hard precondition.""" | ||
| session = AwsAuthSession("us-east-1", "xray", get_aws_session()) | ||
| actual_headers: dict = {} | ||
|
|
||
| session.request("POST", AWS_OTLP_TRACES_ENDPOINT, data="", headers=actual_headers) | ||
|
|
||
| self.assertIn(AUTHORIZATION_HEADER, actual_headers) | ||
|
|
||
| @patch("requests.Session.request", return_value=requests.Response()) | ||
| @patch("botocore.session.Session.get_credentials", return_value=mock_credentials) | ||
| def test_signing_failure_does_not_break_request(self, _, __): | ||
| """If SigV4 signing itself raises, ``request()`` still issues the | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Code quality: Looking at the implementation, Consider either:
|
||
| unauthenticated request rather than crashing the caller.""" | ||
| session = AwsAuthSession("us-east-1", "xray", get_aws_session()) | ||
|
|
||
| with patch("amazon.opentelemetry.distro.exporter.otlp.aws.common.aws_auth_session.SigV4Auth") as mock_sigv4: | ||
| mock_sigv4.return_value.add_auth.side_effect = RuntimeError("signing boom") | ||
| actual_headers: dict = {} | ||
| # Should not raise | ||
| session.request("POST", AWS_OTLP_TRACES_ENDPOINT, data="", headers=actual_headers) | ||
|
|
||
| # No auth header because signing raised before headers could be merged. | ||
| self.assertNotIn(AUTHORIZATION_HEADER, actual_headers) | ||
|
|
||
| @patch("requests.Session.request", return_value=requests.Response()) | ||
| @patch("botocore.session.Session.get_credentials", return_value=mock_credentials) | ||
| def test_concurrent_requests_resolve_credentials_once(self, mock_get_credentials, _): | ||
| """Two threads racing on the first request must both observe a single | ||
| credential resolution. The double-checked locking in ``_ensure_initialized`` | ||
| is what provides this guarantee.""" | ||
| # pylint: disable=import-outside-toplevel | ||
| from threading import Thread | ||
|
|
||
| session = AwsAuthSession("us-east-1", "xray", get_aws_session()) | ||
|
|
||
| def call(): | ||
| session.request("POST", AWS_OTLP_TRACES_ENDPOINT, data="", headers={}) | ||
|
|
||
| threads = [Thread(target=call) for _ in range(8)] | ||
| for t in threads: | ||
| t.start() | ||
| for t in threads: | ||
| t.join() | ||
|
|
||
| self.assertEqual(mock_get_credentials.call_count, 1) | ||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit (future-proofing): double-checked locking and free-threaded Python.
This is a textbook double-checked locking pattern and works correctly in CPython due to the GIL. However, Python 3.13+ has experimental free-threaded builds (PEP 703) where reading _credentials_resolved outside the lock is a data race. Not blocking for now, but worth noting if this project plans to support --disable-gil in the future.