Skip to content

fix AttributePropagatingSpanProcessor with NonRecordingSpan as parent#744

Open
zhongkechen wants to merge 3 commits into
aws-observability:mainfrom
zhongkechen:NonRecordingSpanAsParent
Open

fix AttributePropagatingSpanProcessor with NonRecordingSpan as parent#744
zhongkechen wants to merge 3 commits into
aws-observability:mainfrom
zhongkechen:NonRecordingSpanAsParent

Conversation

@zhongkechen
Copy link
Copy Markdown

@zhongkechen zhongkechen commented May 12, 2026

Issue #, if available:

Fixes: #743

Description of changes:

elif parent_span and _is_server_kind(parent_span):
propagation_data = self._propagation_data_extractor(parent_span)

When the parent_span isn't a ReadableSpan (e.g. a NonRecordingSpan), _is_server_kind would throw an attribute exception because the other types of span may not have the kind attribute.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@zhongkechen zhongkechen requested a review from a team as a code owner May 12, 2026 23:14
@zhongkechen zhongkechen changed the title fix AttributePropagatingSpanProcessor with NonRecordingSPan as parent fix AttributePropagatingSpanProcessor with NonRecordingSpan as parent May 12, 2026
@liustve
Copy link
Copy Markdown
Contributor

liustve commented May 15, 2026

I partially am okay with these changes, though I don't agree with this span processor working with NonRecording Spans since there's no SpanKind.
https://github.com/open-telemetry/opentelemetry-python/blob/main/opentelemetry-api/src/opentelemetry/trace/span.py#L523

Copy link
Copy Markdown
Contributor

@liustve liustve left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instead of isinstance(parent_span, ReadableSpan), could we use parent_span.is_recording() instead?

The OTel spec defines that NonRecordingSpan.is_recording() must return False, so it's a cleaner check that doesn't require importing ReadableSpan and aligns directly with the spec's contract:

        elif parent_span and parent_span.is_recording() and _is_server_kind(parent_span):
            propagation_data = self._propagation_data_extractor(parent_span)
        elif parent_span and parent_span.is_recording():
            propagation_data = parent_span.attributes.get(self._propagation_data_key)

@liustve
Copy link
Copy Markdown
Contributor

liustve commented May 15, 2026

@zhongkechen Please add unit tests for this change as well and a CHANGELOG.md entry

@zhongkechen
Copy link
Copy Markdown
Author

zhongkechen commented May 15, 2026

instead of isinstance(parent_span, ReadableSpan), could we use parent_span.is_recording() instead?

The OTel spec defines that NonRecordingSpan.is_recording() must return False, so it's a cleaner check that doesn't require importing ReadableSpan and aligns directly with the spec's contract:

        elif parent_span and parent_span.is_recording() and _is_server_kind(parent_span):
            propagation_data = self._propagation_data_extractor(parent_span)
        elif parent_span and parent_span.is_recording():
            propagation_data = parent_span.attributes.get(self._propagation_data_key)

I'm using isinstance(parent_span, ReadableSpan) because a few lines above my change is using that:

https://github.com/zhongkechen/aws-otel-python-instrumentation/blob/0ac556982b8a06e456018e3dbd2948898dfe96e7/aws-opentelemetry-distro/src/amazon/opentelemetry/distro/attribute_propagating_span_processor.py#L50

    def on_start(self, span: Span, parent_context: Optional[Context] = None) -> None:
        parent_span: ReadableSpan = get_current_span(parent_context)
        if isinstance(parent_span, ReadableSpan):

I thought I should keep the check logic consistent at least in the same method

@liustve
Copy link
Copy Markdown
Contributor

liustve commented May 15, 2026

instead of isinstance(parent_span, ReadableSpan), could we use parent_span.is_recording() instead?
The OTel spec defines that NonRecordingSpan.is_recording() must return False, so it's a cleaner check that doesn't require importing ReadableSpan and aligns directly with the spec's contract:

        elif parent_span and parent_span.is_recording() and _is_server_kind(parent_span):
            propagation_data = self._propagation_data_extractor(parent_span)
        elif parent_span and parent_span.is_recording():
            propagation_data = parent_span.attributes.get(self._propagation_data_key)

I'm using isinstance(parent_span, ReadableSpan) because a few lines above my change is using that:

https://github.com/zhongkechen/aws-otel-python-instrumentation/blob/0ac556982b8a06e456018e3dbd2948898dfe96e7/aws-opentelemetry-distro/src/amazon/opentelemetry/distro/attribute_propagating_span_processor.py#L50

    def on_start(self, span: Span, parent_context: Optional[Context] = None) -> None:
        parent_span: ReadableSpan = get_current_span(parent_context)
        if isinstance(parent_span, ReadableSpan):

I thought I should keep the check logic consistent at least in the same method

I see though I'd still suggest using what is OTel spec aligned instead of using isinstance imo.

@Aneurysm9
Copy link
Copy Markdown
Member

I don't think parent_span.is_recording() actually conveys the necessary information here. That's an API method intended for allowing instrumentation to skip potentially expensive attribute calculation/creation operations if the result won't be recorded. An API span may be recording and still write-only. Also, OnStart() should only ever be called on recording spans and I'm not aware of any situation in which a recording span would have a non-recording parent span. Maybe if the parent span was ended before the child was started, but even then it may be possible to get a readable span from the parent despite it no longer being in a recording state.

Since this is an SDK plugin it should be fine for it to introspect API spans to get at the underlying SDK spans, which is where ReadableSpan would fit in. The thing I'm not sure of with the Python implementation is whether a ReadableSpan is always read-only, or if a read/write span also satisfies isinstance(ps, ReadableSpan).

@liustve
Copy link
Copy Markdown
Contributor

liustve commented May 16, 2026

I don't think parent_span.is_recording() actually conveys the necessary information here. That's an API method intended for allowing instrumentation to skip potentially expensive attribute calculation/creation operations if the result won't be recorded. An API span may be recording and still write-only. Also, OnStart() should only ever be called on recording spans and I'm not aware of any situation in which a recording span would have a non-recording parent span. Maybe if the parent span was ended before the child was started, but even then it may be possible to get a readable span from the parent despite it no longer being in a recording state.

Since this is an SDK plugin it should be fine for it to introspect API spans to get at the underlying SDK spans, which is where ReadableSpan would fit in. The thing I'm not sure of with the Python implementation is whether a ReadableSpan is always read-only, or if a read/write span also satisfies isinstance(ps, ReadableSpan).

I'm okay with the changes then, @zhongkechen please add unit tests and a changelog entry and we can help merge this

@zhongkechen zhongkechen force-pushed the NonRecordingSpanAsParent branch from 0ac5569 to c6b79ce Compare May 26, 2026 00:02
@zhongkechen
Copy link
Copy Markdown
Author

added a unit test case

from amazon.opentelemetry.distro._aws_attribute_keys import AWS_CONSUMER_PARENT_SPAN_KIND, AWS_SDK_DESCENDANT, \
AWS_TRACE_FLAG_SAMPLED
from amazon.opentelemetry.distro._aws_span_processing_util import get_ingress_operation
from amazon.opentelemetry.distro.attribute_propagating_span_processor import AttributePropagatingSpanProcessor
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: NonRecordingSpan is defined in opentelemetry.trace, not opentelemetry.sdk.trace. This import will raise an ImportError at runtime. The correct import should add NonRecordingSpan to the opentelemetry.trace import line instead (line 11 in the base file).

self.assertEqual(grand_child_span.attributes.get(SpanAttributes.CLOUD_RESOURCE_ID), cloud_resource_id)

def test_attributes_propagation_with_non_recording_span(self):
parent_span: Span = NonRecordingSpan(SpanContext(1, 2, True, TraceFlags.SAMPLED, TraceState.get_default()))
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test assertion only checks AWS_TRACE_FLAG_SAMPLED, which is unconditionally set on every span (line 87 of the source) regardless of parent type. This means the test would pass even without the bug fix. To properly validate the fix, consider asserting that the propagation-data attribute is NOT set on the child (since a NonRecordingSpan parent cannot provide it), e.g.: self.assertIsNone(child_span.attributes.get(_SPAN_NAME_KEY)). This directly tests the code path that was broken before the fix.

if not _is_server_kind(span):
propagation_data = self._propagation_data_extractor(span)
elif parent_span and _is_server_kind(parent_span):
elif parent_span and isinstance(parent_span, ReadableSpan) and _is_server_kind(parent_span):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fix is correct. Minor suggestion: line 49 assigns parent_span with the type annotation ReadableSpan, but get_current_span() can actually return any Span (including NonRecordingSpan). Consider changing the annotation to Span to make it obvious why the isinstance checks are necessary.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

AttributePropagatingSpanProcessor doesn't accept non-Readable spans

3 participants