Skip to content

Comments

GenAI Utils | Embedding Type and Span Creation#4219

Open
shuningc wants to merge 4 commits intoopen-telemetry:mainfrom
shuningc:adding-embedding-type
Open

GenAI Utils | Embedding Type and Span Creation#4219
shuningc wants to merge 4 commits intoopen-telemetry:mainfrom
shuningc:adding-embedding-type

Conversation

@shuningc
Copy link

@shuningc shuningc commented Feb 19, 2026

Description

This PR adds initial embedding lifecycle coverage, to add metrics in the next PR.

Type of change

  • New feature (non-breaking change which adds functionality)
  • This change requires a documentation update

How Has This Been Tested?

Ran from repo root with local package path:
PYTHONPATH=util/opentelemetry-util-genai/src pytest -q util/opentelemetry-util-genai/tests/test_utils.py
PYTHONPATH=util/opentelemetry-util-genai/src pytest -q util/opentelemetry-util-genai/tests/test_handler_metrics.py
Results:
[test_utils.py]: 23 passed

Does This PR Require a Core Repo Change?

  • No.

Checklist:

See contributing.md for styleguide, changelog guidelines, and more.

  • Followed the style guidelines of this project
  • Changelogs have been updated
  • Unit tests have been added
  • Documentation has been updated

@shuningc shuningc requested a review from a team as a code owner February 19, 2026 17:08
@linux-foundation-easycla
Copy link

linux-foundation-easycla bot commented Feb 19, 2026

CLA Signed

The committers listed above are authorized under a signed CLA.

@aabmass aabmass marked this pull request as draft February 19, 2026 17:19
- Add `EmbeddingInvocation` span lifecycle support (`start_embedding`,
`stop_embedding`, `fail_embedding`), embedding span tests, and defer
embedding metrics emission (current no-op) to a follow-up PR.
([#<PR_NUMBER>](https://github.com/open-telemetry/opentelemetry-python-contrib/pull/<PR_NUMBER>))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: <PR_NUMBER> placeholder is still present; can we replace this with #4219 before merge?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, updated

- `gen_ai.provider.name`: Str(openai)
- `gen_ai.operation.name`: Str(embeddings)
- `gen_ai.request.model`: Str(text-embedding-3-small)
- `gen_ai.embedding.dimension_count`: Int(1536)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: the attribute name here looks outdated; semconv key is gen_ai.embeddings.dimension.count (plural + dotted), not gen_ai.embedding.dimension_count.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated

"""Get GenAI request semantic convention attributes."""
optional_attrs = (
(GenAI.GEN_AI_REQUEST_MODEL, invocation.request_model),
(GenAI.GEN_AI_EMBEDDING_DIMENSION_COUNT, invocation.dimension_count),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Potential blocker: GenAI.GEN_AI_EMBEDDING_DIMENSION_COUNT does not exist in the current semconv module (the available constant is GEN_AI_EMBEDDINGS_DIMENSION_COUNT). This path is exercised in stop_embedding/fail_embedding, so this currently raises AttributeError before cleanup. Could we switch to the ...EMBEDDINGS... constant and align the related assertions/docs?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated


operation_name: str = field(
default=GenAI.GenAiOperationNameValues.EMBEDDINGS.value,
metadata={"semconv": GenAI.GEN_AI_OPERATION_NAME},

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove metadata={"semconv": ...} from all fields — no auto-mapping infrastructure reads this metadata.
The mapping is done explicitly in span_utils.py.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed


server_address: str | None = None
server_port: int | None = None
error_type: str | None = None

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove error_type — not a request-time field. Error type is derived from Error.type.qualname in fail_embedding.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed

EmbeddingInvocation,
Error,
LLMInvocation,
)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing embedding() context manager.

@contextmanager
    def embedding()

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added

@shuningc shuningc changed the title [WIP] GenAI Utils | Embedding Type and Span Creation GenAI Utils | Embedding Type and Span Creation Feb 23, 2026
@shuningc shuningc marked this pull request as ready for review February 23, 2026 16:04
Comment on lines 91 to +100
def _get_llm_span_name(invocation: LLMInvocation) -> str:
"""Get the span name for an LLM invocation."""
return f"{invocation.operation_name} {invocation.request_model}".strip()


def _get_embedding_span_name(invocation: EmbeddingInvocation) -> str:
"""Get the span name for an Embedding invocation."""
return f"{invocation.operation_name} {invocation.request_model}".strip()


Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Repeating code, a refactor here could happen if we move operation_name and request_model to GenAIInvocation instead of on each type.

Additional attributes to set on metrics. Must be of a low cardinality.
These attributes will not be set on spans or events.
"""
monotonic_start_s: float | None = None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some of these fields are the same as LLMInvocation fields. I think this is an ideal candidate for moving some of these fields to GenAIInvocation.

This could simplify the metrics recording once this supports Embedding Metrics. Also could simplify fetching attributes for spans

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants