Skip to content

feat: add support for langchain embedding#157

Open
minimAluminiumalism wants to merge 7 commits intoalibaba:mainfrom
minimAluminiumalism:main
Open

feat: add support for langchain embedding#157
minimAluminiumalism wants to merge 7 commits intoalibaba:mainfrom
minimAluminiumalism:main

Conversation

@minimAluminiumalism
Copy link
Copy Markdown
Contributor

@minimAluminiumalism minimAluminiumalism commented Mar 26, 2026

Description

Implement #142

Semconv ref https://opentelemetry.io/docs/specs/semconv/gen-ai/gen-ai-spans/#embeddings

Clipboard_Screenshot_1774509500

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

instrumentation-loongsuite/loongsuite-instrumentation-langchain/tests/test_embedding_spans.py

Does This PR Require a Core Repo Change?

  • Yes. - Link to PR:
  • No.

Checklist:

See contributing.md for styleguide, changelog guidelines, and more.

  • Followed the style guidelines of this project
  • Changelogs have been updated
  • Unit tests have been added
  • Documentation has been updated

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds LangChain Embeddings instrumentation to loongsuite-instrumentation-langchain, emitting GenAI embedding spans via opentelemetry-util-genai per the GenAI semantic conventions.

Changes:

  • Introduces an Embeddings patch that wraps embed_* / aembed_* on all current and future langchain_core.embeddings.Embeddings subclasses.
  • Adds a comprehensive embedding span test suite (sync/async, error cases, dedup/proxy behavior, late subclassing, uninstrumentation).
  • Documents embedding spans in the README and records the feature in the changelog.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
instrumentation-loongsuite/loongsuite-instrumentation-langchain/src/opentelemetry/instrumentation/langchain/internal/patch_embedding.py New patch module that wraps Embeddings methods and emits embedding spans through ExtendedTelemetryHandler.
instrumentation-loongsuite/loongsuite-instrumentation-langchain/src/opentelemetry/instrumentation/langchain/init.py Wires embedding instrumentation into the instrumentor’s instrument/uninstrument lifecycle.
instrumentation-loongsuite/loongsuite-instrumentation-langchain/tests/test_embedding_spans.py New tests validating span creation, attributes, errors, deduplication, and uninstrument behavior for embeddings.
instrumentation-loongsuite/loongsuite-instrumentation-langchain/README.md Adds “Embedding” span kind and its key attributes to the supported spans table.
instrumentation-loongsuite/loongsuite-instrumentation-langchain/CHANGELOG.md Notes embedding span support in the Unreleased “Added” section.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +434 to +438
# 2. Install an __init_subclass__ hook so future subclasses are
# patched automatically.
_original_init_subclass = Embeddings.__dict__.get("__init_subclass__")

@classmethod # type: ignore[misc]
Copy link

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instrument_embeddings() is not idempotent: if the instrumentor’s instrument() is called multiple times (there is an existing test that does this), _original_init_subclass gets overwritten with the already-patched hook. A subsequent uninstrument_embeddings() will then restore Embeddings.__init_subclass__ to the patched hook instead of the true original, meaning future subclasses can still be auto-patched after uninstrumentation. Consider guarding against re-instrumentation (e.g., a module-level “already_instrumented” flag / storing the original only once) and skipping re-installing the hook when it’s already patched by this module.

Copilot uses AI. Check for mistakes.
Comment on lines +99 to +106
"""Extract a model name from an Embeddings instance (if available)."""
for attr in ("model", "model_name", "model_id", "deployment_name"):
val = getattr(instance, attr, None)
if val and isinstance(val, str):
return val
return ""


Copy link

Copilot AI Mar 31, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_extract_embedding_model() returns an empty string when no model attribute is found, but EmbeddingInvocation.request_model is used for span naming and (when non-empty) the required gen_ai.request.model attribute. An empty string yields a span name like "embeddings " and omits the model attribute entirely. Consider falling back to a more informative value (e.g., the embeddings class name) to avoid producing an effectively unnamed model in spans.

Suggested change
"""Extract a model name from an Embeddings instance (if available)."""
for attr in ("model", "model_name", "model_id", "deployment_name"):
val = getattr(instance, attr, None)
if val and isinstance(val, str):
return val
return ""
"""Extract a model name from an Embeddings instance (if available).
If no explicit model attribute is found, fall back to the class name
so that spans and attributes never use an empty model identifier.
"""
for attr in ("model", "model_name", "model_id", "deployment_name"):
val = getattr(instance, attr, None)
if val and isinstance(val, str):
return val
# Fallback: use the embeddings class name as a best-effort model identifier
cls = type(instance)
return cls.__name__ or repr(cls)

Copilot uses AI. Check for mistakes.
@alibaba alibaba deleted a comment from Copilot AI Apr 1, 2026
Copy link
Copy Markdown
Collaborator

@Cirilla-zmh Cirilla-zmh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this great PR! Some comments should be resolved before getting merged.


### Added

- Langchain embedding span support([#157](https://github.com/alibaba/loongsuite-python-agent/pull/157))
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- Langchain embedding span support([#157](https://github.com/alibaba/loongsuite-python-agent/pull/157))
- LangChain embedding span support
([#157](https://github.com/alibaba/loongsuite-python-agent/pull/157))

return wrapper


def _make_aembed_query_wrapper(
Copy link
Copy Markdown
Collaborator

@Cirilla-zmh Cirilla-zmh Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not reuse _make_aembed_documents_wrapper?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll check these comments tomorrow :)

@Cirilla-zmh
Copy link
Copy Markdown
Collaborator

@minimAluminiumalism Hey! Thank you for your outstanding contributions over time. We are very pleased to invite you to become an approver for loongsuite-python-agent. If you are willing, you can leave an 👍 under this comment, and I’ll be happy to submit a PR for this.

If it’s convenient for you, you’re also welcome to join our DingTalk group and contact “希铭” in the group. We look forward to connecting with you more closely.
image

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants