
feat(common): telemetry module for optional OpenTelemetry instrumentation #437

Draft
aashvgit wants to merge 3 commits into kubeflow:main from aashvgit:feat/otel-telemetry-model



@aashvgit

What this PR does / why we need it / summary:
Adds kubeflow/common/telemetry.py — the shared foundation module for
optional OpenTelemetry instrumentation across all Kubeflow SDK clients.

This is a draft PR for the GSoC 2026 issue for project 7.

Design Decisions
Zero overhead when disabled: get_tracer() returns a _NoOpTracer
when opentelemetry-api is not installed. No import errors, no exceptions.

Opt-out via env var: KUBEFLOW_TRACING_DISABLED=1 disables all
instrumentation with no code changes required.
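As a rough illustration of the opt-out check, assuming only the exact value "1" disables tracing (the PR does not say whether other truthy values are accepted). The helper name tracing_disabled is hypothetical.

```python
import os


def tracing_disabled(environ=None):
    """Return True when KUBEFLOW_TRACING_DISABLED is set to "1".

    `environ` is injectable so the check can be tested without
    mutating the real process environment.
    """
    env = os.environ if environ is None else environ
    # Assumption: only the literal "1" counts as disabled.
    return env.get("KUBEFLOW_TRACING_DISABLED", "") == "1"
```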

Configurable exporters: Supports console, otlp, and none via
configure(exporter=...). The OTLP endpoint is configurable via the env var
KUBEFLOW_OTLP_ENDPOINT.
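The exporter selection could be resolved along these lines. This sketch covers only validation and endpoint lookup, not exporter construction. ExporterConfig, resolve_exporter, and the localhost fallback (the conventional OTLP/gRPC port) are assumptions, not names from this PR.

```python
import os
from dataclasses import dataclass
from typing import Optional

SUPPORTED_EXPORTERS = ("console", "otlp", "none")
# Assumption: fall back to the conventional local OTLP/gRPC endpoint.
DEFAULT_OTLP_ENDPOINT = "http://localhost:4317"


@dataclass(frozen=True)
class ExporterConfig:
    """Hypothetical resolved settings that configure() might act on."""

    exporter: str
    endpoint: Optional[str] = None


def resolve_exporter(exporter="none", environ=None):
    """Validate the exporter name and look up the OTLP endpoint if needed."""
    env = os.environ if environ is None else environ
    if exporter not in SUPPORTED_EXPORTERS:
        raise ValueError(
            f"unknown exporter {exporter!r}; expected one of {SUPPORTED_EXPORTERS}"
        )
    if exporter == "otlp":
        endpoint = env.get("KUBEFLOW_OTLP_ENDPOINT", DEFAULT_OTLP_ENDPOINT)
        return ExporterConfig(exporter, endpoint)
    # console and none need no endpoint.
    return ExporterConfig(exporter)
```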

Consistent naming: SpanNames and SpanAttributes constants enforce
naming across all clients. Span names follow the kubeflow.sdk.<client>.<op>
convention. Attributes follow OTel semantic conventions for GenAI workloads.
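To make the naming convention concrete, the constants could be laid out roughly as below. The class names, span names, and attribute keys come from this PR; the member names and the build() helper are hypothetical.

```python
class SpanNames:
    """Span names following kubeflow.sdk.<client>.<op>."""

    TRAINER_TRAIN = "kubeflow.sdk.trainer.train"
    TRAINER_CREATE_TRAINJOB = "kubeflow.sdk.trainer.create_trainjob"
    TRAINER_POLL_STATUS = "kubeflow.sdk.trainer.poll_status"

    @staticmethod
    def build(client, op):
        """Hypothetical helper that composes a name from client and operation."""
        return f"kubeflow.sdk.{client}.{op}"


class SpanAttributes:
    """Attribute keys shared across clients."""

    POLL_ITERATION = "poll.iteration"
    TRAINER_STATUS = "kubeflow.trainer.status"
```

Keeping the strings in one place means a renamed span or attribute changes in a single file rather than in every client.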

Sampling: Configurable via the sample_rate parameter (0.0–1.0).
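The PR does not say how sample_rate is applied. One common approach, the same idea as the OpenTelemetry SDK's TraceIdRatioBased sampler, compares the low 64 bits of the trace ID against a bound derived from the rate. The sketch below is a standalone illustration of that technique with hypothetical function names, not this module's code.

```python
TRACE_ID_LIMIT = (1 << 64) - 1  # mask for the low 64 bits of a trace ID


def validate_sample_rate(rate):
    """Reject rates outside the documented 0.0 to 1.0 range."""
    if not 0.0 <= rate <= 1.0:
        raise ValueError(f"sample_rate must be within [0.0, 1.0], got {rate!r}")
    return float(rate)


def should_sample(trace_id, sample_rate):
    """Decide deterministically, from the trace ID alone, whether to sample."""
    bound = round(validate_sample_rate(sample_rate) * (TRACE_ID_LIMIT + 1))
    return (trace_id & TRACE_ID_LIMIT) < bound
```

Because the decision depends only on the trace ID, every span in a trace gets the same verdict.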
Span Hierarchy (taking TrainerClient as an example)

kubeflow.sdk.trainer.train
  kubeflow.sdk.trainer.create_trainjob [event: trainjob_submitted]
  kubeflow.sdk.trainer.poll_status [× N, one per poll]
    attributes: poll.iteration, kubeflow.trainer.status
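The hierarchy above maps onto nested context managers. The sketch below uses a small recording tracer (RecordingTracer and RecordingSpan are hypothetical stand-ins, as are the status values) so the parent/child structure can be inspected without OpenTelemetry installed. It is not the TrainerClient implementation.

```python
from contextlib import contextmanager


class RecordingSpan:
    """Hypothetical span that stores its depth, attributes, and events."""

    def __init__(self, name, depth):
        self.name = name
        self.depth = depth
        self.attributes = {}
        self.events = []

    def set_attribute(self, key, value):
        self.attributes[key] = value

    def add_event(self, name):
        self.events.append(name)


class RecordingTracer:
    """Hypothetical tracer that records spans in start order."""

    def __init__(self):
        self.spans = []
        self._depth = 0

    @contextmanager
    def start_as_current_span(self, name):
        span = RecordingSpan(name, self._depth)
        self.spans.append(span)
        self._depth += 1
        try:
            yield span
        finally:
            self._depth -= 1


def train(tracer, statuses):
    """Emit the span tree for one train() call; `statuses` fakes poll results."""
    with tracer.start_as_current_span("kubeflow.sdk.trainer.train"):
        with tracer.start_as_current_span(
            "kubeflow.sdk.trainer.create_trainjob"
        ) as create_span:
            create_span.add_event("trainjob_submitted")
        # One child span per polling iteration.
        for iteration, status in enumerate(statuses, start=1):
            with tracer.start_as_current_span(
                "kubeflow.sdk.trainer.poll_status"
            ) as poll_span:
                poll_span.set_attribute("poll.iteration", iteration)
                poll_span.set_attribute("kubeflow.trainer.status", status)
```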

Tests
10 unit tests following the TestCase + pytest.mark.parametrize pattern. All tests pass without opentelemetry-api installed (no-op path).

Fixes #164, as this is the foundation module.
Checklist:

  • Docs included if any changes are user facing

…w#388

Signed-off-by: aashvgit <167199295+aashvgit@users.noreply.github.com>
@github-actions
Contributor

🎉 Welcome to the Kubeflow SDK! 🎉

Thanks for opening your first PR! We're happy to have you as part of our community 🚀

Here's what happens next:

  • If you haven't already, please check out our Contributing Guide for repo-specific guidelines and the Kubeflow Contributor Guide for general community standards
  • Our team will review your PR soon! cc @kubeflow/kubeflow-sdk-team

Join the community:

Feel free to ask questions in the comments if you need any help or clarification!
Thanks again for contributing to Kubeflow! 🙏

@google-oss-prow
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign andreyvelich for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment
