add support for callgraph profiling #686
Merged
11 commits:
- 3fe4632 add support for callgraph profiling (seemk)
- 7f0ce87 use set discard and fix the check for parent context (seemk)
- 1770046 remove active_traces from CallgraphsSpanProcessor (seemk)
- 2106073 use constant for splunk.trace.snapshot.volume (seemk)
- d382cb1 rename start_callgraphs_if_enabled to _configure_callgraphs_if_enabled (seemk)
- 5b6d18d Merge branch 'main' into callgraphs (seemk)
- 67ebe23 add a lock to CallgraphsSpanProcessor (seemk)
- a8e5014 fix the profiler sleep logic (seemk)
- c38b767 fix a race condition when waking up the profiler thread (seemk)
- d1613c7 shut down profiler on span processor shutdown (seemk)
- b3e7e0b pin virtualenv in ci (pmcollins)
The first added file (17 lines) wires up the configuration entry point:

```python
from opentelemetry import trace
from opentelemetry.sdk.environment_variables import OTEL_SERVICE_NAME

from splunk_otel.callgraphs.span_processor import CallgraphsSpanProcessor
from splunk_otel.env import (
    Env,
    SPLUNK_SNAPSHOT_PROFILER_ENABLED,
    SPLUNK_SNAPSHOT_SAMPLING_INTERVAL,
)


def _configure_callgraphs_if_enabled(env=None):
    env = env or Env()
    if env.is_true(SPLUNK_SNAPSHOT_PROFILER_ENABLED):
        trace.get_tracer_provider().add_span_processor(
            CallgraphsSpanProcessor(env.getval(OTEL_SERVICE_NAME), env.getint(SPLUNK_SNAPSHOT_SAMPLING_INTERVAL, 10))
        )
```
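`Env.is_true` and `Env.getint` are the package's own helpers, so their exact parsing rules are not visible in this diff. A minimal stdlib-only sketch of the same env-gated registration pattern follows; the helper semantics and the `registry` stand-in are assumptions for illustration, not splunk_otel's actual API.

```python
import os

# Env var names mirror the constants imported in the diff.
SPLUNK_SNAPSHOT_PROFILER_ENABLED = "SPLUNK_SNAPSHOT_PROFILER_ENABLED"
SPLUNK_SNAPSHOT_SAMPLING_INTERVAL = "SPLUNK_SNAPSHOT_SAMPLING_INTERVAL"


def is_true(name: str) -> bool:
    # Assumed truthiness rule; splunk_otel's real Env.is_true may differ.
    return os.environ.get(name, "").strip().lower() in ("1", "true")


def getint(name: str, default: int) -> int:
    # Fall back to the default when unset or unparseable.
    try:
        return int(os.environ[name])
    except (KeyError, ValueError):
        return default


def configure_if_enabled(registry: list) -> None:
    # Register the processor description only when the flag is set,
    # using the same default sampling interval of 10 as the diff.
    if is_true(SPLUNK_SNAPSHOT_PROFILER_ENABLED):
        registry.append(("CallgraphsSpanProcessor", getint(SPLUNK_SNAPSHOT_SAMPLING_INTERVAL, 10)))
```

The real code registers the processor on the global tracer provider instead of a list; the gating logic is otherwise the same shape.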
The second added file (98 lines) implements the span processor itself. It tracks the trace ids of local root spans (and spans continuing a remote trace), starts the profiler while any such span is live, and pauses it 60 seconds after the last one ends:

```python
# Copyright Splunk Inc.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import threading
from typing import Optional

from opentelemetry import baggage, trace
from opentelemetry.context import Context
from opentelemetry.sdk.trace import ReadableSpan, Span, SpanProcessor

from splunk_otel.profile import ProfilingContext
from splunk_otel.propagator import _SPLUNK_TRACE_SNAPSHOT_VOLUME


def _should_process_context(context: Optional[Context]) -> bool:
    parent_span = trace.get_current_span(context).get_span_context()
    is_root_span = not parent_span.is_valid
    return is_root_span or parent_span.is_remote


class CallgraphsSpanProcessor(SpanProcessor):
    def __init__(self, service_name: str, sampling_interval: Optional[int] = 10):
        self._span_id_to_trace_id: dict[int, int] = {}
        self._lock = threading.Lock()
        self._profiler = ProfilingContext(
            service_name, sampling_interval, self._filter_stacktraces, instrumentation_source="snapshot"
        )

    def on_start(self, span: Span, parent_context: Optional[Context] = None) -> None:
        if not _should_process_context(parent_context):
            return

        ctx_baggage = baggage.get_baggage(_SPLUNK_TRACE_SNAPSHOT_VOLUME, parent_context)

        if ctx_baggage is None:
            return

        if ctx_baggage == "highest":
            span.set_attribute("splunk.snapshot.profiling", True)

        span_ctx = span.get_span_context()

        if span_ctx is None:
            return

        with self._lock:
            self._span_id_to_trace_id[span_ctx.span_id] = span_ctx.trace_id
            self._profiler.start()

    def on_end(self, span: ReadableSpan) -> None:
        span_id = span.get_span_context().span_id
        trace_id = self._span_id_to_trace_id.get(span_id)

        if trace_id is None:
            return

        with self._lock:
            self._span_id_to_trace_id.pop(span_id, None)

            if len(self._span_id_to_trace_id) == 0:
                self._profiler.pause_after(60.0)

    def shutdown(self) -> None:
        self._profiler.stop()

    def force_flush(self, timeout_millis: int = 30000) -> bool:
        return True

    def _filter_stacktraces(self, stacktraces, active_trace_contexts):
        filtered = []
        with self._lock:
            trace_ids = set(self._span_id_to_trace_id.values())

        for stacktrace in stacktraces:
            thread_id = stacktrace["tid"]
            maybe_context = active_trace_contexts.get(thread_id)

            if maybe_context is not None:
                (trace_id, _span_id) = maybe_context
                if trace_id in trace_ids:
                    filtered.append(stacktrace)

        return filtered
```
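The key step in `_filter_stacktraces` is that the profiler hands back raw stacktraces tagged with OS thread ids, and the processor keeps only those whose thread is currently executing a tracked trace. The standalone sketch below reproduces just that filtering step; the data shapes are inferred from the diff (each stacktrace is a dict with a `"tid"` key, and `active_trace_contexts` maps a thread id to a `(trace_id, span_id)` pair), and the snapshot of tracked trace ids is passed in directly rather than read under a lock.

```python
def filter_stacktraces(stacktraces, active_trace_contexts, tracked_trace_ids):
    """Keep only stacktraces from threads executing a tracked trace.

    Data shapes are assumptions based on the diff, not a documented API.
    """
    filtered = []
    for stacktrace in stacktraces:
        # Look up the trace context active on the sampled thread, if any.
        context = active_trace_contexts.get(stacktrace["tid"])
        if context is not None:
            trace_id, _span_id = context
            if trace_id in tracked_trace_ids:
                filtered.append(stacktrace)
    return filtered
```

A thread with no active trace context, or one whose trace is not being snapshot-profiled, contributes nothing, which keeps the exported profile limited to traces that actually requested a snapshot.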