This repository was archived by the owner on Oct 28, 2025. It is now read-only.
-
Notifications
You must be signed in to change notification settings - Fork 56
feat: add basic tracing to mcp #93
Merged
+106
−11
Merged
Changes from all commits
Commits
Show all changes
7 commits
Select commit
Hold shift + click to select a range
79baed3
tracing.py
liukatkat b492ad6
clean up
liukatkat 9627b08
trace id
liukatkat e487606
link parent span
liukatkat 6bea137
rebase
liukatkat 01076ea
wait for process to finish
liukatkat de12f0e
fix top-level span not spanning the whole time
liukatkat File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,80 @@ | ||
| #!/usr/bin/env python3 | ||
|
|
||
| import logging | ||
| import os | ||
| from collections.abc import Generator | ||
| from contextlib import contextmanager | ||
|
|
||
| from opentelemetry import trace | ||
| from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter | ||
| from opentelemetry.sdk.resources import DEPLOYMENT_ENVIRONMENT, SERVICE_NAME, Resource | ||
| from opentelemetry.sdk.trace import TracerProvider | ||
| from opentelemetry.sdk.trace.export import BatchSpanProcessor | ||
|
|
||
| # coupling: these need to be kept in sync with semgrep-proprietary/tracing.py | ||
| DEFAULT_TRACE_ENDPOINT = "https://telemetry.semgrep.dev/v1/traces" | ||
| DEFAULT_DEV_ENDPOINT = "https://telemetry.dev2.semgrep.dev/v1/traces" | ||
| DEFAULT_LOCAL_ENDPOINT = "http://localhost:4318/v1/traces" | ||
|
|
||
| MCP_SERVICE_NAME = "mcp" | ||
|
|
||
|
|
||
| def get_trace_endpoint() -> tuple[str, str]: | ||
| """Get the appropriate trace endpoint based on environment.""" | ||
| env = os.environ.get("ENVIRONMENT", "dev").lower() | ||
|
|
||
| if env == "prod": | ||
| return (DEFAULT_TRACE_ENDPOINT, "prod") | ||
| elif env == "local": | ||
| return (DEFAULT_LOCAL_ENDPOINT, "local") | ||
| else: | ||
| return (DEFAULT_DEV_ENDPOINT, "dev") | ||
|
liukatkat marked this conversation as resolved.
|
||
|
|
||
|
|
||
| @contextmanager | ||
| def start_tracing(name: str) -> Generator[trace.Span, None, None]: | ||
| """Initialize OpenTelemetry tracing.""" | ||
| (endpoint, env) = get_trace_endpoint() | ||
|
|
||
| # Create resource with basic attributes | ||
| resource = Resource.create( | ||
| { | ||
| SERVICE_NAME: MCP_SERVICE_NAME, | ||
| DEPLOYMENT_ENVIRONMENT: env, | ||
| } | ||
| ) | ||
| # Create tracer provider | ||
| provider = TracerProvider(resource=resource) | ||
|
|
||
| # Create OTLP exporter | ||
| exporter = OTLPSpanExporter(endpoint=endpoint) | ||
|
|
||
| # Create span processor | ||
| processor = BatchSpanProcessor(exporter) | ||
| provider.add_span_processor(processor) | ||
|
|
||
| # Set the global tracer provider | ||
| trace.set_tracer_provider(provider) | ||
|
|
||
| # Get tracer instance | ||
| tracer = trace.get_tracer(MCP_SERVICE_NAME) | ||
|
|
||
| with tracer.start_as_current_span(name) as span: | ||
| trace_id = trace.format_trace_id(span.get_span_context().trace_id) | ||
|
|
||
| logging.info("Tracing initialized") | ||
| logging.info(f"Tracing initialized with trace ID: {trace_id}") | ||
|
|
||
| yield span | ||
|
|
||
|
|
||
| @contextmanager | ||
| def with_span( | ||
| parent_span: trace.Span, | ||
| name: str, | ||
| ) -> Generator[trace.Span, None, None]: | ||
| tracer = trace.get_tracer(MCP_SERVICE_NAME) | ||
|
|
||
| context = trace.set_span_in_context(parent_span) | ||
| with tracer.start_span(name, context=context) as span: | ||
| yield span | ||
Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.
Oops, something went wrong.
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure we want to call this here. Similarly to the other PR, if this start_tracing is called here, we will have a single span used for every request sent to the server. The server_lifespan is called only once, and we really want to call
start_as_current_spanwhen a new request is received.That being said, since FastMCP is based on FastAPI, I wonder if it'd be enough to follow these docs 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi Flavio, could you confirm your concern with this behavior? My impression (which I have confirmed via local testing) is that the
server_lifespanmanager is invoked once per session registered to the MCP. Meaning, if we run one server, we could service several clients who want to use the MCP, for instance if we had numerous editors open, or if we were inmcp.semgrep.aiand many users wanted to use the MCP.In which case, my understanding is this code will cause one top-level span to be registered for each user's session, and each request of
semgrep_scan_rpctherein will also produce a span.I personally believe this to be the desirable behavior--do you disagree?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mmh, that may not be what we want. Thikning outloud here.
The way tracing of HTTP requests is normally done is that we would have a trace per request and multiple spans as part of that trace. That way, it is possible to analyze traces independently.
If we create a span per user session, that means all subsequent calls (usages of MCP/tools/etc) for that session will be under the same trace. This may make things like comparing multiple calls to
security_check, as they will all belong to the same trace.By creating a trace per request (tool call, etc), we can forward that trace.id to semgrep CLI and have semgrep's create spans under the same trace.
TL;DR: What I'm proposing is that we should not have a root span per user session, but rather per request.
Does it make sense? Happy to discuss this further
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that makes sense. but i think that having a top-level span gives us more context when analyzing the calls. with the newly added
semgrep mcpcommand, we pull and cache rules per each session. so all the calls in the same session will be using the same set of cached rules. we might also add more features in the future where the context of the session will matter for each call. so, this case might be slightly different from HTTP requests in the sense that each request is not necessarily independent.i also think that we won't necessarily lose anything (in fact, i think we gain info) by nesting the calls under a parent span. but happy to know what you think about this!