[OTEP] Process Context: Sharing Registered Instrumentations with External Readers#5116
Conversation
846bff1 to
3ac9e72
Compare
3ac9e72 to
f85eef0
Compare
|
|
09c33bc to
f85eef0
Compare
Co-authored-by: Robert Pająk <pellared@hotmail.com>
There was a problem hiding this comment.
Pull request overview
This PR introduces a new OTEP extending the existing Process Context sharing mechanism (OTEP 4719) to additionally publish the set of registered instrumentation libraries to out-of-process readers (e.g., OBI), enabling more precise de-duplication/coordination.
Changes:
- Add OTEP 5116 specifying a registration API and a ProcessContext payload extension to publish registered instrumentation scopes.
- Document proto-level changes (new
instrumentationsfield) and expected SDK publication/registry behavior. - Update
CHANGELOG.mdto include the new OTEP under Unreleased.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
oteps/5116-process-ctx-instrumentations.md |
New OTEP defining how SDKs publish registered instrumentations via Process Context. |
CHANGELOG.md |
Adds an Unreleased entry referencing the new OTEP PR. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Could we discuss the option to consider having a dedicated OTel SemConv namespace (e.g. instrumentation), that defines instrumentation records?
These OTel SemConv attributes then could be used in ProcessContext.attributes and don't require changes to the protocol. Existing tooling/SDKs already can handle OTel SemConv quite well and so could benefit from processing these instead of extending the protocol, that needs to be implemented first.
There was a problem hiding this comment.
Great point, Yes thats definitely a possibility
The main reasons i opted for a separate field are:
- the
InstrumentaionScopeis an already defined struct that otel sdks should know how to handle - The protocol is still not fully implemented in many sdks so i guess adding it at the start instead of changing existing implementations
- need to decide on a format to represent a list of instrumentations, and need a convention not only for the key but for the value we send it. since this is an basically an array of structs we can represent it in many different ways (multiple indexed attributes, 1 attribute that contains an array of KeyValue paris, each key needs to have consistent names, even though InstrumentationScope also have attributes so its really nested)
both could work, if there is an a strong opinion to change it i wouldn't mind changing the proposal
There was a problem hiding this comment.
As InstrumentationScope is just a field in the new introduced InstrumentationRecord, I'm still wondering if using SemConv defined attributes ProcessContext.attributes, could be simpler.
Does the producer of the telemetry matter? InstrumentationScope only reveils the name and version of the telemetry producer but does not mention if it produces, logs, metrics, traces or profiles.
How are readers of the InstrumentationScope expected to know which name/version combination produces which telemetry signal?
The protocol is still not fully implemented in many sdks so i guess adding it at the start instead of changing existing implementations
While the specification got merged with #4719, open-telemetry/opentelemetry-proto#783 is still pending. So I think it is fine, if SDKs wait for an official release of the proto first, before they start implementing it.
There was a problem hiding this comment.
As InstrumentationScope is just a field in the new introduced InstrumentationRecord, I'm still wondering if using SemConv defined attributes ProcessContext.attributes, could be simpler.
Im open to discussion on this, i could suggest some naming and structure to this, but i don't know if this warrants a new pr to the semantic conventions repository.
Im wondering if this is a similar usecase to here, that since the communication is between to otel components this wont require a semconv entry.
Does the producer of the telemetry matter? InstrumentationScope only reveils the name and version of the telemetry producer but does not mention if it produces, logs, metrics, traces or profiles.
How are readers of the InstrumentationScope expected to know which name/version combination produces which telemetry signal?
Under Future possibilites iv'e suggested this as a follow up
the issue is that an instrumentation does not self document which telemetry type it exports, so this would need some manual changes from library authors
There was a problem hiding this comment.
Indeed the intention with the attributes is to enable extension backed by the usual semantic conventions mechanisms that OTel uses so +1 Florian's comment that it seems like a nice one to look into?
Having said that, indeed part of decision to go with protobuf was to keep the door open to extension via "adding more stuff on the protobuf" as done here, so if the attributes path for some reason doesn't seem to work very well for this use-case, I think it does make sense to keep "extending the protobuf" on the table for discussion.
There was a problem hiding this comment.
@florianl @ivoanjo I tried seeing what are the viable options to include this data in the attributes since its not trivial to convert this struct into flat attributes:
A. indexed attributes
process.instrumentation.0.name = "otel-instr-redis"
process.instrumentation.0.version = "1.2.0"
process.instrumentation.1.name = "otel-instr-jdbc"
process.instrumentation.1.version = "2.5.0"
process.instrumentation.count = 2
B. Separate arrays:
process.instrumentation.names = ["otel-instr-redis", "otel-instr-jdbc"]
process.instrumentation.versions = ["1.2.0", "2.5.0"]
C. single array with stringified representation:
process.instrumentations = ["otel-instr-redis:1.2.0", "otel-instr-jdbc:2.5.0"]
D. single array with struct (KeyValueList) data representation:
process.instrumentations = [
{ name: "otel-instr-redis", version: "1.2.0" },
{ name: "otel-instr-jdbc", version: "2.5.0" },
]
Option A and B are well typed but attributes are dependent on each other and can drift
Option C and D are simple with a single attribute but both of their schema is not forced, their type is either a "string array" or "KV array" and we need to rely on implicit schema in oder to encode / decode it
Option D looks the best to me out of any of those if any, but i dont think OTEL has any "array of kvlists" attribute in its semantic convention, i would say because of the worry of drift between array items and loose typing
since this is relatively simple type (now) this still might work but adding new fields to each instrumentation can cause complexity
would love to hear your thoughts about this
There was a problem hiding this comment.
Good points!
To me the biggest downsides of A/B/C beyond the downsides you identified, is that they seem very awkward to extend, if in the future we want to provide more information about an instrumentation.
Option D is the closest to the InstrumentationRecord originally proposed, and it seems to me the main choice is indeed those two.
I don't have a lot of experience with defining/validating the semantic conventions, so maybe @florianl can chime in on that part?
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
| message InstrumentationRecord { | ||
| // The instrumentation scope identifying this instrumentation library. | ||
| // Uniqueness is determined by the tuple (name, version). | ||
| opentelemetry.proto.common.v1.InstrumentationScope scope = 1; | ||
|
|
||
| // Future fields may be added for additive evolution | ||
| // (see "Future possibilities"). | ||
| } |
There was a problem hiding this comment.
Would it make sense to define here the language to help uniquely identify the instrumentation? In declarative config we use language + instrumentation name.
There was a problem hiding this comment.
need to think if its better to add this on a per instrumentation record basis, or just add it to the resource object of the process-ctx since it should be shared across all instrumentations of a process
There was a problem hiding this comment.
The language itself is already one of the things we expect would go into the resource as telemetry.sdk.language.
There was a bit of a back-and-forth on the original #4719 PR if we should have a list of things that could be guaranteed to be there or not, and we ended up not having it, yet I don't think that precludes us from making sure every otel sdk includes this info, especially when they implement register_instrumentation.
There was a problem hiding this comment.
I too was thinking about resource but a use case would be what happens if the sdk is one language ie python which can use c++ libraries ie open-telemetry/community#3483
There was a problem hiding this comment.
Ah, I had not considered that, makes sense!
| message ProcessContext { | ||
| // Existing fields from OTEP 4719 (unchanged): | ||
| opentelemetry.proto.resource.v1.Resource resource = 1; | ||
| repeated opentelemetry.proto.common.v1.KeyValue attributes = 2; | ||
|
|
||
| // NEW: List of instrumentation libraries registered in this process. | ||
| // [Optional] Empty list (or absent field) means no instrumentations have | ||
| // been registered, or the SDK does not implement this mechanism. | ||
| repeated InstrumentationRecord instrumentations = 3; | ||
| } |
There was a problem hiding this comment.
Would it make sense to capture the distro used ie Auto-instrumentation agent name? That way you can locate the config.
There was a problem hiding this comment.
what do you mean by the agent? is it the instrumenting process?
There was a problem hiding this comment.
Distro/Agent is the software generating the telemetry. In hindsight it could come from the resource. So dont worry about it.
| message ProcessContext { | ||
| // Existing fields from OTEP 4719 (unchanged): | ||
| opentelemetry.proto.resource.v1.Resource resource = 1; | ||
| repeated opentelemetry.proto.common.v1.KeyValue attributes = 2; | ||
|
|
||
| // NEW: List of instrumentation libraries registered in this process. | ||
| // [Optional] Empty list (or absent field) means no instrumentations have | ||
| // been registered, or the SDK does not implement this mechanism. | ||
| repeated InstrumentationRecord instrumentations = 3; | ||
| } |
There was a problem hiding this comment.
Would it make sense to also share the config file path? that way you can also look up the instrumentation/distro config to know what is enabled.
There was a problem hiding this comment.
config on a per instrumentation level or for the whole process?
this is assuming the sdk is created via config and not programatically
There was a problem hiding this comment.
is this struct defined somewhere in the proto? we can add it in addition
There was a problem hiding this comment.
Yes a field config_file is added to the processcontext so the external reader knows what config file is being used.
| - **Sampler / sampling configuration.** Process-level sampler description and (where applicable) the W3C | ||
| consistent-sampling threshold, enabling external readers to make sampling decisions consistent with the SDK. Deferred | ||
| for simplicity. |
ivoanjo
left a comment
There was a problem hiding this comment.
This looks great!
Other than the ongoing discussion of using attributes vs extending the protobuf that needs to be concluded, I think this is in great shape + makes sense, so I'm happy to pitch in my approval after that one gets resolved :)
| message InstrumentationRecord { | ||
| // The instrumentation scope identifying this instrumentation library. | ||
| // Uniqueness is determined by the tuple (name, version). | ||
| opentelemetry.proto.common.v1.InstrumentationScope scope = 1; | ||
|
|
||
| // Future fields may be added for additive evolution | ||
| // (see "Future possibilities"). | ||
| } |
There was a problem hiding this comment.
The language itself is already one of the things we expect would go into the resource as telemetry.sdk.language.
There was a bit of a back-and-forth on the original #4719 PR if we should have a list of things that could be guaranteed to be there or not, and we ended up not having it, yet I don't think that precludes us from making sure every otel sdk includes this info, especially when they implement register_instrumentation.
There was a problem hiding this comment.
Indeed the intention with the attributes is to enable extension backed by the usual semantic conventions mechanisms that OTel uses so +1 Florian's comment that it seems like a nice one to look into?
Having said that, indeed part of decision to go with protobuf was to keep the door open to extension via "adding more stuff on the protobuf" as done here, so if the attributes path for some reason doesn't seem to work very well for this use-case, I think it does make sense to keep "extending the protobuf" on the table for discussion.
Fixes #5064
Changes
Extends The Process context sharing protocol to allow SDKs to register instrumentations for external readers to consume
CHANGELOG.mdfile updated for non-trivial changes[chore]in the PR title to skip the changelog checkNext steps
ProcessContextmessage from OTEP 4719 opentelemetry-proto#783 to be merged to include the new process context protoinstrumentationsfield