Update resource detection to be entity-aware. #5147

Open
jsuereth wants to merge 1 commit into open-telemetry:main from jsuereth:wip-entity-detectors

jsuereth commented Jun 9, 2026

Contributor

Update resource detectors to generate entities vs. raw attributes.

  • Call out the entities that are generated by named detectors vs. namespaces.
  • Add an `env` detector, which will support the "environment variable" resource context propagation defined for entities.
  • Update naming guidance to involve entities.

Note: This should be a non-breaking change, as the OTLP produced will only contain additional information, and the Resource generated using these named detectors should be exactly the same as before.

Prototype:

Relevant Issues:

What this does NOT do:

  • Figure out how to have env detection built in. We believe we need to move from OTEL_RESOURCE_ATTRIBUTES to OTEL_ENTITIES to safely allow runtime environments to populate ENV variables in ways that will not clobber other runtimes. We want this migration to be seamless. I expect this change to declare the named resource detector, and we'll later specify default-on resource detectors. This name would allow configuration to be used to disable ENV-based propagation.
  • Add necessary SDK components as these are part of [entities] SDK startup specification #5057

jsuereth requested a review from a team as a code owner June 9, 2026 18:14

- Populates [container.*](https://github.com/open-telemetry/semantic-conventions/blob/main/docs/resource/container.md)
- attributes.
+ Populates [container](https://opentelemetry.io/docs/specs/semconv/registry/entities/container/)
+ entity.

Member

I think these need some language that explains the mechanics in terms of both the old world (i.e. only resources) and the new world (i.e. resources with entities attached). A plain read of this as is suggests that if I'm working in a language that hasn't yet implemented entities, the container resource detector does nothing.

I think it should say something like "Populates container entity, or container.* attributes if SDK does not yet support entities."

Contributor Author

Good point - I should say "all relevant attributes of the entity, if entity support is not included yet"
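
The fallback discussed in this thread can be sketched as follows. This is only an illustration under stated assumptions: the `Entity` class, `to_resource_attributes`, `detect_container`, and the hard-coded values are hypothetical and are not part of any OpenTelemetry SDK. It assumes `container.id` identifies the entity and `container.name` describes it.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """Hypothetical in-memory entity: a type plus identifying and descriptive attributes."""
    type: str
    identity: dict
    description: dict = field(default_factory=dict)


def to_resource_attributes(entity: Entity) -> dict:
    """Flatten an entity into plain resource attributes."""
    return {**entity.identity, **entity.description}


def detect_container(sdk_supports_entities: bool) -> dict:
    """Hypothetical `container` detector; the values are hard-coded for illustration."""
    entity = Entity(
        type="container",
        identity={"container.id": "abc123"},
        description={"container.name": "web"},
    )
    return {
        # The flat attributes are emitted either way, so the Resource is unchanged.
        "attributes": to_resource_attributes(entity),
        # The entity itself is only attached when the SDK understands entities.
        "entities": [entity] if sdk_supports_entities else [],
    }
```

With this shape, an SDK without entity support still gets every relevant `container.*` attribute, which is the wording proposed above.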

+ entities.
+ * `service`: Populates `service` and `service.instance` entities described
+ [here](https://opentelemetry.io/docs/specs/semconv/registry/entities/service/).
+ * `env`: Populates entities based on [Entity Propagation](../entities/entity-propagation.md).

Member

The name env here might imply that OTEL_RESOURCE_ATTRIBUTES is supported. I'm ok holding the line and leaning on users to read the description to understand that env is only for the entities world and only respects OTEL_ENTITIES.

But I just wanted to call it out in case you have any names in mind that might avoid the pitfall, e.g. `entities_env`?

Contributor Author

I did not - and had the same concern. I think we have some options in what we can do here, but I hadn't thought of anything I liked better than what's proposed.

+ entities.
+ * `service`: Populates `service` and `service.instance` entities described
+ [here](https://opentelemetry.io/docs/specs/semconv/registry/entities/service/).
+ * `env`: Populates entities based on [Entity Propagation](../entities/entity-propagation.md).

dmitryax commented Jun 10, 2026

Member

I think we should define a plan for reconciling OTEL_RESOURCE_ATTRIBUTES and OTEL_ENTITIES before adding this detector. Having separate controls for disabling them can be confusing, in my opinion. Maybe we leave this detector out for now and have OTEL_ENTITIES controlled/applied the same way as OTEL_RESOURCE_ATTRIBUTES? Then we can define a migration path to this detector.

Contributor Author

I'm concerned that if we don't give users a way to control priority now, we'll struggle to add it later.
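
To make the priority concern concrete, here is a minimal sketch of how a named `env` detector could be ordered or disabled through configuration. Everything in it is an assumption for illustration: the `REGISTRY`, the detector functions, their return values, and the "first configured detector wins per entity type" rule are hypothetical and are not what the PR specifies.

```python
from typing import Callable, Dict, List

# A detector returns {entity type: identifying attributes}.
Detector = Callable[[], Dict[str, Dict[str, str]]]


def detect_host() -> Dict[str, Dict[str, str]]:
    # Hypothetical local detection result.
    return {"host": {"host.id": "local-machine"}}


def detect_env() -> Dict[str, Dict[str, str]]:
    # Hypothetical result of reading entities propagated via the environment.
    return {
        "host": {"host.id": "from-env"},
        "service": {"service.name": "checkout"},
    }


REGISTRY: Dict[str, Detector] = {"host": detect_host, "env": detect_env}


def run_detectors(enabled: List[str]) -> Dict[str, Dict[str, str]]:
    """Run detectors in configured order; the first one to report a type wins."""
    result: Dict[str, Dict[str, str]] = {}
    for name in enabled:
        if name not in REGISTRY:
            raise ValueError(f"unknown resource detector: {name}")
        for entity_type, identity in REGISTRY[name]().items():
            # Assumed priority rule: earlier detectors are not overwritten.
            result.setdefault(entity_type, identity)
    return result
```

Under these assumptions, `run_detectors(["host", "env"])` keeps the locally detected host, `run_detectors(["env", "host"])` prefers the propagated one, and omitting `"env"` disables ENV-based propagation entirely.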

@MatthieuNoirbusson

This is a welcome change — naming detectors after the entities they populate makes the Resource side much more legible for consumers.

One consumer-side question this amplifies (happy to move it to an issue if out of scope for this PR): with entity-aware detection, the same entity can now reach a consumer through two channels — attached to a Resource on ordinary telemetry, and as entity state events (entity-events.md describes the two as complementary). For cases where both exist for the same entity — say an SDK detects host while an infra agent emits host entity state events for the same machine — is consumer-side reconciliation guidance planned? Resource-borne entities carry no lifecycle/timestamps, while events do; and the entity-events examples hint at an "authoritative source" notion (e.g. the SDK for service.instance) without generalizing it.

Our working assumption while building a consumer (a temporal entity graph — still in development, not yet field-validated) is: events are authoritative for state/lifecycle; Resource-borne entities associate telemetry to an identity and may bootstrap presence. If that matches the intent, one sentence to that effect (here or in entity-events.md) would help consumer implementers; if it doesn't, even better to learn now.

Also +1 on env being a named detector so ENV-based propagation can be disabled by configuration — from a consumer standpoint, detector provenance on entities is genuinely useful.

jsuereth commented Jun 10, 2026

Contributor Author

One consumer-side question this amplifies (happy to move it to an issue if out of scope for this PR): with entity-aware detection, the same entity can now reach a consumer through two channels — attached to a Resource on ordinary telemetry, and as entity state events (entity-events.md describes the two as complementary). For cases where both exist for the same entity — say an SDK detects host while an infra agent emits host entity state events for the same machine — is consumer-side reconciliation guidance planned? Resource-borne entities carry no lifecycle/timestamps, while events do; and the entity-events examples hint at an "authoritative source" notion (e.g. the SDK for service.instance) without generalizing it.

Yes - the entity merge algorithm is defined in the data model and should allow merging entities appropriately. Effectively we only expect the description to change, and we have a merge algorithm to help you determine what labels to use.

We may want to update this to always prefer the entity-relationship event channel for descriptive attributes.

Our working assumption while building a consumer (a temporal entity graph — still in development, not yet field-validated) is: events are authoritative for state/lifecycle; Resource-borne entities associate telemetry to an identity and may bootstrap presence. If that matches the intent, one sentence to that effect (here or in entity-events.md) would help consumer implementers; if it doesn't, even better to learn now.

This is accurate, though I'd put it more precisely: Resource-borne entities are mostly about identity, and descriptive attributes are an "opt-out" for storage optimisations in systems which do not engage with the Entity relationship signal or where joins are inefficient.

@MatthieuNoirbusson

Thanks, that settles it for us: merge by exact identity is what we implement, so the two channels converge cleanly.

From a consumer standpoint, the "always prefer the event channel for descriptive attributes" update would be welcome: events carry timestamps and lifecycle, entities carried on the Resource do not, so the precedence falls out naturally. There may be entity types where the Resource side is the authoritative source, for example service.instance coming from its own SDK, so a notion of authoritative source per entity type, which the examples in entity-events.md already hint at, might be the generalizable form.

We'll track this PR.
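
The consumer-side reconciliation discussed above (merge by exact identity, with the event channel preferred for descriptive attributes) can be sketched as below. This is not the merge algorithm from the entities data model; it is a simplified illustration, and the dictionary layout, `identity_key`, and `reconcile` are hypothetical names. The event-wins precedence reflects the update floated in this thread, not settled specification text.

```python
from typing import Dict, List, Tuple

EntityKey = Tuple[str, Tuple[Tuple[str, str], ...]]


def identity_key(entity: dict) -> EntityKey:
    """Build a hashable key from the entity type and its exact identity."""
    return (entity["type"], tuple(sorted(entity["identity"].items())))


def reconcile(resource_entities: List[dict], event_entities: List[dict]) -> Dict[EntityKey, dict]:
    """Merge both channels by exact identity; event descriptions take precedence."""
    merged: Dict[EntityKey, dict] = {}
    # Resource-borne entities establish presence and a first description.
    for entity in resource_entities:
        merged[identity_key(entity)] = dict(entity.get("description", {}))
    # Event-borne descriptive attributes overwrite resource-borne ones.
    for entity in event_entities:
        merged.setdefault(identity_key(entity), {}).update(entity.get("description", {}))
    return merged
```

A per-entity-type notion of authoritative source, as suggested for `service.instance`, would replace the fixed event-wins rule with a lookup keyed on `entity["type"]`.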

Comment on lines 170 to +181
  * `container`:
- Populates [container.*](https://github.com/open-telemetry/semantic-conventions/blob/main/docs/resource/container.md)
- attributes.
+ Populates [container](https://opentelemetry.io/docs/specs/semconv/registry/entities/container/)
+ entity.
  * `host`:
- Populates [host.*](https://github.com/open-telemetry/semantic-conventions/blob/main/docs/resource/host.md) and [os.*](https://github.com/open-telemetry/semantic-conventions/blob/main/docs/resource/os.md)
- attributes.
+ Populates [host](https://opentelemetry.io/docs/specs/semconv/registry/entities/host/) and [os](https://opentelemetry.io/docs/specs/semconv/registry/entities/os/)
+ entities.
  * `process`:
- Populates [process.*](https://github.com/open-telemetry/semantic-conventions/blob/main/docs/resource/process.md)
- attributes.
- * `service`: Populates `service.name` based
- on [OTEL_SERVICE_NAME](../configuration/sdk-environment-variables.md#general-sdk-configuration)
- environment variable; populates `service.instance.id`
- as [defined here](https://github.com/open-telemetry/semantic-conventions/blob/main/docs/registry/attributes/service.md#service-attributes).
+ Populates [process](https://opentelemetry.io/docs/specs/semconv/registry/entities/process/)
+ entities.
+ * `service`: Populates `service` and `service.instance` entities described
+ [here](https://opentelemetry.io/docs/specs/semconv/registry/entities/service/).
+ * `env`: Populates entities based on [Entity Propagation](../entities/entity-propagation.md).

thompson-tomo commented Jun 12, 2026

Contributor

It would be nice if the spec could just point the reader to a resource detector registry in semconv. We could possibly generate it now using annotations, but something like open-telemetry/weaver#1230 would make it easier to define and more of a first-class definition.

Contributor Author

Agreed - I think that's follow-on work. For now - we DO need the specification to reserve these names as it's the "central truth" between the config spec and semconv.
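
As a rough picture of what reserving these names buys the config side, the detector list from the diff under review can be written as a lookup table that configuration is checked against. The table mirrors the names and entity types in the diff; `RESERVED_DETECTORS` and `unknown_detectors` are hypothetical names, not an existing API.

```python
from typing import Dict, List

# Detector name -> entity types it populates, as listed in the diff under review.
RESERVED_DETECTORS: Dict[str, List[str]] = {
    "container": ["container"],
    "host": ["host", "os"],
    "process": ["process"],
    "service": ["service", "service.instance"],
    # `env` populates whatever entities are propagated, so no fixed types are listed.
    "env": [],
}


def unknown_detectors(configured: List[str]) -> List[str]:
    """Return configured detector names that the specification does not reserve."""
    return [name for name in configured if name not in RESERVED_DETECTORS]
```

A semconv-hosted registry, as suggested above, would let a table like this be generated rather than maintained by hand.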
