EP: content citation and attribution telemetry

## Summary

`org.openattribution.telemetry` is a vendor extension that tracks which content influenced AI agent outcomes in commerce conversations. It embeds an `attribution` object in UCP checkout sessions, recording which content was retrieved and cited during the conversation that led to a purchase.

This starts as a vendor extension under the `org.openattribution.*` namespace, per UCP's governance model for new use cases.

## Motivation

Content creators — reviewers, guide authors, comparison sites — invest significantly in producing the expert content that AI shopping agents rely on to generate recommendations. This content is the raw material of AI-assisted commerce: without reviews, there are no credible recommendations; without guides, there is no informed comparison.

Yet creators currently have **zero visibility** into whether their work influences purchases. This is a market failure:

1. **No measurement exists.** There is no standardised way to track which content was retrieved and cited in the conversation that led to a purchase.
2. **Attribution is impossible.** Commerce outcomes (purchases, cart additions) cannot be linked to specific content pieces, so creators cannot demonstrate ROI to partners or advertisers.
3. **Privacy is unaddressed.** Without a purpose-built signal, ad-hoc tracking solutions will emerge that compromise user privacy rather than preserving it.
4. **Multi-session journeys are invisible.** Research, comparison, and purchase often happen across separate conversations over days. Without temporal signals on content retrieval, these journeys are invisible to attribution.

Without measurement, the rational response from creators is to restrict AI access to their content — threatening the content supply chain that the entire AI commerce ecosystem depends on. Attribution telemetry is the feedback loop that makes the ecosystem sustainable.

### Value to UCP

- **First-mover advantage.** UCP becomes the first commerce protocol with native content attribution, differentiating it from competitors.
- **Merchant insight.** Merchants gain content partnership ROI data they cannot get from any other source, enabling data-driven content investment.
- **Zero-risk adoption.** The extension is additive and non-blocking: checkout proceeds normally whether or not either party supports it, so merchants and agents can adopt incrementally.

### Goals

* Define an optional attribution extension that embeds content citation data in UCP checkout sessions
* Support privacy-preserving data sharing with lightweight conversation context (`turn_count`, `topics`)
* Enable multi-conversation attribution: agents accumulate content retrieved in prior conversations into the final checkout's `content_retrieved` array, and the timestamps on each entry provide the temporal signal
* Support negative attribution via `contradiction` citation type (content retrieved but disagreed with)
* Provide citation quality signals (`citation_type`, `excerpt_tokens`, `position`, `content_hash`) for weighted attribution
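
The multi-conversation goal can be illustrated with a minimal sketch of how an agent might accumulate retrieval entries before checkout. The function name `accumulate_retrieved`, the `example.com` URLs, and the rule of dropping only exact duplicates are assumptions made for illustration; the extension itself only says that entries accumulate and carry timestamps.

```python
from datetime import datetime


def parse_ts(value: str) -> datetime:
    """Parse an ISO 8601 timestamp such as '2026-01-15T10:30:01Z'."""
    # Older Python versions reject a trailing 'Z', so normalise it first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def accumulate_retrieved(*conversations: list[dict]) -> list[dict]:
    """Merge per-conversation retrieval logs into one time-ordered list.

    Assumption: exact duplicates (same URL and timestamp) are dropped;
    repeat retrievals at different times stay as separate entries.
    """
    seen: set[tuple[str, str]] = set()
    merged: list[dict] = []
    for entries in conversations:
        for entry in entries:
            key = (entry["content_url"], entry["timestamp"])
            if key not in seen:
                seen.add(key)
                merged.append(entry)
    return sorted(merged, key=lambda e: parse_ts(e["timestamp"]))


# Hypothetical research conversation, three days before the purchase.
research = [
    {"content_url": "https://example.com/guide", "timestamp": "2026-01-12T09:00:00Z"}
]
# Hypothetical purchase conversation.
purchase = [
    {"content_url": "https://example.com/review", "timestamp": "2026-01-15T10:30:01Z"}
]

content_retrieved = accumulate_retrieved(purchase, research)
```

Sorting by timestamp keeps the research-to-purchase order visible to consumers without exposing any session identifiers.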

### Non-Goals

* Defining specific attribution algorithms (left to implementers)
* Mandating payment structures or compensation models
* Requiring specific privacy policies (left to agreements between parties)

## Detailed Design

### Capability Declaration

```json
{
  "capabilities": [
    {
      "name": "org.openattribution.telemetry",
      "version": "2026-02-17",
      "spec": "https://github.com/openattribution-org/telemetry/blob/main/ucp/EXTENSION.md",
      "schema": "https://raw.githubusercontent.com/openattribution-org/telemetry/main/ucp/extension-schema.json",
      "extends": "dev.ucp.shopping.checkout"
    }
  ]
}
```

### Checkout Extension

Adds an `attribution` object to UCP checkout sessions:

```json
{
  "id": "chk_123",
  "line_items": ["..."],
  "attribution": {
    "content_scope": "electronics-reviews",
    "content_retrieved": [
      {
        "content_url": "https://www.wirecutter.com/reviews/best-wireless-headphones",
        "timestamp": "2026-01-15T10:30:01Z"
      }
    ],
    "content_cited": [
      {
        "content_url": "https://www.wirecutter.com/reviews/best-wireless-headphones",
        "timestamp": "2026-01-15T10:30:05Z",
        "citation_type": "paraphrase",
        "excerpt_tokens": 85,
        "position": "primary"
      }
    ],
    "conversation_summary": {
      "turn_count": 3,
      "topics": ["headphones", "noise-cancelling"]
    }
  }
}
```

For complete field definitions, citation types, and implementation notes, see [EXTENSION.md](https://github.com/openattribution-org/telemetry/blob/main/ucp/EXTENSION.md).
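
As an illustration, an agent could assemble the payload above from typed records. This is a sketch only: the class names are hypothetical and are not taken from the reference SDK, and it models only the fields that appear in the example, without asserting which of them the schema marks as required.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class RetrievedContent:
    content_url: str
    timestamp: str  # ISO 8601, as in the example payload


@dataclass
class CitedContent:
    content_url: str
    timestamp: str
    citation_type: str  # e.g. "paraphrase"
    excerpt_tokens: int
    position: str  # e.g. "primary"


@dataclass
class ConversationSummary:
    turn_count: int
    topics: list[str] = field(default_factory=list)


@dataclass
class Attribution:
    content_scope: str
    content_retrieved: list[RetrievedContent]
    content_cited: list[CitedContent]
    conversation_summary: ConversationSummary


url = "https://www.wirecutter.com/reviews/best-wireless-headphones"
attribution = Attribution(
    content_scope="electronics-reviews",
    content_retrieved=[RetrievedContent(url, "2026-01-15T10:30:01Z")],
    content_cited=[
        CitedContent(url, "2026-01-15T10:30:05Z", "paraphrase", 85, "primary")
    ],
    conversation_summary=ConversationSummary(3, ["headphones", "noise-cancelling"]),
)

# asdict() recurses into nested dataclasses and lists, giving plain JSON types.
checkout = {"id": "chk_123", "line_items": ["..."], "attribution": asdict(attribution)}
print(json.dumps(checkout, indent=2))
```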

### Standalone Capability

For use cases beyond checkout — browse sessions, multi-agent attribution chains, conversation analytics — the extension also defines a standalone REST/MCP capability with session lifecycle endpoints. See [`org.openattribution.telemetry.yaml`](https://github.com/openattribution-org/telemetry/blob/main/ucp/org.openattribution.telemetry.yaml) for the full specification.

The two approaches share the same underlying schema and are interoperable: a session tracked via standalone endpoints can link to a UCP checkout via `checkout_id` in the session outcome.

### Negotiation

* When both agent and merchant declare `org.openattribution.telemetry`, the full bidirectional attribution flow applies.
* When only one party supports it, the exchange degrades gracefully: checkout proceeds normally, because attribution is additive, not blocking.
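
The negotiation rule can be sketched as a small guard on the agent side. The helper names `declares` and `build_checkout` are hypothetical, the profile dicts simply mirror the capability declaration shown earlier, and version matching is deliberately left out; how profiles are actually exchanged is defined by UCP, not here.

```python
CAPABILITY = "org.openattribution.telemetry"


def declares(profile: dict, name: str = CAPABILITY) -> bool:
    """True if the profile's capability list contains `name`."""
    return any(cap.get("name") == name for cap in profile.get("capabilities", []))


def build_checkout(base: dict, attribution: dict, agent: dict, merchant: dict) -> dict:
    """Attach `attribution` only when both parties declare the extension."""
    checkout = dict(base)  # shallow copy so the caller's dict is untouched
    if declares(agent) and declares(merchant):
        checkout["attribution"] = attribution
    return checkout


supporting = {"capabilities": [{"name": CAPABILITY, "version": "2026-02-17"}]}
not_supporting = {"capabilities": []}
base = {"id": "chk_123", "line_items": ["..."]}
attribution = {"content_scope": "electronics-reviews", "content_retrieved": []}

both = build_checkout(base, attribution, supporting, supporting)
one_sided = build_checkout(base, attribution, supporting, not_supporting)
```

In the one-sided case the checkout is returned unchanged, which is the graceful degradation described above.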

## Risks and Mitigations

**Privacy risk:** Conversation data could leak through attribution signals.
* *Mitigation:* `conversation_summary` is limited to `turn_count` and `topics` — no raw text, no subjective classification. `content_scope` must be opaque (not PII). `external_id` must be hashed, not raw PII.
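
A minimal sketch of both privacy constraints follows. The extension requires that `external_id` be hashed but, as far as this proposal states, does not prescribe a scheme; the keyed SHA-256 (HMAC) used here, the secret value, and the helper names are assumptions chosen for illustration.

```python
import hashlib
import hmac


def hash_external_id(raw_id: str, secret: bytes) -> str:
    """Return a keyed SHA-256 digest so the raw identifier is never shared."""
    # Assumption: a keyed hash is used so that low-entropy identifiers
    # (emails, account numbers) cannot be recovered by dictionary lookup.
    return hmac.new(secret, raw_id.encode("utf-8"), hashlib.sha256).hexdigest()


def conversation_summary(turns: list[str], topics: list[str]) -> dict:
    """Keep only the two permitted fields; the turn text itself is discarded."""
    return {"turn_count": len(turns), "topics": sorted(set(topics))}


secret = b"hypothetical-agent-side-secret"
external_id = hash_external_id("user-42@example.com", secret)
summary = conversation_summary(
    ["Which headphones?", "Noise-cancelling ones.", "Buy the first."],
    ["headphones", "noise-cancelling", "headphones"],
)
```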

**Adoption risk:** Vendor extension may not gain traction.
* *Mitigation:* Open-source reference implementation (Apache 2.0), Python SDK, and reference server lower the barrier.

**Schema evolution risk:** Breaking changes could fragment implementations.
* *Mitigation:* The schema carries a version field (`0.4`) and capabilities use date-based versions (`2026-02-17`). The deprecation policy requires six months' notice. All nested objects allow additional properties for forward compatibility.

### Fraud Mitigation

**Signal integrity risk:** Agents could report false citations or inflated quality signals.

Mitigations:
- `content_hash` (SHA-256) provides an integrity audit trail — agents hash whatever content they processed, useful for dispute resolution when attribution involves commission payments
- Citation quality signals (`citation_type`, `excerpt_tokens`, `position`) are agent-reported metadata, not trusted assertions — consumers apply their own confidence weighting
- Cross-reference `content_retrieved` timestamps against `content_cited` timestamps: an agent cannot legitimately cite content before retrieving it
- Rate limiting and anomaly detection at the consumer level (unusual citation volume, repeated patterns, suspiciously high `excerpt_tokens`)
- The extension provides attribution *signals*, not attribution *decisions* — the consuming platform is responsible for fraud detection in its attribution model
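
Two of the mitigations above lend themselves to a short consumer-side sketch: computing a SHA-256 `content_hash` and cross-referencing citation timestamps against retrieval timestamps. The function names are hypothetical, the hex encoding of the hash is an assumption (the extension spec may define a different encoding), and flagging citations that have no matching retrieval entry at all is an additional assumption beyond the ordering rule.

```python
import hashlib
from datetime import datetime


def content_hash(body: bytes) -> str:
    """SHA-256 of the content bytes the agent actually processed (hex, assumed)."""
    return hashlib.sha256(body).hexdigest()


def parse_ts(value: str) -> datetime:
    """Parse an ISO 8601 timestamp, accepting a trailing 'Z'."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def citations_without_prior_retrieval(attribution: dict) -> list[dict]:
    """Return cited entries not preceded by a retrieval of the same URL."""
    first_seen: dict[str, datetime] = {}
    for entry in attribution.get("content_retrieved", []):
        ts = parse_ts(entry["timestamp"])
        url = entry["content_url"]
        if url not in first_seen or ts < first_seen[url]:
            first_seen[url] = ts

    suspicious = []
    for cite in attribution.get("content_cited", []):
        retrieved_at = first_seen.get(cite["content_url"])
        # Flag citations that precede retrieval, or (assumption) lack one entirely.
        if retrieved_at is None or parse_ts(cite["timestamp"]) < retrieved_at:
            suspicious.append(cite)
    return suspicious


url = "https://www.wirecutter.com/reviews/best-wireless-headphones"
plausible = {
    "content_retrieved": [{"content_url": url, "timestamp": "2026-01-15T10:30:01Z"}],
    "content_cited": [{"content_url": url, "timestamp": "2026-01-15T10:30:05Z"}],
}
implausible = {
    "content_retrieved": [{"content_url": url, "timestamp": "2026-01-15T10:30:05Z"}],
    "content_cited": [{"content_url": url, "timestamp": "2026-01-15T10:30:01Z"}],
}
```

A non-empty result is a signal for the consumer's own confidence weighting, not a verdict; the decision stays with the consuming platform.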

## Test Plan

**Unit tests:**
* Schema validation for all models (session, event, outcome, conversation turn)
* `content_retrieved` required with `minItems: 1` enforcement
* Citation type and position enum validation
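
The checks listed above might look like the following hand-rolled sketch, which avoids depending on a JSON Schema library. The function name `validate_attribution` is hypothetical, and the allowed-value sets passed in are placeholders containing only values mentioned in this proposal; the authoritative enums live in the published schema.

```python
def validate_attribution(
    attribution: dict, citation_types: set[str], positions: set[str]
) -> list[str]:
    """Return a list of human-readable errors; an empty list means valid."""
    errors: list[str] = []

    # content_retrieved is required and must satisfy minItems: 1.
    retrieved = attribution.get("content_retrieved")
    if not isinstance(retrieved, list) or len(retrieved) < 1:
        errors.append("content_retrieved must be an array with at least one entry")

    # Enum checks apply only when the field is present, because this sketch
    # does not assume which citation fields the schema marks as required.
    for i, cite in enumerate(attribution.get("content_cited", [])):
        if "citation_type" in cite and cite["citation_type"] not in citation_types:
            errors.append(f"content_cited[{i}].citation_type is not an allowed value")
        if "position" in cite and cite["position"] not in positions:
            errors.append(f"content_cited[{i}].position is not an allowed value")
    return errors


# Placeholder subsets: only values that appear in this proposal.
CITATION_TYPES = {"paraphrase", "contradiction"}
POSITIONS = {"primary"}

valid = {
    "content_retrieved": [
        {"content_url": "https://example.com/review", "timestamp": "2026-01-15T10:30:01Z"}
    ],
    "content_cited": [{"citation_type": "paraphrase", "position": "primary"}],
}
empty = {"content_retrieved": []}
bad_enum = {
    "content_retrieved": valid["content_retrieved"],
    "content_cited": [{"citation_type": "made-up", "position": "primary"}],
}
```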

**Integration tests:**
* Checkout extension: `attribution` object in checkout request/response
* Multi-conversation attribution via accumulated `content_retrieved` entries with timestamps
* Capability negotiation: graceful degradation when only one party supports the extension

**End-to-end tests:**
* Shopping conversation with content retrieval, citation, and purchase
* Privacy level enforcement across the full flow

## Graduation Criteria

**Working Draft to Candidate:**

- [ ] Schema validation passing against all published examples
- [ ] Reference implementation with ≥80% test coverage
- [ ] At least two independent implementations (one agent, one merchant)
- [ ] 3-month feedback period with no unresolved blocking issues
- [ ] TC majority vote to advance

**Candidate to Stable:**

- [ ] At least three production deployments processing real attribution data
- [ ] Interoperability testing across at least 2 agents × 2 merchants
- [ ] Published migration guide for version upgrades
- [ ] TC majority vote to advance

### Implementation History

* 2026-02-11: Initial vendor extension specification drafted
* 2026-02-15: Reference implementation published (Python SDK, FastAPI server, JSON Schemas)
* 2026-02-17: Checkout extension updated: removed `prior_session_ids` from the checkout payload (privacy concern; agents accumulate content into `content_retrieved` instead), reduced `conversation_summary` to `turn_count` and `topics` only, and made `content_retrieved` required with `minItems: 1`

## References

* [openattribution-org/telemetry](https://github.com/openattribution-org/telemetry) (repository)
* [SPECIFICATION.md](https://github.com/openattribution-org/telemetry/blob/main/SPECIFICATION.md) (OpenAttribution Telemetry v0.4)
* [schema.json](https://github.com/openattribution-org/telemetry/blob/main/schema.json) (JSON Schema)
* [ucp/EXTENSION.md](https://github.com/openattribution-org/telemetry/blob/main/ucp/EXTENSION.md) (UCP checkout extension spec)

