Skip to content

[bot] Cohere: v2 streaming chat drops RAG citations (citation-start/citation-end events) and omits documents/citation_options from metadata #386

@braintrust-bot

Description

@braintrust-bot

Summary

The Cohere v2 streaming chat integration silently drops citation-start and citation-end SSE events during stream aggregation. When users call client.v2.chat_stream() with the documents parameter for RAG, the resulting Braintrust trace contains the model's text output but zero citation data. Additionally, the documents and citation_options request parameters are not captured in span metadata.

This is an asymmetry within the Cohere integration itself: non-streaming v2 chat (client.v2.chat()) returns the full message object including any citations array, but streaming v2 chat loses all citation information.

What is missing

1. Streaming citation events dropped

_aggregate_chat_stream() in py/src/braintrust/integrations/cohere/tracing.py (lines 459–529) handles these v2 event types:

Event type Handled?
message-start Yes (extracts id, role)
content-delta Yes (accumulates text)
tool-call-start Yes
tool-call-delta Yes
message-end Yes (extracts finish_reason, usage)
citation-start No — silently dropped
citation-end No — silently dropped

Each citation-start event contains the citation's start/end character indices, the cited text, and the list of source document references. These fall through all the if event_type == ... / continue blocks and are discarded.

2. RAG request parameters not in metadata

_CHAT_METADATA_KEYS (line 34) does not include:

  • documents — the RAG context documents passed to the model
  • citation_options — controls citation generation mode (fast vs accurate)

These are generative-execution-relevant parameters that directly affect model output quality and citation behavior.

Comparison with other integrations

Integration RAG/grounding metadata captured? Citation/grounding in streaming output?
Google GenAI Yes (google_search tool config in metadata, grounding metadata in output) Yes (grounding metadata extracted from final streaming chunk)
OpenAI Responses Yes (web search tool config, annotations in streaming output) Yes (annotation.added events aggregated)
Anthropic Yes (server tool use metrics + tool spans for search results) Yes (accumulated via accumulate_event())
Cohere No (documents/citation_options missing from metadata) No (citation events dropped in streaming)

Minimum fix

  1. Add citation-start handling to _aggregate_chat_stream() to accumulate citations into the output dict (e.g., output["citations"] = [...])
  2. Add documents and citation_options to _CHAT_METADATA_KEYS
  3. Add VCR-backed test for v2 streaming chat with documents and citation_options

Braintrust docs status

not_found — Cohere is not listed on the Braintrust integrations directory and no Cohere-specific docs page mentions RAG citation support.

Upstream sources

Local files inspected

  • py/src/braintrust/integrations/cohere/tracing.py:
    • _CHAT_METADATA_KEYS (line 34) — missing documents and citation_options
    • _aggregate_chat_stream() (lines 459–529) — handles message-start, content-delta, tool-call-start, tool-call-delta, message-end; no handling for citation-start or citation-end
    • _chat_output() (line 263) — for non-streaming v2, returns the full message object (citations preserved); for streaming, returns the manually aggregated dict (citations lost)
  • py/src/braintrust/integrations/cohere/test_cohere.py — no RAG/citation test cases
  • py/src/braintrust/integrations/cohere/cassettes/latest/test_wrap_cohere_chat_stream_v2_sync.yaml — v2 streaming cassette shows citations data in the response but no test assertions on citation content

Metadata

Metadata

Type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions