Summary
The Cohere v2 streaming chat integration silently drops citation-start and citation-end SSE events during stream aggregation. When users call client.v2.chat_stream() with the documents parameter for RAG, the resulting Braintrust trace contains the model's text output but zero citation data. Additionally, the documents and citation_options request parameters are not captured in span metadata.
This is an asymmetry within the Cohere integration itself: non-streaming v2 chat (client.v2.chat()) returns the full message object including any citations array, but streaming v2 chat loses all citation information.
What is missing
1. Streaming citation events dropped
_aggregate_chat_stream() in py/src/braintrust/integrations/cohere/tracing.py (lines 459–529) handles these v2 event types:
| Event type |
Handled? |
message-start |
Yes (extracts id, role) |
content-delta |
Yes (accumulates text) |
tool-call-start |
Yes |
tool-call-delta |
Yes |
message-end |
Yes (extracts finish_reason, usage) |
citation-start |
No — silently dropped |
citation-end |
No — silently dropped |
Each citation-start event contains the citation's start/end character indices, the cited text, and the list of source document references. These fall through all the if event_type == ... / continue blocks and are discarded.
2. RAG request parameters not in metadata
_CHAT_METADATA_KEYS (line 34) does not include:
documents — the RAG context documents passed to the model
citation_options — controls citation generation mode (fast vs accurate)
These are generative-execution-relevant parameters that directly affect model output quality and citation behavior.
Comparison with other integrations
| Integration |
RAG/grounding metadata captured? |
Citation/grounding in streaming output? |
| Google GenAI |
Yes (google_search tool config in metadata, grounding metadata in output) |
Yes (grounding metadata extracted from final streaming chunk) |
| OpenAI Responses |
Yes (web search tool config, annotations in streaming output) |
Yes (annotation.added events aggregated) |
| Anthropic |
Yes (server tool use metrics + tool spans for search results) |
Yes (accumulated via accumulate_event()) |
| Cohere |
No (documents/citation_options missing from metadata) |
No (citation events dropped in streaming) |
Minimum fix
- Add
citation-start handling to _aggregate_chat_stream() to accumulate citations into the output dict (e.g., output["citations"] = [...])
- Add
documents and citation_options to _CHAT_METADATA_KEYS
- Add VCR-backed test for v2 streaming chat with
documents and citation_options
Braintrust docs status
not_found — Cohere is not listed on the Braintrust integrations directory and no Cohere-specific docs page mentions RAG citation support.
Upstream sources
Local files inspected
py/src/braintrust/integrations/cohere/tracing.py:
_CHAT_METADATA_KEYS (line 34) — missing documents and citation_options
_aggregate_chat_stream() (lines 459–529) — handles message-start, content-delta, tool-call-start, tool-call-delta, message-end; no handling for citation-start or citation-end
_chat_output() (line 263) — for non-streaming v2, returns the full message object (citations preserved); for streaming, returns the manually aggregated dict (citations lost)
py/src/braintrust/integrations/cohere/test_cohere.py — no RAG/citation test cases
py/src/braintrust/integrations/cohere/cassettes/latest/test_wrap_cohere_chat_stream_v2_sync.yaml — v2 streaming cassette shows citations data in the response but no test assertions on citation content
Summary
The Cohere v2 streaming chat integration silently drops
citation-startandcitation-endSSE events during stream aggregation. When users callclient.v2.chat_stream()with thedocumentsparameter for RAG, the resulting Braintrust trace contains the model's text output but zero citation data. Additionally, thedocumentsandcitation_optionsrequest parameters are not captured in span metadata.This is an asymmetry within the Cohere integration itself: non-streaming v2 chat (
client.v2.chat()) returns the fullmessageobject including anycitationsarray, but streaming v2 chat loses all citation information.What is missing
1. Streaming citation events dropped
_aggregate_chat_stream()inpy/src/braintrust/integrations/cohere/tracing.py(lines 459–529) handles these v2 event types:message-startid,role)content-deltatool-call-starttool-call-deltamessage-endfinish_reason,usage)citation-startcitation-endEach
citation-startevent contains the citation'sstart/endcharacter indices, the cited text, and the list of source document references. These fall through all theif event_type == .../continueblocks and are discarded.2. RAG request parameters not in metadata
_CHAT_METADATA_KEYS(line 34) does not include:documents— the RAG context documents passed to the modelcitation_options— controls citation generation mode (fastvsaccurate)These are generative-execution-relevant parameters that directly affect model output quality and citation behavior.
Comparison with other integrations
google_searchtool config in metadata, grounding metadata in output)annotation.addedevents aggregated)accumulate_event())documents/citation_optionsmissing from metadata)Minimum fix
citation-starthandling to_aggregate_chat_stream()to accumulate citations into the output dict (e.g.,output["citations"] = [...])documentsandcitation_optionsto_CHAT_METADATA_KEYSdocumentsandcitation_optionsBraintrust docs status
not_found — Cohere is not listed on the Braintrust integrations directory and no Cohere-specific docs page mentions RAG citation support.
Upstream sources
citation-startevent contains: start/end indices, cited text, source document referencescitation-endevent marks the end of each citation blockcitation_optionsparameter supportsmode: "fast"(inline) andmode: "accurate"(post-generation)Local files inspected
py/src/braintrust/integrations/cohere/tracing.py:_CHAT_METADATA_KEYS(line 34) — missingdocumentsandcitation_options_aggregate_chat_stream()(lines 459–529) — handlesmessage-start,content-delta,tool-call-start,tool-call-delta,message-end; no handling forcitation-startorcitation-end_chat_output()(line 263) — for non-streaming v2, returns the fullmessageobject (citations preserved); for streaming, returns the manually aggregated dict (citations lost)py/src/braintrust/integrations/cohere/test_cohere.py— no RAG/citation test casespy/src/braintrust/integrations/cohere/cassettes/latest/test_wrap_cohere_chat_stream_v2_sync.yaml— v2 streaming cassette showscitationsdata in the response but no test assertions on citation content