Summary
The SDK provides usage extractors for OpenAI (extract_openai_usage) and Anthropic (extract_anthropic_usage) but has no equivalent for Cohere. Cohere uses a structurally different usage format (usage.billed_units and usage.tokens with nested objects, plus cached_tokens) that neither existing extractor can parse. Braintrust documents Cohere as a supported provider with chat, embeddings, and rerank tracing in other SDKs.
What is missing
Cohere's v2 Chat API returns a usage object with a different structure than OpenAI or Anthropic:
{
  "usage": {
    "billed_units": {
      "input_tokens": 50,
      "output_tokens": 25,
      "search_units": 3,
      "classifications": 0
    },
    "tokens": {
      "input_tokens": 62,
      "output_tokens": 30
    },
    "cached_tokens": 15
  }
}
Key differences from OpenAI/Anthropic formats:
- Nested structure: Token counts are inside usage.tokens and usage.billed_units sub-objects, not flat at usage level
- Billing-specific fields: billed_units.input_tokens differs from tokens.input_tokens because Cohere adds internal tokens users aren't charged for
- Search and classification units: search_units and classifications track non-token billing dimensions (relevant for RAG and classify endpoints)
- Cache field: cached_tokens is a flat integer at the usage level, not nested in a details sub-object
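To make the shape concrete, here is a minimal sketch of how the nested usage object might be modeled and mapped in Rust. Everything in it is an assumption for illustration: the type names (CohereUsage, CohereBilledUnits, CohereTokens, SketchMetrics) and the function extract_cohere_usage_sketch are hypothetical, SketchMetrics is only a stand-in for the SDK's UsageMetrics (whose real fields may differ), and a real extractor would presumably read from parsed JSON rather than from hand-built structs.

```rust
/// Hypothetical mirror of Cohere's `usage.billed_units` object.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct CohereBilledUnits {
    pub input_tokens: Option<u64>,
    pub output_tokens: Option<u64>,
    pub search_units: Option<u64>,
    pub classifications: Option<u64>,
}

/// Hypothetical mirror of Cohere's `usage.tokens` object.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct CohereTokens {
    pub input_tokens: Option<u64>,
    pub output_tokens: Option<u64>,
}

/// Hypothetical mirror of the whole `usage` object.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct CohereUsage {
    pub billed_units: Option<CohereBilledUnits>,
    pub tokens: Option<CohereTokens>,
    /// Flat integer at the `usage` level, not nested in a details object.
    pub cached_tokens: Option<u64>,
}

/// Stand-in for the SDK's metrics type (assumption: the real struct differs).
#[derive(Debug, Default, Clone, PartialEq)]
pub struct SketchMetrics {
    pub prompt_tokens: Option<u64>,
    pub completion_tokens: Option<u64>,
    pub total_tokens: Option<u64>,
    pub cached_tokens: Option<u64>,
    pub search_units: Option<u64>,
    pub classifications: Option<u64>,
}

/// Map the nested Cohere usage object onto flat metrics.
pub fn extract_cohere_usage_sketch(usage: &CohereUsage) -> SketchMetrics {
    let tokens = usage.tokens.as_ref();
    let billed = usage.billed_units.as_ref();

    // Prefer the raw counts in `usage.tokens`; fall back to the billed counts.
    let prompt = tokens
        .and_then(|t| t.input_tokens)
        .or_else(|| billed.and_then(|b| b.input_tokens));
    let completion = tokens
        .and_then(|t| t.output_tokens)
        .or_else(|| billed.and_then(|b| b.output_tokens));

    // Only report a total when both sides are known.
    let total = match (prompt, completion) {
        (Some(p), Some(c)) => Some(p + c),
        _ => None,
    };

    SketchMetrics {
        prompt_tokens: prompt,
        completion_tokens: completion,
        total_tokens: total,
        cached_tokens: usage.cached_tokens,
        search_units: billed.and_then(|b| b.search_units),
        classifications: billed.and_then(|b| b.classifications),
    }
}
```

Preferring usage.tokens over usage.billed_units is a design choice of this sketch, not something the issue prescribes; an implementation focused on cost tracking might reasonably prefer the billed counts or expose both.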
Passing a Cohere response through extract_openai_usage() would return empty metrics because the function looks for usage.prompt_tokens / usage.completion_tokens at the top level, which don't exist in Cohere's format.
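The mismatch can be illustrated with a small, dependency-free sketch. It assumes, as described above, that the OpenAI-style extractor reads usage.prompt_tokens and usage.completion_tokens directly under usage. The Json enum, num_at, and sample_cohere_response are hypothetical helpers written only for this example; the SDK itself presumably works with a real JSON library.

```rust
use std::collections::BTreeMap;

/// Tiny stand-in for a parsed JSON value (numbers and objects only).
#[derive(Debug, Clone, PartialEq)]
pub enum Json {
    Num(u64),
    Obj(BTreeMap<String, Json>),
}

impl Json {
    /// Walk a path of object keys and return the number at the end, if any.
    pub fn num_at(&self, path: &[&str]) -> Option<u64> {
        let mut cur = self;
        for key in path {
            cur = match cur {
                Json::Obj(map) => map.get(*key)?,
                Json::Num(_) => return None,
            };
        }
        match cur {
            Json::Num(n) => Some(*n),
            Json::Obj(_) => None,
        }
    }
}

/// Build an object from key/value pairs.
fn obj(entries: Vec<(&str, Json)>) -> Json {
    Json::Obj(entries.into_iter().map(|(k, v)| (k.to_string(), v)).collect())
}

/// A trimmed copy of the Cohere usage payload shown earlier in this issue.
pub fn sample_cohere_response() -> Json {
    obj(vec![(
        "usage",
        obj(vec![
            (
                "billed_units",
                obj(vec![
                    ("input_tokens", Json::Num(50)),
                    ("output_tokens", Json::Num(25)),
                ]),
            ),
            (
                "tokens",
                obj(vec![
                    ("input_tokens", Json::Num(62)),
                    ("output_tokens", Json::Num(30)),
                ]),
            ),
            ("cached_tokens", Json::Num(15)),
        ]),
    )])
}
```

With this payload, the OpenAI-style paths ["usage", "prompt_tokens"] and ["usage", "completion_tokens"] both resolve to None, while ["usage", "tokens", "input_tokens"] resolves to Some(62) and ["usage", "cached_tokens"] to Some(15).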
Cohere also has dedicated Embed API and Rerank API responses with their own usage structures that would need extraction support.
Braintrust docs status
supported — Braintrust's Cohere integration page documents: "instruments the native Cohere Python SDK so you can inspect prompts, responses, streaming behavior, embeddings, and rerank calls in Braintrust." Other Braintrust SDKs (Python, TypeScript) provide wrap_cohere() / wrapCohere() that capture token usage from all Cohere API surfaces.
Upstream sources
- … usage): https://docs.cohere.com/v2/reference/chat
Relationship to existing issues
- … usageMetadata with camelCase fields. This covers Cohere's nested usage.tokens / usage.billed_units structure, a different provider with a different response schema.
Local files inspected
- src/extractors.rs: only extract_openai_usage() and extract_anthropic_usage() exist; no Cohere extractor
- src/types.rs: UsageMetrics struct could represent Cohere token data if mapped, but no mapping exists; no fields for search_units or classifications
- src/stream.rs: stream aggregator only parses OpenAI Chat Completions chunk format
- src/lib.rs: public API exports; no Cohere references
- Full codebase grep for "cohere", "billed_units", "search_units": zero results