
[bot] No Cohere usage extraction for chat, embeddings, or rerank responses #49

@braintrust-bot

Summary

The SDK provides usage extractors for OpenAI (extract_openai_usage) and Anthropic (extract_anthropic_usage) but has no equivalent for Cohere. Cohere uses a structurally different usage format (usage.billed_units and usage.tokens with nested objects, plus cached_tokens) that neither existing extractor can parse. Braintrust documents Cohere as a supported provider with chat, embeddings, and rerank tracing in other SDKs.

What is missing

Cohere's v2 Chat API returns a usage object with a different structure than OpenAI or Anthropic:

{
  "usage": {
    "billed_units": {
      "input_tokens": 50,
      "output_tokens": 25,
      "search_units": 3,
      "classifications": 0
    },
    "tokens": {
      "input_tokens": 62,
      "output_tokens": 30
    },
    "cached_tokens": 15
  }
}

Key differences from OpenAI/Anthropic formats:

  1. Nested structure: Token counts are inside usage.tokens and usage.billed_units sub-objects, not flat at usage level
  2. Billing-specific fields: billed_units.input_tokens differs from tokens.input_tokens because Cohere adds internal tokens users aren't charged for
  3. Search and classification units: search_units and classifications track non-token billing dimensions (relevant for RAG and classify endpoints)
  4. Cache field: cached_tokens is a flat integer at the usage level, not nested in a details sub-object

Passing a Cohere response through extract_openai_usage() would return empty metrics because the function looks for usage.prompt_tokens / usage.completion_tokens at the top level, which don't exist in Cohere's format.
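For illustration, the nested layout above could be flattened along these lines. This is a minimal sketch, not the SDK's implementation: `FlatUsage`, `CohereUsage`, `CohereTokenCounts`, `CohereBilledUnits`, and `extract_cohere_usage_sketch` are hypothetical names, the real `UsageMetrics` struct in src/types.rs may have different fields, and preferring `usage.tokens` over `usage.billed_units` for token counts is an assumption rather than documented Braintrust behavior. It uses plain structs so it stays free of external crates; a real extractor would presumably read these fields from the parsed JSON response.

```rust
/// Hypothetical flat metrics type; the SDK's real `UsageMetrics` may differ.
#[derive(Debug, Default, PartialEq)]
pub struct FlatUsage {
    pub prompt_tokens: Option<u64>,
    pub completion_tokens: Option<u64>,
    pub total_tokens: Option<u64>,
    pub cached_tokens: Option<u64>,
    pub search_units: Option<u64>,
    pub classifications: Option<u64>,
}

/// Mirrors the `usage.tokens` sub-object.
#[derive(Debug, Default, Clone, Copy)]
pub struct CohereTokenCounts {
    pub input_tokens: Option<u64>,
    pub output_tokens: Option<u64>,
}

/// Mirrors the `usage.billed_units` sub-object.
#[derive(Debug, Default, Clone, Copy)]
pub struct CohereBilledUnits {
    pub input_tokens: Option<u64>,
    pub output_tokens: Option<u64>,
    pub search_units: Option<u64>,
    pub classifications: Option<u64>,
}

/// Mirrors the top-level `usage` object from the example response.
#[derive(Debug, Default)]
pub struct CohereUsage {
    pub billed_units: Option<CohereBilledUnits>,
    pub tokens: Option<CohereTokenCounts>,
    pub cached_tokens: Option<u64>,
}

/// Flattens a Cohere usage object into `FlatUsage`.
/// Assumption: raw counts in `tokens` win; `billed_units` is only a fallback.
pub fn extract_cohere_usage_sketch(usage: &CohereUsage) -> FlatUsage {
    let tokens = usage.tokens.unwrap_or_default();
    let billed = usage.billed_units.unwrap_or_default();

    let prompt = tokens.input_tokens.or(billed.input_tokens);
    let completion = tokens.output_tokens.or(billed.output_tokens);
    // Only report a total when both sides are known.
    let total = match (prompt, completion) {
        (Some(p), Some(c)) => Some(p + c),
        _ => None,
    };

    FlatUsage {
        prompt_tokens: prompt,
        completion_tokens: completion,
        total_tokens: total,
        cached_tokens: usage.cached_tokens,
        // Non-token billing dimensions exist only under `billed_units`.
        search_units: billed.search_units,
        classifications: billed.classifications,
    }
}
```

Keeping every field as an `Option` lets an absent sub-object surface as missing metrics rather than as zeros, which matters because `classifications: 0` in the example above is a real reported value.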

Cohere also has dedicated Embed API and Rerank API responses with their own usage structures that would need extraction support.

Braintrust docs status

supported — Braintrust's Cohere integration page documents: "instruments the native Cohere Python SDK so you can inspect prompts, responses, streaming behavior, embeddings, and rerank calls in Braintrust." Other Braintrust SDKs (Python, TypeScript) provide wrap_cohere() / wrapCohere() that capture token usage from all Cohere API surfaces.

Local files inspected

  • src/extractors.rs — only extract_openai_usage() and extract_anthropic_usage() exist; no Cohere extractor
  • src/types.rs — UsageMetrics struct could represent Cohere token data if mapped, but no mapping exists; no fields for search_units or classifications
  • src/stream.rs — stream aggregator only parses OpenAI Chat Completions chunk format
  • src/lib.rs — public API exports; no Cohere references
  • Full codebase grep for "cohere", "billed_units", "search_units" — zero results
