[bot] Streaming aggregator collapses multiple choices (n>1) into a single output #48

@braintrust-bot

Summary

The BraintrustStream streaming aggregator merges all choice indices into a single OutputChoice at index 0. When a Chat Completions request uses n > 1 to generate multiple parallel completions, the streamed chunks from different choice indices are concatenated into one string, producing a corrupted single output instead of preserving each choice separately.

What is missing

OpenAI Chat Completions streaming with n > 1 sends chunks tagged with a choices[].index field to distinguish parallel completions:

{"choices": [{"index": 0, "delta": {"role": "assistant", "content": "Hello"}}]}
{"choices": [{"index": 1, "delta": {"role": "assistant", "content": "Hi"}}]}
{"choices": [{"index": 0, "delta": {"content": " world"}}]}
{"choices": [{"index": 1, "delta": {"content": " there"}}]}

The expected aggregated output should be two choices:

  • Choice 0: "Hello world"
  • Choice 1: "Hi there"

Currently in src/stream.rs:727-807, the aggregate() function:

  1. Ignores the choice index: it loops over all chunk.choices (line 771) but accumulates into a single aggregated_content: String and a single role: Option<String>, regardless of which choice index the delta belongs to
  2. Hardcodes a single output: it creates one OutputChoice at index 0 (line 804) and wraps it in a single-element vector (line 806)
  3. Corrupts content: text from choice 0 and choice 1 is interleaved into one string in arrival order ("HelloHi world there" for the chunks above)
  4. Loses finish reasons: only the last finish_reason seen across all choices is kept (lines 773-777), so per-choice finish reasons are discarded
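The corruption in point 3 can be reproduced outside the codebase. The following is a minimal sketch, not the code in src/stream.rs: Delta, sample_deltas and aggregate_ignoring_index are hypothetical names, and the struct is a simplified stand-in for a parsed choice delta. It only illustrates what happens when every delta is appended to one buffer without consulting its index.

```rust
/// Hypothetical, simplified stand-in for one streamed choice delta.
/// This is NOT the real type from src/stream.rs.
#[allow(dead_code)] // `index` is deliberately never read, mirroring the bug
struct Delta {
    index: usize,
    content: &'static str,
}

/// The four example chunks from this issue, flattened to (index, content).
fn sample_deltas() -> Vec<Delta> {
    vec![
        Delta { index: 0, content: "Hello" },
        Delta { index: 1, content: "Hi" },
        Delta { index: 0, content: " world" },
        Delta { index: 1, content: " there" },
    ]
}

/// Mimics the behaviour described above: every delta lands in one buffer,
/// in arrival order, regardless of which choice it belongs to.
fn aggregate_ignoring_index(deltas: &[Delta]) -> String {
    let mut aggregated_content = String::new();
    for delta in deltas {
        aggregated_content.push_str(delta.content);
    }
    aggregated_content
}
```

Calling aggregate_ignoring_index(&sample_deltas()) yields one string in which the two completions are interleaved, rather than "Hello world" and "Hi there" as separate choices.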

The fix would involve:

  • Tracking per-index state (content, role, tool_calls, finish_reason) using a HashMap<usize, ChoiceState> or similar
  • Building one OutputChoice per distinct index seen in the stream
  • Returning all choices in the FinalizedStream::output vector
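One possible shape for the per-index tracking is sketched below. It is an illustration under assumptions, not a patch: ChoiceDelta, ChoiceState's fields, AggregatedChoice, delta and aggregate_by_index are hypothetical names that do not come from src/stream.rs, tool_calls are omitted for brevity, and a BTreeMap is used in place of the suggested HashMap so that the resulting choices come out ordered by index.

```rust
use std::collections::BTreeMap;

/// Hypothetical, simplified view of one entry of `choices[]` in a streamed chunk.
#[derive(Debug, Clone)]
struct ChoiceDelta {
    index: usize,
    role: Option<String>,
    content: Option<String>,
    finish_reason: Option<String>,
}

/// Per-index accumulator (tool_calls omitted in this sketch).
#[derive(Debug, Default)]
struct ChoiceState {
    role: Option<String>,
    content: String,
    finish_reason: Option<String>,
}

/// Hypothetical output shape; the real OutputChoice may differ.
#[derive(Debug, PartialEq)]
struct AggregatedChoice {
    index: usize,
    role: Option<String>,
    content: String,
    finish_reason: Option<String>,
}

/// Convenience constructor used only to keep examples short.
fn delta(index: usize, role: Option<&str>, content: Option<&str>, finish: Option<&str>) -> ChoiceDelta {
    ChoiceDelta {
        index,
        role: role.map(str::to_owned),
        content: content.map(str::to_owned),
        finish_reason: finish.map(str::to_owned),
    }
}

/// Accumulates each delta into the state for its own index and emits
/// one aggregated choice per distinct index, in ascending index order.
fn aggregate_by_index(deltas: &[ChoiceDelta]) -> Vec<AggregatedChoice> {
    let mut states: BTreeMap<usize, ChoiceState> = BTreeMap::new();
    for d in deltas {
        let state = states.entry(d.index).or_default();
        // Keep the first role seen for this index.
        if state.role.is_none() {
            state.role = d.role.clone();
        }
        if let Some(text) = &d.content {
            state.content.push_str(text);
        }
        // Keep the latest finish_reason reported for this index only.
        if d.finish_reason.is_some() {
            state.finish_reason = d.finish_reason.clone();
        }
    }
    states
        .into_iter()
        .map(|(index, s)| AggregatedChoice {
            index,
            role: s.role,
            content: s.content,
            finish_reason: s.finish_reason,
        })
        .collect()
}
```

Fed the four example chunks from this issue, the sketch returns two entries, one per index, each with its own content and finish reason. A HashMap would work equally well if the caller sorts by index before building the output vector.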

Braintrust docs status

unclear: Braintrust's streaming documentation states that streaming responses are fully supported and automatically aggregated, but the OpenAI integration docs do not specifically mention n > 1 handling. Streaming in general is documented as supported; multi-choice streaming specifically is not addressed.

Upstream sources

Relationship to existing issues

Local files inspected

  • src/stream.rs:727-807: aggregate() uses single aggregated_content, role, and finish_reason variables for all choices; creates one OutputChoice at index 0 on line 804
  • src/stream.rs:658-664: the StreamChoice struct has an implicit index from its array position but no explicit index field parsed from the JSON
  • src/stream.rs:354-395: the OutputChoice struct supports an index field, so the output type can represent multiple choices
  • src/stream.rs:530-540: FinalizedStream stores output: Vec<OutputChoice>, which is capable of holding multiple choices
