Summary
The SDK instruments OpenAI chat completions, Responses API, and moderations for both the openai (official) and ruby-openai gems, but does not instrument any of the Audio APIs — transcription (speech-to-text), speech synthesis (text-to-speech), or translation. These are stable generative execution APIs that use AI models (Whisper for transcription/translation, TTS models for speech) and return structured results with usage metrics.
What is missing
openai gem (official)
Three resource classes under OpenAI::Resources::Audio:
- client.audio.transcriptions.create — Speech-to-text using Whisper. Accepts audio file + model, returns text transcription with optional token/segment detail.
- client.audio.speech.create — Text-to-speech using TTS models (tts-1, tts-1-hd). Accepts text input + voice + model, returns audio binary.
- client.audio.translations.create — Audio translation to English using Whisper. Accepts audio file + model, returns translated text.
Source: lib/openai/resources/audio/ in openai/openai-ruby contains transcriptions.rb, speech.rb, translations.rb.
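For reference, a minimal call to each surface with the official gem looks roughly like the sketch below. The client construction follows the gem's README; the exact upload types and return shapes for the audio endpoints are assumptions based on the resource methods listed above, not verified behavior.

```ruby
require "openai"
require "pathname"

# The official gem reads OPENAI_API_KEY from the environment by default.
client = OpenAI::Client.new

# Speech-to-text: audio file in, transcription object with text out.
transcription = client.audio.transcriptions.create(
  file: Pathname("meeting.mp3"),
  model: "whisper-1"
)
puts transcription.text

# Text-to-speech: text in, binary audio out (container depends on response_format).
speech = client.audio.speech.create(
  input: "Hello from the audio API",
  model: "tts-1",
  voice: "alloy"
)

# Translation: non-English audio in, English text out.
translation = client.audio.translations.create(
  file: Pathname("interview_fr.mp3"),
  model: "whisper-1"
)
puts translation.text
```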
ruby-openai gem
Equivalent methods:
- client.audio.transcribe(parameters: {...})
- client.audio.speech(parameters: {...})
- client.audio.translate(parameters: {...})
Source: Documented under the "Whisper" section in the ruby-openai README.
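Per that README, usage looks like the following (parameter names as documented there; treat the file-handling details as illustrative):

```ruby
require "openai"

client = OpenAI::Client.new(access_token: ENV["OPENAI_API_KEY"])

# Speech-to-text
response = client.audio.transcribe(
  parameters: { model: "whisper-1", file: File.open("meeting.mp3", "rb") }
)
puts response["text"]

# Text-to-speech: the response body is the raw audio, suitable for binwrite.
audio = client.audio.speech(
  parameters: { model: "tts-1", input: "Hello!", voice: "alloy" }
)
File.binwrite("hello.mp3", audio)

# Translation to English
response = client.audio.translate(
  parameters: { model: "whisper-1", file: File.open("interview_fr.mp3", "rb") }
)
puts response["text"]
```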
Expected instrumentation
New patchers (similar to the existing ModerationsPatcher) for each audio surface:
Transcription spans should capture:
- Input: audio file reference, language hint
- Metadata: model (e.g., whisper-1), response_format, language, provider, endpoint
- Metrics: duration (audio length), tokens if available
- Output: transcription text
Speech spans should capture:
- Input: text to synthesize
- Metadata: model (e.g., tts-1), voice, speed, response_format, provider, endpoint
- Output: audio format/size metadata (not the binary audio itself)
Translation spans should capture the same fields as transcription.
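A minimal sketch of the shape such a patcher could take for the official gem follows. The Braintrust span helpers shown (Braintrust.start_span, span.log, span.finish) and the module path are hypothetical stand-ins; a real implementation would reuse whatever helpers ModerationsPatcher and common.rb already provide.

```ruby
# HYPOTHETICAL sketch: span helper names and module paths are illustrative,
# not the SDK's actual internal API.
module Braintrust
  module Contrib
    module OpenAI
      module Instrumentation
        # Intended to be prepended onto OpenAI::Resources::Audio::Transcriptions
        # so that #create is wrapped in a span, mirroring the existing patchers.
        module Transcriptions
          def create(**params)
            span = ::Braintrust.start_span(
              name: "openai.audio.transcriptions.create",
              type: "llm"
            )
            result = super
            span.log(
              input: { file: params[:file].to_s, language: params[:language] },
              metadata: {
                model: params[:model],
                response_format: params[:response_format],
                provider: "openai",
                endpoint: "/v1/audio/transcriptions"
              },
              # Log the transcribed text, never the raw audio bytes.
              output: result.text
            )
            result
          ensure
            span&.finish
          end
        end
      end
    end
  end
end
```

A speech patcher would differ only in what it logs: the synthesized text as input, and format/size metadata rather than the binary response as output.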
Braintrust docs status
not_found — Braintrust docs at https://www.braintrust.dev/docs/instrument/trace-llm-calls list both openai and ruby-openai as supported Ruby libraries but do not mention audio API instrumentation. All examples focus on chat completions.
Upstream sources
Local repo files inspected
- lib/braintrust/contrib/openai/integration.rb — registers ChatPatcher, ResponsesPatcher, ModerationsPatcher only; no audio patchers
- lib/braintrust/contrib/openai/instrumentation/ — contains chat.rb, responses.rb, moderations.rb, common.rb; no audio files
- lib/braintrust/contrib/ruby_openai/integration.rb — registers ChatPatcher, ResponsesPatcher, ModerationsPatcher only; no audio patchers
- lib/braintrust/contrib/ruby_openai/instrumentation/ — contains chat.rb, responses.rb, moderations.rb, common.rb; no audio files
- Grep for audio, transcri, speech, whisper across lib/braintrust/ returns zero matches