[draft][VoiceLive] Add support for built-in web search and file search tools#49060

Open
xitzhang wants to merge 5 commits into main from voicelive/ga-1.0.0

@xitzhang (Member) commented May 5, 2026

Description

Please add an informative description that covers the changes made by the pull request and link all relevant issues.

If an SDK is being regenerated based on a new swagger spec, a link to the pull request containing these swagger spec changes has been included above.

All SDK Contribution checklist:

  • The pull request does not introduce [breaking changes]
  • CHANGELOG is updated for new features, bug fixes or other significant changes.
  • I have read the contribution guidelines.

General Guidelines and Best Practices

  • Title of the pull request is clear and informative.
  • There are a small number of commits, each of which has an informative message. This means that previously merged commits do not appear in the history of the PR. For more information on cleaning up the commits in your PR, see this page.

Testing Guidelines

  • Pull request includes test coverage for the included changes.

Xiting Zhang added 2 commits May 5, 2026 14:03
First General Availability release of the Azure VoiceLive client library for Java.

Highlights:

- Avatar voice sync (AzureVoiceType.AVATAR_VOICE_SYNC, AzureAvatarVoiceSyncVoice), avatar lifecycle events (session.avatar.switch_to_speaking/idle), response.video.delta, and output_audio_buffer.clear/cleared events.

- Web search and file search tool calls (ResponseWebSearchCallItem, ResponseFileSearchCallItem with FileSearchResult) plus full searching/in_progress/completed lifecycle server events.

- Transcription enhancements: TranscriptionPhrase / TranscriptionWord with timing and confidence, getLogprobs() / getPhrases() on transcription-completed, response.audio_transcript.annotation.added event, and new gpt-4o-transcribe-diarize / mai-transcribe-1 models.

- Reasoning token usage (OutputTokenDetails.getReasoningTokens) and per-request interim response (ResponseCreateParams.setInterimResponse).

- Session include options (SessionIncludeOption) and metadata on VoiceLiveSessionOptions / VoiceLiveSessionResponse.

- Personal voice catalog: added DRAGON_HDOMNI_LATEST_NEURAL and MAI_VOICE_1, removed PHOENIX_V2NEURAL.

- Version bumped to 1.0.0 in pom.xml, README.md, eng/versioning/version_client.txt, and CHANGELOG.md.

- Added unit tests for the new GA model classes and lifecycle events (49 new test cases, all passing).
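
The searching / in_progress / completed lifecycle for web search and file search calls can be pictured as a small per-item state tracker. The following is a minimal sketch in plain Java, not SDK code: `SearchCallTracker`, `Phase`, and the item IDs are hypothetical names, and treating `COMPLETED` as terminal is an assumption of the sketch rather than documented service behavior.

```java
import java.util.HashMap;
import java.util.Map;

public class SearchCallTracker {
    /** Hypothetical phases mirroring the searching / in_progress / completed events. */
    public enum Phase { IN_PROGRESS, SEARCHING, COMPLETED }

    private final Map<String, Phase> phaseByItemId = new HashMap<>();

    /** Records the latest phase for an item; once COMPLETED, later events are ignored. */
    public void onEvent(String itemId, Phase phase) {
        phaseByItemId.merge(itemId, phase,
            (current, incoming) -> current == Phase.COMPLETED ? current : incoming);
    }

    /** Returns the last recorded phase, or null if the item was never seen. */
    public Phase phaseOf(String itemId) {
        return phaseByItemId.get(itemId);
    }

    public boolean isCompleted(String itemId) {
        return phaseByItemId.get(itemId) == Phase.COMPLETED;
    }
}
```

In a real client, the call to `onEvent` would presumably be driven by the corresponding server event models; how those expose an item identifier is not shown in this PR summary, so the sketch takes it as a plain string.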
Copilot AI review requested due to automatic review settings May 5, 2026 21:13

Copilot AI left a comment

Pull request overview

This PR updates the azure-ai-voicelive package toward its 1.0.0 GA shape by regenerating the VoiceLive models from a newer spec and adding support for newly surfaced VoiceLive features in the Java SDK.

Changes:

  • Adds new VoiceLive model/event types for web search, file search, avatar/video, transcription enhancements, session include options, interim responses, and reasoning token usage.
  • Updates voice/transcription enums, adds AzureAvatarVoiceSyncVoice, and refreshes generated Javadoc/comments across several model types.
  • Bumps the package toward 1.0.0 and updates README / changelog / tests to reflect the expanded API surface.

Reviewed changes

Copilot reviewed 55 out of 55 changed files in this pull request and generated 3 comments.

File Description
sdk/voicelive/azure-ai-voicelive/tsp-location.yaml Updates the pinned TypeSpec/spec commit used for generation.
sdk/voicelive/azure-ai-voicelive/src/test/java/com/azure/ai/voicelive/models/WebAndFileSearchTest.java Adds unit tests for new web/file search items and lifecycle events.
sdk/voicelive/azure-ai-voicelive/src/test/java/com/azure/ai/voicelive/models/TranscriptionAndIncludeOptionsTest.java Adds tests for transcription models, include/metadata, annotations, and interim response support.
sdk/voicelive/azure-ai-voicelive/src/test/java/com/azure/ai/voicelive/models/AzureAvatarVoiceSyncVoiceTest.java Adds tests for the new avatar voice-sync model.
sdk/voicelive/azure-ai-voicelive/src/test/java/com/azure/ai/voicelive/models/AvatarAndAudioBufferEventsTest.java Adds tests for avatar lifecycle and output-audio-buffer events.
sdk/voicelive/azure-ai-voicelive/src/samples/java/com/azure/ai/voicelive/ReadmeSamples.java Updates sample comments for personal voice models.
sdk/voicelive/azure-ai-voicelive/src/main/java/com/azure/ai/voicelive/models/VoiceLiveSessionResponse.java Adds include/metadata serialization and accessors to session responses.
sdk/voicelive/azure-ai-voicelive/src/main/java/com/azure/ai/voicelive/models/VoiceLiveSessionOptions.java Adds include/metadata serialization and accessors to session options.
sdk/voicelive/azure-ai-voicelive/src/main/java/com/azure/ai/voicelive/models/TranscriptionWord.java New generated transcription word model.
sdk/voicelive/azure-ai-voicelive/src/main/java/com/azure/ai/voicelive/models/TranscriptionPhrase.java New generated transcription phrase model.
sdk/voicelive/azure-ai-voicelive/src/main/java/com/azure/ai/voicelive/models/SessionUpdateConversationItemInputAudioTranscriptionCompleted.java Adds phrase/logprob fields to transcription-completed events.
sdk/voicelive/azure-ai-voicelive/src/main/java/com/azure/ai/voicelive/models/SessionUpdate.java Registers new server event discriminators.
sdk/voicelive/azure-ai-voicelive/src/main/java/com/azure/ai/voicelive/models/SessionResponseItem.java Registers new response item discriminators.
sdk/voicelive/azure-ai-voicelive/src/main/java/com/azure/ai/voicelive/models/SessionIncludeOption.java New expandable enum for include options.
sdk/voicelive/azure-ai-voicelive/src/main/java/com/azure/ai/voicelive/models/ServerVadTurnDetection.java Improves generated Javadoc text.
sdk/voicelive/azure-ai-voicelive/src/main/java/com/azure/ai/voicelive/models/ServerEventType.java Adds new server event constants.
sdk/voicelive/azure-ai-voicelive/src/main/java/com/azure/ai/voicelive/models/ServerEventSessionAvatarSwitchToSpeaking.java New avatar-speaking server event model.
sdk/voicelive/azure-ai-voicelive/src/main/java/com/azure/ai/voicelive/models/ServerEventSessionAvatarSwitchToIdle.java New avatar-idle server event model.
sdk/voicelive/azure-ai-voicelive/src/main/java/com/azure/ai/voicelive/models/ServerEventResponseWebSearchCallSearching.java New web-search searching event model.
sdk/voicelive/azure-ai-voicelive/src/main/java/com/azure/ai/voicelive/models/ServerEventResponseWebSearchCallInProgress.java New web-search in-progress event model.
sdk/voicelive/azure-ai-voicelive/src/main/java/com/azure/ai/voicelive/models/ServerEventResponseWebSearchCallCompleted.java New web-search completed event model.
sdk/voicelive/azure-ai-voicelive/src/main/java/com/azure/ai/voicelive/models/ServerEventResponseVideoDelta.java New streamed video delta event model.
sdk/voicelive/azure-ai-voicelive/src/main/java/com/azure/ai/voicelive/models/ServerEventResponseFileSearchCallSearching.java New file-search searching event model.
sdk/voicelive/azure-ai-voicelive/src/main/java/com/azure/ai/voicelive/models/ServerEventResponseFileSearchCallInProgress.java New file-search in-progress event model.
sdk/voicelive/azure-ai-voicelive/src/main/java/com/azure/ai/voicelive/models/ServerEventResponseFileSearchCallCompleted.java New file-search completed event model.
sdk/voicelive/azure-ai-voicelive/src/main/java/com/azure/ai/voicelive/models/ServerEventResponseAudioTranscriptAnnotationAdded.java New transcript annotation-added event model.
sdk/voicelive/azure-ai-voicelive/src/main/java/com/azure/ai/voicelive/models/ServerEventOutputAudioBufferCleared.java New output-audio-buffer cleared event model.
sdk/voicelive/azure-ai-voicelive/src/main/java/com/azure/ai/voicelive/models/ResponseWebSearchCallItemStatus.java New web-search item status enum.
sdk/voicelive/azure-ai-voicelive/src/main/java/com/azure/ai/voicelive/models/ResponseWebSearchCallItem.java New web-search response item model.
sdk/voicelive/azure-ai-voicelive/src/main/java/com/azure/ai/voicelive/models/ResponseFileSearchCallItemStatus.java New file-search item status enum.
sdk/voicelive/azure-ai-voicelive/src/main/java/com/azure/ai/voicelive/models/ResponseFileSearchCallItem.java New file-search response item model.
sdk/voicelive/azure-ai-voicelive/src/main/java/com/azure/ai/voicelive/models/ResponseCreateParams.java Adds per-response interim response support.
sdk/voicelive/azure-ai-voicelive/src/main/java/com/azure/ai/voicelive/models/PersonalVoiceModels.java Removes one older model constant and adds new GA model constants.
sdk/voicelive/azure-ai-voicelive/src/main/java/com/azure/ai/voicelive/models/OutputTokenDetails.java Adds reasoning token deserialization/accessor.
sdk/voicelive/azure-ai-voicelive/src/main/java/com/azure/ai/voicelive/models/ItemType.java Adds new response item type constants.
sdk/voicelive/azure-ai-voicelive/src/main/java/com/azure/ai/voicelive/models/FileSearchResult.java New file search result model.
sdk/voicelive/azure-ai-voicelive/src/main/java/com/azure/ai/voicelive/models/ClientEventType.java Adds output-audio-buffer clear event constant.
sdk/voicelive/azure-ai-voicelive/src/main/java/com/azure/ai/voicelive/models/ClientEventOutputAudioBufferClear.java New client event for clearing output audio buffer.
sdk/voicelive/azure-ai-voicelive/src/main/java/com/azure/ai/voicelive/models/ClientEvent.java Registers the new client event discriminator.
sdk/voicelive/azure-ai-voicelive/src/main/java/com/azure/ai/voicelive/models/AzureVoiceType.java Adds avatar voice-sync discriminator value.
sdk/voicelive/azure-ai-voicelive/src/main/java/com/azure/ai/voicelive/models/AzureVoice.java Registers the new avatar voice subtype.
sdk/voicelive/azure-ai-voicelive/src/main/java/com/azure/ai/voicelive/models/AzureStandardVoice.java Improves generated Javadoc text.
sdk/voicelive/azure-ai-voicelive/src/main/java/com/azure/ai/voicelive/models/AzureSemanticVadTurnDetectionMultilingual.java Improves generated Javadoc text.
sdk/voicelive/azure-ai-voicelive/src/main/java/com/azure/ai/voicelive/models/AzureSemanticVadTurnDetectionEn.java Improves generated Javadoc text.
sdk/voicelive/azure-ai-voicelive/src/main/java/com/azure/ai/voicelive/models/AzureSemanticVadTurnDetection.java Improves generated Javadoc text.
sdk/voicelive/azure-ai-voicelive/src/main/java/com/azure/ai/voicelive/models/AzurePersonalVoice.java Improves generated Javadoc text.
sdk/voicelive/azure-ai-voicelive/src/main/java/com/azure/ai/voicelive/models/AzureCustomVoice.java Improves generated Javadoc text.
sdk/voicelive/azure-ai-voicelive/src/main/java/com/azure/ai/voicelive/models/AzureAvatarVoiceSyncVoice.java New avatar voice-sync model and JSON mapping.
sdk/voicelive/azure-ai-voicelive/src/main/java/com/azure/ai/voicelive/models/AudioInputTranscriptionOptionsModel.java Adds new transcription model constants.
sdk/voicelive/azure-ai-voicelive/src/main/java/com/azure/ai/voicelive/models/AudioInputTranscriptionOptions.java Updates model documentation text.
sdk/voicelive/azure-ai-voicelive/README.md Updates package version and sample model comments.
sdk/voicelive/azure-ai-voicelive/pom.xml Bumps artifact version to 1.0.0.
sdk/voicelive/azure-ai-voicelive/CHANGELOG.md Rewrites release notes for the 1.0.0 GA release.
eng/versioning/version_client.txt Updates central versioning entry for the package.

Comment thread sdk/voicelive/azure-ai-voicelive/CHANGELOG.md Outdated
Xiting Zhang added 2 commits May 5, 2026 14:41
…eaks

- Add 4 unit tests for SessionUpdateConversationItemInputAudioTranscriptionCompleted covering the new logprobs and phrases arrays (full payload, backward-compat without arrays, JSON round-trip, polymorphic dispatch via SessionUpdate).

- Trim CHANGELOG sections that don't apply to the GA delta from 1.0.0-beta.6.

- Add HDOMNI and SSML to the workspace cspell dictionary.
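
Two of the test scenarios listed above, polymorphic dispatch and backward compatibility when the new arrays are absent, can be illustrated with a minimal sketch in plain Java. This is not the generated SDK code: `DiscriminatorDispatchSketch`, its nested types, the `"transcription.completed"` discriminator value, and the `Map`-based payload are all hypothetical stand-ins, and defaulting a missing array to an empty list is a choice made for the sketch (the real generated model may return `null` instead).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

public class DiscriminatorDispatchSketch {
    /** Hypothetical base type; stands in for a polymorphic event hierarchy. */
    public interface Event {
        String type();
    }

    /** Hypothetical event carrying an optional list of phrases. */
    public static final class TranscriptionCompleted implements Event {
        private final List<String> phrases;

        TranscriptionCompleted(List<String> phrases) {
            this.phrases = phrases;
        }

        @Override
        public String type() {
            return "transcription.completed"; // made-up discriminator value
        }

        public List<String> phrases() {
            return phrases;
        }
    }

    /** Fallback for discriminator values the sketch does not recognize. */
    public static final class UnknownEvent implements Event {
        private final String type;

        UnknownEvent(String type) {
            this.type = type;
        }

        @Override
        public String type() {
            return type;
        }
    }

    /** Picks the concrete type from the "type" field of an already-parsed payload. */
    public static Event fromMap(Map<String, Object> payload) {
        String type = String.valueOf(payload.get("type"));
        if ("transcription.completed".equals(type)) {
            List<String> phrases = new ArrayList<>();
            Object raw = payload.get("phrases"); // null when an older payload omits the array
            if (raw instanceof List) {
                for (Object element : (List<?>) raw) {
                    phrases.add(String.valueOf(element));
                }
            }
            return new TranscriptionCompleted(Collections.unmodifiableList(phrases));
        }
        return new UnknownEvent(type);
    }
}
```

Routing unrecognized discriminator values to a fallback type, rather than throwing, keeps the sketch tolerant of event types added later.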
@xitzhang (Member, Author) commented May 5, 2026

/azp run java - azure-ai-voicelive - tests

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).
