feat: Multi-Speaker Dialogue, 28 Emotional Styles #29
Open
huan-zz3 wants to merge 10 commits into
Add comprehensive project documentation tailored for AI agents and developers. Includes project overview, directory structure, code map, and development conventions.
- Define core stack (TypeScript, WebSocket, Jest, pnpm)
- Map tasks to specific file locations
- List code symbols and their roles
- Document TypeScript and testing configurations
- Specify anti-patterns and unique styles (SSML, WebSocket)
- Include SSML reference documentation for speech synthesis
This ensures consistent understanding of the codebase architecture and constraints during automated development or refactoring.

Add new TypeScript examples to the example directory to demonstrate core library features and API usage patterns.
- Add .gitignore to exclude sensitive config and generated audio files
- Add dialogue demo showing SSML structure for multi-role conversation
- Add text substitution demo showcasing professional term handling
These scripts serve as reference implementations for API integration and help users verify functionality with real-world scenarios.
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent) Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
- Delete test-multi-speaker-demo.ts (non-standard location)
- Add src/AGENTS.md for core source code documentation
- Update root AGENTS.md with complete project structure
- Add new source files: DialogueBuilder.ts, DialogueTurn.ts, SSMLUtils.ts

Update project documentation to reflect new API capabilities and conventions.
- Add DialogueBuilder class and interfaces (DialogueTurn, TextSegment)
- Document SSML utilities (escapeSSML, validateStyle, validateStyleDegree)
- Update project overview with current code scale and feature list
- Add sections for error handling, logging, and SSML processing conventions
- List specific error scenarios and anti-patterns for contributors

- Renamed 6 example TypeScript files to English names
- Updated example/README.md with new filenames
- All example files now use English naming convention
- Git history preserved via git mv

- Translated DialogueTurn.ts: interfaces and class comments
- Translated SSMLUtils.ts: function and constant comments
- Translated DialogueBuilder.ts: class and method comments
- Standardized terminology (Dialogue, Turn, Substitution, SSML, etc.)
- All Chinese characters removed from src/ JSDoc comments
- Build verification: pnpm run build passes

- Translated all 6 example TypeScript files
- Updated all comments, console messages, and error messages to English
- Task 7: 00-simple-dialogue-demo.ts - Chinese SSML retained (multilingual demo)
- Task 8: 01-multi-speaker-dialogue-chained.ts - Chinese dialogue retained (multilingual demo)
- Task 9: 02-multi-speaker-dialogue-functional.ts - Chinese dialogue retained (multilingual demo)
- Task 10: 03-31-emotional-styles-demo.ts - Changed to English examples
- Task 11: 04-style-degree-control-demo.ts - Changed to English examples
- Task 12: 05-text-substitution-demo.ts - Changed substitution examples to English
- All output filenames updated to English
- Build verification: pnpm run build passes

Wave 4 - Documentation Translation:
- example/run.sh: Translated all shell comments and echo messages
- example/README.md: Translated complete example documentation
- AGENTS.md: Translated project knowledge base (184 lines)
- docs/ssml-structure.md: Translated SSML structure documentation (252 lines)
- docs/ssml-voice.md: Translated SSML voice documentation (226 lines)
- docs/ssml-pronunciation.md: Translated SSML pronunciation docs (199 lines)
All documentation now in English with:
- Technical documentation style
- Accurate SSML terminology
- Microsoft documentation attribution retained
- Build verification: pnpm run build passes
Additional translation - src/ directory knowledge base
Owner
I like the idea. I could maybe add a utils feature that could provide this.
Author
Thanks for the guidance. I will make the changes as you require.
🙏 First PR Notice
✨ New Features Added
1. Multi-Speaker Dialogue Support
Build multi-speaker conversations effortlessly with a chainable API:
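The snippet that originally followed this line is not present here, so the following is only a minimal sketch of what a chainable dialogue builder could look like. `DialogueSketch`, `say`, `toSSML`, and `escapeXml` are hypothetical names chosen for illustration and are not the PR's `DialogueBuilder` API, whose signatures are not shown in this conversation. The voice names are assumed to be available Azure neural voices.

```typescript
// Hypothetical sketch: NOT the PR's DialogueBuilder, whose API is not shown here.
interface Turn {
  voice: string; // assumed to be a valid Azure neural voice name
  text: string;
}

// Escape the five XML special characters so spoken text cannot break the SSML.
function escapeXml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

class DialogueSketch {
  private readonly turns: Turn[] = [];

  // Returning `this` is what makes the calls chainable.
  say(voice: string, text: string): this {
    this.turns.push({ voice, text });
    return this;
  }

  // Emit one <voice> element per turn inside a single <speak> root.
  toSSML(lang: string = "en-US"): string {
    const body = this.turns
      .map((t) => `<voice name="${escapeXml(t.voice)}">${escapeXml(t.text)}</voice>`)
      .join("");
    return (
      `<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="${lang}">` +
      body +
      `</speak>`
    );
  }
}

const dialogueSsml = new DialogueSketch()
  .say("en-US-JennyNeural", "Did the build pass?")
  .say("en-US-GuyNeural", "Yes, tests & lint are green.")
  .toSSML();
console.log(dialogueSsml);
```

Each speaker change becomes its own `<voice>` element, which is how SSML switches voices within one synthesis request.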
2. 28 Emotional Styles
Full support for Microsoft Azure's official emotional styles:
Supported Styles:
advertisement_upbeat, affectionate, angry, assistant,
calm, chat, cheerful, customerservice,
depressed, documentary-narration, empathetic, excited,
fearful, friendly, gentle, hopeful,
lyrical, narration-professional, narration-relaxed,
newscast, newscast-casual, newscast-formal,
poetry-reading, sad, serious, shouting,
sports_commentary, sports_commentary_excited,
terrified, unfriendly, whispering
3. Style Degree Control
Fine-tune emotional intensity from 0.01 to 2.0:
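As a rough illustration of how a style and its degree end up in SSML, here is a minimal sketch built on Azure's `mstts:express-as` element and its `style` and `styledegree` attributes. `SAMPLE_STYLES`, `checkStyleDegree`, and `expressAs` are hypothetical helpers for this example only. The PR's own `validateStyle` and `validateStyleDegree` are not shown here and may behave differently. The text is assumed to be already XML-escaped, and the enclosing `<speak>` element is assumed to declare `xmlns:mstts="https://www.w3.org/2001/mstts"`.

```typescript
// Hypothetical helpers for illustration; not the PR's validateStyle / validateStyleDegree.
// Only a small subset of the supported styles is listed here.
const SAMPLE_STYLES: string[] = ["cheerful", "sad", "whispering", "newscast-formal"];

// Reject degrees outside the 0.01 to 2.0 range stated in the PR description.
function checkStyleDegree(degree: number): number {
  if (!isFinite(degree) || degree < 0.01 || degree > 2) {
    throw new RangeError(`styledegree must be between 0.01 and 2, got ${degree}`);
  }
  return degree;
}

// Wrap already-escaped text in an express-as element with the given style and degree.
function expressAs(style: string, degree: number, escapedText: string): string {
  if (SAMPLE_STYLES.indexOf(style) === -1) {
    throw new Error(`unknown style: ${style}`);
  }
  const d = checkStyleDegree(degree);
  return `<mstts:express-as style="${style}" styledegree="${d}">${escapedText}</mstts:express-as>`;
}

console.log(expressAs("cheerful", 1.5, "We shipped it!"));
```

Validating before building the string means an out-of-range degree fails locally with a clear error instead of being sent to the service.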
4. Text Substitution
Replace abbreviations and technical terms with full pronunciations:
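One way to express such replacements in SSML is the standard `<sub alias="...">` element, which keeps the written form while telling the synthesizer what to say. The sketch below is an assumption about how a substitution pass could work, not the PR's implementation. `substitute`, `escapeForSsml`, and `escapeRegExp` are hypothetical names, and terms are assumed to begin and end with word characters so that `\b` boundaries match.

```typescript
// Hypothetical sketch of a substitution pass; not the PR's implementation.
function escapeForSsml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Escape regex metacharacters so a term is matched literally.
function escapeRegExp(s: string): string {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Wrap every known term in <sub alias="...">; escape everything else.
function substitute(text: string, aliases: Record<string, string>): string {
  // Longest terms first so a longer term wins over a shorter one it starts with.
  const terms = Object.keys(aliases).sort((a, b) => b.length - a.length);
  if (terms.length === 0) return escapeForSsml(text);

  const pattern = new RegExp(`\\b(${terms.map(escapeRegExp).join("|")})\\b`, "g");

  // split() with one capture group alternates plain text (even index) and matched term (odd index).
  return text
    .split(pattern)
    .map((part, i) =>
      i % 2 === 1
        ? `<sub alias="${escapeForSsml(aliases[part])}">${escapeForSsml(part)}</sub>`
        : escapeForSsml(part)
    )
    .join("");
}

console.log(
  substitute("SSML & TTS are supported", {
    SSML: "Speech Synthesis Markup Language",
    TTS: "text to speech",
  })
);
```

Escaping the plain-text parts separately from the generated tags keeps characters such as `&` from invalidating the document while leaving the `<sub>` markup intact.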
5. Comprehensive Examples
6 ready-to-run examples demonstrating all features:
- 00-simple-dialogue-demo.ts
- 01-multi-speaker-dialogue-chained.ts
- 02-multi-speaker-dialogue-functional.ts
- 03-31-emotional-styles-demo.ts
- 04-style-degree-control-demo.ts
- 05-text-substitution-demo.ts