feat: Multi-Speaker Dialogue, 28 Emotional Styles by huan-zz3 · Pull Request #29 · Migushthe2nd/MsEdgeTTS

huan-zz3 · 2026-03-22T11:59:17Z

🙏 First PR Notice

🌟 This is my first Pull Request ever! 🌟

I'm incredibly excited (and a bit nervous) to be contributing to this project. While I've done my best to ensure everything is perfect, I understand there might be areas that need improvement.

If you find anything:

- Incomplete translations
- Awkward phrasing
- Missing edge cases
- Better ways to structure things
- Any issues whatsoever

Please don't hesitate to let me know! I'm committed to making this PR as good as it needs to be. I'll respond quickly to any feedback and make changes until you're completely satisfied with the result.

Your guidance and review would mean a lot to me as I learn the contribution process. Thank you for your time and consideration! 🙏

✨ New Features Added

1. Multi-Speaker Dialogue Support

Build multi-speaker conversations with either a chainable builder or a functional API:

import { DialogueBuilder, buildDialogueSSML } from "msedge-tts";

// Method 1: Chained builder
const dialogue = new DialogueBuilder()
  .addTurn({
    voice: "zh-CN-XiaoxiaoNeural",
    text: "Hello everyone!",
    style: "cheerful"
  })
  .addTurn({
    voice: "en-US-AndrewNeural",
    text: "Welcome to our podcast!",
    lang: "en-US"
  })
  .build();

// Method 2: Functional API
const ssml = buildDialogueSSML([
  { voice: "zh-CN-YunxiNeural", text: "Today we discuss AI", style: "documentary-narration" },
  { voice: "en-US-AriaNeural", text: "That's right!", style: "excited", lang: "en-US" }
]);
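For readers unfamiliar with SSML, the following is a minimal sketch of the kind of markup a multi-speaker builder could emit: one `<voice>` element per turn, with optional `<mstts:express-as>` and `<lang>` wrappers. It is not the PR's implementation. `Turn`, `escapeXml`, and `sketchDialogueSSML` are hypothetical names, and the inline escaping helper is only a stand-in for a proper XML escaping library.

```typescript
// Hypothetical turn shape; field names mirror the examples above.
interface Turn {
  voice: string;
  text: string;
  style?: string;
  lang?: string;
}

// Stand-in for an XML escaping library: escapes the five XML special characters.
function escapeXml(s: string): string {
  return s
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Assumed output layout: <speak> wrapping one <voice> element per turn.
function sketchDialogueSSML(turns: Turn[]): string {
  const body = turns
    .map((t) => {
      let inner = escapeXml(t.text);
      if (t.lang) {
        inner = `<lang xml:lang="${escapeXml(t.lang)}">${inner}</lang>`;
      }
      if (t.style) {
        inner = `<mstts:express-as style="${escapeXml(t.style)}">${inner}</mstts:express-as>`;
      }
      return `<voice name="${escapeXml(t.voice)}">${inner}</voice>`;
    })
    .join("");
  return (
    '<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" ' +
    'xmlns:mstts="https://www.w3.org/2001/mstts" xml:lang="en-US">' +
    body +
    "</speak>"
  );
}

const sketch = sketchDialogueSSML([
  { voice: "zh-CN-XiaoxiaoNeural", text: "Hello everyone!", style: "cheerful" },
  { voice: "en-US-AndrewNeural", text: "Welcome to our podcast!", lang: "en-US" },
]);
console.log(sketch);
```

Whether the service accepts a given voice, style, or tag combination still has to be verified against the live endpoint; the sketch only shows the document structure.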

2. 28 Emotional Styles

Full support for Microsoft Azure's official emotional styles:

const ssml = buildDialogueSSML([
  { voice: "zh-CN-XiaomoNeural", text: "I'm so happy!", style: "cheerful" },
  { voice: "zh-CN-XiaoxiaoNeural", text: "Welcome to customer service", style: "customerservice" },
  { voice: "en-US-JennyNeural", text: "Breaking news!", style: "newscast-formal", lang: "en-US" }
]);

Supported Styles:

advertisement_upbeat, affectionate, angry, assistant
calm, chat, cheerful, customerservice
depressed, documentary-narration, empathetic, excited
fearful, friendly, gentle, hopeful
lyrical, narration-professional, narration-relaxed
newscast, newscast-casual, newscast-formal
poetry-reading, sad, serious, shouting
sports_commentary, sports_commentary_excited
terrified, unfriendly, whispering
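One way to guard against typos in style names is to check them against the list above before building any SSML. The sketch below copies the 31 names listed above into a constant; `SUPPORTED_STYLES` and `isSupportedStyle` are hypothetical names and may not match the PR's `validateStyle` helper. Membership in this list does not guarantee that a given voice or endpoint accepts the style.

```typescript
// Style names copied from the list above (hypothetical constant name).
const SUPPORTED_STYLES = [
  "advertisement_upbeat", "affectionate", "angry", "assistant",
  "calm", "chat", "cheerful", "customerservice",
  "depressed", "documentary-narration", "empathetic", "excited",
  "fearful", "friendly", "gentle", "hopeful",
  "lyrical", "narration-professional", "narration-relaxed",
  "newscast", "newscast-casual", "newscast-formal",
  "poetry-reading", "sad", "serious", "shouting",
  "sports_commentary", "sports_commentary_excited",
  "terrified", "unfriendly", "whispering",
] as const;

type Style = (typeof SUPPORTED_STYLES)[number];

// Type guard: narrows an arbitrary string to a known style name (case-sensitive).
function isSupportedStyle(name: string): name is Style {
  return (SUPPORTED_STYLES as readonly string[]).indexOf(name) !== -1;
}

console.log(isSupportedStyle("cheerful"));
console.log(isSupportedStyle("Cheerful"));
```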

3. Style Degree Control

Fine-tune emotional intensity from 0.01 to 2.0:

const ssml = buildDialogueSSML([
  { 
    voice: "zh-CN-XiaomoNeural", 
    text: "This is normal",
    style: "sad",
    styleDegree: 0.5  // Weaker emotion
  },
  { 
    voice: "zh-CN-XiaomoNeural", 
    text: "This is heartbreaking!",
    style: "sad",
    styleDegree: 2.0  // Strongest emotion
  }
]);
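In Azure SSML the intensity is carried by the `styledegree` attribute of `<mstts:express-as>`. A minimal sketch of range checking and rendering, assuming the 0.01 to 2.0 range stated above; `checkStyleDegree` and `expressAs` are hypothetical helpers, not the PR's `validateStyleDegree`:

```typescript
// Reject values outside the documented 0.01 to 2.0 range (hypothetical helper).
function checkStyleDegree(degree: number): number {
  if (!Number.isFinite(degree) || degree < 0.01 || degree > 2) {
    throw new RangeError(`styleDegree must be between 0.01 and 2, got ${degree}`);
  }
  return degree;
}

// Assumes `style` is an already validated style name and `escapedText`
// has already been XML-escaped by the caller.
function expressAs(style: string, degree: number, escapedText: string): string {
  const d = checkStyleDegree(degree);
  return `<mstts:express-as style="${style}" styledegree="${d}">${escapedText}</mstts:express-as>`;
}

console.log(expressAs("sad", 0.5, "This is normal"));
console.log(expressAs("sad", 2.0, "This is heartbreaking!"));
```

Throwing on out-of-range input is one design choice; clamping to the nearest bound would be an equally reasonable alternative.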

4. Text Substitution

Replace abbreviations and technical terms with full pronunciations:

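const ssml = buildDialogueSSML([
  {
    voice: "zh-CN-XiaoxiaoNeural",
    // English: "W3C defines Web standards; APIs are built on the HTTP protocol"
    text: "W3C制定了 Web 标准，API 基于 HTTP 协议",
    substitutions: [
      { text: "W3C", alias: "万维网联盟" },      // "World Wide Web Consortium"
      { text: "Web", alias: "万维网" },          // "World Wide Web"
      { text: "HTTP", alias: "超文本传输协议" }  // "Hypertext Transfer Protocol"
    ],
    style: "narration-professional"
  }
]);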
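Substitutions of this kind map naturally onto the standard SSML `<sub alias="...">` element. The sketch below shows one possible single-pass implementation that avoids rewriting text inside tags it has already emitted. It is an illustration under assumptions, not the PR's code: `Substitution`, `applySubstitutions`, and the helper functions are hypothetical, and the inline `escapeXml` is a stand-in for an XML escaping library.

```typescript
interface Substitution {
  text: string;  // the written form to match
  alias: string; // the form to be spoken
}

// Stand-in for an XML escaping library.
function escapeXml(s: string): string {
  return s
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Escape characters that have special meaning in a regular expression.
function escapeRegExp(s: string): string {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Scan the text once, wrapping each match in <sub> and escaping everything else.
function applySubstitutions(text: string, subs: Substitution[]): string {
  const usable = subs.filter((s) => s.text.length > 0);
  if (usable.length === 0) return escapeXml(text);

  const aliasFor = new Map<string, string>();
  usable.forEach((s) => aliasFor.set(s.text, s.alias));

  // Longest keys first so that "HTTPS" would win over "HTTP".
  const keys = usable
    .map((s) => s.text)
    .sort((a, b) => b.length - a.length)
    .map(escapeRegExp);
  const pattern = new RegExp(keys.join("|"), "g");

  let out = "";
  let last = 0;
  let m: RegExpExecArray | null;
  while ((m = pattern.exec(text)) !== null) {
    out += escapeXml(text.slice(last, m.index));
    const alias = aliasFor.get(m[0]) as string;
    out += `<sub alias="${escapeXml(alias)}">${escapeXml(m[0])}</sub>`;
    last = m.index + m[0].length;
  }
  return out + escapeXml(text.slice(last));
}

console.log(
  applySubstitutions("W3C and HTTP", [
    { text: "W3C", alias: "World Wide Web Consortium" },
    { text: "HTTP", alias: "Hypertext Transfer Protocol" },
  ])
);
```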

5. Comprehensive Examples

6 ready-to-run examples demonstrating all features:

| Example | Description | File |
| --- | --- | --- |
| 0 | Simple dialogue demo | `00-simple-dialogue-demo.ts` |
| 1 | Multi-speaker (chained) | `01-multi-speaker-dialogue-chained.ts` |
| 2 | Multi-speaker (functional) | `02-multi-speaker-dialogue-functional.ts` |
| 3 | 31 emotional styles | `03-31-emotional-styles-demo.ts` |
| 4 | Style degree control | `04-style-degree-control-demo.ts` |
| 5 | Text substitution | `05-text-substitution-demo.ts` |

Add comprehensive project documentation tailored for AI agents and developers. Includes project overview, directory structure, code map, and development conventions.
- Define core stack (TypeScript, WebSocket, Jest, pnpm)
- Map tasks to specific file locations
- List code symbols and their roles
- Document TypeScript and testing configurations
- Specify anti-patterns and unique styles (SSML, WebSocket)
- Include SSML reference documentation for speech synthesis

This ensures consistent understanding of the codebase architecture and constraints during automated development or refactoring.

Add new TypeScript examples to the example directory to demonstrate core library features and API usage patterns.
- Add .gitignore to exclude sensitive config and generated audio files
- Add dialogue demo showing SSML structure for multi-role conversation
- Add text substitution demo showcasing professional term handling

These scripts serve as reference implementations for API integration and help users verify functionality with real-world scenarios.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>

- Delete test-multi-speaker-demo.ts (non-standard location)
- Add src/AGENTS.md for core source code documentation
- Update root AGENTS.md with complete project structure
- Add new source files: DialogueBuilder.ts, DialogueTurn.ts, SSMLUtils.ts

Update project documentation to reflect new API capabilities and conventions.
- Add DialogueBuilder class and interfaces (DialogueTurn, TextSegment)
- Document SSML utilities (escapeSSML, validateStyle, validateStyleDegree)
- Update project overview with current code scale and feature list
- Add sections for error handling, logging, and SSML processing conventions
- List specific error scenarios and anti-patterns for contributors

- Renamed 6 example TypeScript files to English names
- Updated example/README.md with new filenames
- All example files now use English naming convention
- Git history preserved via git mv

- Translated DialogueTurn.ts: interfaces and class comments
- Translated SSMLUtils.ts: function and constant comments
- Translated DialogueBuilder.ts: class and method comments
- Standardized terminology (Dialogue, Turn, Substitution, SSML, etc.)
- All Chinese characters removed from src/ JSDoc comments
- Build verification: pnpm run build passes

- Translated all 6 example TypeScript files
- Updated all comments, console messages, and error messages to English
- Task 7: 00-simple-dialogue-demo.ts - Chinese SSML retained (multilingual demo)
- Task 8: 01-multi-speaker-dialogue-chained.ts - Chinese dialogue retained (multilingual demo)
- Task 9: 02-multi-speaker-dialogue-functional.ts - Chinese dialogue retained (multilingual demo)
- Task 10: 03-31-emotional-styles-demo.ts - Changed to English examples
- Task 11: 04-style-degree-control-demo.ts - Changed to English examples
- Task 12: 05-text-substitution-demo.ts - Changed substitution examples to English
- All output filenames updated to English
- Build verification: pnpm run build passes

Wave 4 - Documentation Translation:
- example/run.sh: Translated all shell comments and echo messages
- example/README.md: Translated complete example documentation
- AGENTS.md: Translated project knowledge base (184 lines)
- docs/ssml-structure.md: Translated SSML structure documentation (252 lines)
- docs/ssml-voice.md: Translated SSML voice documentation (226 lines)
- docs/ssml-pronunciation.md: Translated SSML pronunciation docs (199 lines)

All documentation now in English with:
- Technical documentation style
- Accurate SSML terminology
- Microsoft documentation attribution retained
- Build verification: pnpm run build passes

Additional translation - src/ directory knowledge base

Migushthe2nd · 2026-03-23T09:22:32Z

I like the idea. I could maybe add a utils feature that could provide this.
Please, however:

- remove the AI docs
- remove the AI agent files
- use an XML escape library, not a custom-made function (this library could provide automatic escaping in the non-raw functions)
- review your code, examples, and config to remove any reliance on and mentions of your "https://ttspro.cn/" website, and do not use a config like this in the first place (see the test scripts for examples)
- please verify for me that all your voice types, parameters, and SSML speech tags work. I feel like AI just assumed all of them work (while some are blocked; see readme.md)

huan-zz3 · 2026-03-23T13:18:57Z

Thanks for the guidance. I will make the changes you requested.

huan-zz3 and others added 10 commits March 18, 2026 16:32

chore: remove root-level test file

7df85a5

docs: rename example files to English names and update references

ad1e60f

docs: translate src/AGENTS.md to English

b3b97da

huan-zz3 wants to merge 10 commits into Migushthe2nd:main from huan-zz3:main
