Add Dia TTS fine-tuning tutorial#232

Draft
Deep-unlearning wants to merge 2 commits into main from finetune-dia

Conversation

@Deep-unlearning
Contributor

This PR adds a comprehensive tutorial on fine-tuning Dia, Nari Labs' conversational text-to-speech (TTS) model, to synthesize speech in a new language. The tutorial covers:

  • Understanding Dia's multi-speaker dialogue format with [S1], [S2] tags
  • Loading and preparing conversational speech datasets
  • Creating a custom DiaDataCollator for proper audio/text formatting
  • Memory-efficient training with 8-bit optimization
  • Single-speaker and multi-speaker dialogue generation
  • Adapting to different languages (French, German examples)
  • Evaluation strategies for TTS models
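
For readers unfamiliar with the tag format in the first bullet, here is a minimal sketch of how dialogue turns might be flattened into a single transcript with [S1] and [S2] speaker tags. The helper name `format_dialogue`, the space-separated layout, and the restriction to two speakers are assumptions for illustration; they are not taken from the tutorial or from Dia's own preprocessing code.

```python
from typing import Iterable, Tuple


def format_dialogue(turns: Iterable[Tuple[int, str]]) -> str:
    """Join (speaker_index, text) turns into one tagged transcript string.

    Hypothetical helper: the exact spacing the model expects is an assumption.
    """
    parts = []
    for speaker, text in turns:
        # Only the two tags named in the PR description are handled here.
        if speaker not in (1, 2):
            raise ValueError(f"unsupported speaker index: {speaker}")
        parts.append(f"[S{speaker}] {text.strip()}")
    return " ".join(parts)


transcript = format_dialogue([(1, "Hello."), (2, " Hi there. ")])
print(transcript)
```

A single-speaker sample would pass only turns with speaker index 1, so every segment carries the [S1] tag.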
