@osinkolu osinkolu commented Feb 6, 2026

Summary

This PR adds a standalone Jupyter notebook (Digital_Umuganda_TTS_Inference_Tutorial.ipynb) that lets users run the Kinyarwanda TTS model in current environments such as Google Colab or a local Python 3.10+ setup.

Problem

The current repository setup relies on older dependencies (specifically a 2021 commit of coqui-tts) that are difficult to build on newer Python versions. Running the model with the latest coqui-tts library instead typically crashes, for three reasons:

  1. Config Mismatches: Missing keys like output_sample_rate and phoneme_language.
  2. Path Errors: Hardcoded local paths in the original config.json.
  3. Runtime Errors: A logic conflict in newer library versions where the speaker embedding is dropped if use_d_vector_file is set to True during inference.
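
The first two failure modes can be checked for before the model is loaded. The following is a minimal sketch, not code taken from the notebook. `REQUIRED_KEYS` contains only the two keys named above (the full list may be longer), `find_config_problems` is a hypothetical helper, and it assumes that the required keys sit at the top level of the config and that any absolute path missing on the current machine is a leftover hardcoded path.

```python
import os

# Keys this PR names as missing from the original config.json.
# Assumption: the complete list may be longer.
REQUIRED_KEYS = ("output_sample_rate", "phoneme_language")


def find_config_problems(config, required_keys=REQUIRED_KEYS):
    """Return (missing_keys, stale_paths) for a config dict loaded from JSON.

    Key presence is checked at the top level only (an assumption).
    Paths are searched recursively through nested dicts and lists.
    """
    missing = [key for key in required_keys if key not in config]
    stale = []

    def walk(value, trail):
        if isinstance(value, dict):
            for k, v in value.items():
                walk(v, trail + [str(k)])
        elif isinstance(value, list):
            for i, v in enumerate(value):
                walk(v, trail + [str(i)])
        elif (
            isinstance(value, str)
            and os.path.isabs(value)
            and not os.path.exists(value)
        ):
            # An absolute path that does not exist here is probably a
            # path from the original author's machine.
            stale.append((".".join(trail), value))

    walk(config, [])
    return missing, stale
```

Calling `find_config_problems(json.load(open("config.json")))` returns the list of absent keys and a list of `(dotted.key, path)` pairs to repair, which makes it easier to see what a patch has to cover.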

Solution

This notebook provides a "one-click" solution that:

  • Automates Setup: Handles git lfs pull and installs modern dependencies.
  • In-Memory Patching: Fixes the configuration errors dynamically without modifying the existing config.json on disk (preserving the original repo structure).
  • Runtime Fix: Includes a runtime patch that forces the model to accept speaker embeddings, resolving the crash in modern coqui-tts versions.
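
The in-memory patching idea can be sketched as follows. This is an illustration under stated assumptions rather than the notebook's actual implementation: `patch_config_in_memory` and `load_patched_config` are hypothetical names, the caller supplies the default values for missing keys (none are suggested here), and a broken absolute path is repaired only if a file with the same name exists in the cloned repo directory.

```python
import copy
import json
import os


def patch_config_in_memory(config, defaults, repo_dir):
    """Return a patched deep copy of `config`; the input dict is not modified.

    - Adds any key from `defaults` that is missing at the top level.
    - Rewrites absolute string paths that do not exist locally to
      `repo_dir/<basename>`, but only when that file exists in `repo_dir`.
      (The basename heuristic is an assumption about the repo layout.)
    """
    patched = copy.deepcopy(config)
    for key, value in defaults.items():
        patched.setdefault(key, value)

    def fix(value):
        if isinstance(value, dict):
            return {k: fix(v) for k, v in value.items()}
        if isinstance(value, list):
            return [fix(v) for v in value]
        if (
            isinstance(value, str)
            and os.path.isabs(value)
            and not os.path.exists(value)
        ):
            candidate = os.path.join(repo_dir, os.path.basename(value))
            if os.path.exists(candidate):
                return candidate
        # Anything else, including unresolvable paths, is left as-is.
        return value

    return fix(patched)


def load_patched_config(config_path, defaults, repo_dir):
    """Read config.json and patch it in memory; nothing is written to disk."""
    with open(config_path, encoding="utf-8") as handle:
        return patch_config_in_memory(json.load(handle), defaults, repo_dir)
```

Because the function works on a deep copy and never opens the file for writing, the config.json tracked in the repository stays byte-for-byte unchanged, which is the property the bullet above describes.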

Tested successfully on Google Colab.
