Enhanced Slur Expression and Added F0 Correction Processing in VOICEVOX #1918

Draft
rokujyushi wants to merge 7 commits into stakira:master from rokujyushi:VOICEVOX-Slur-Support

@rokujyushi
Contributor

This pull request introduces multiple improvements and refactoring to the Voicevox renderer integration in OpenUtau. Changes focus on speaker initialization processing, phoneme and note processing, error handling, and improving code modularity. It also adds new utility methods and parameters to enhance slur support and improve robustness in synthesizer parameter generation.

Voicevox Synthesis and Speaker Handling:

  • Added a new method InitializedSpeaker to VoicevoxUtils to ensure speakers are initialized before synthesis and to support skipping re-initialization. It is now called at the appropriate point in the rendering pipeline.
  • Improved the process for determining and applying the correct speaker style and ID during synthesis, ensuring the appropriate voice quality and style are used.

Improvements to Phoneme and Note Processing:

  • Refactored the method that converts phrases to Voicevox synthesis parameters (PhraseToVoicevoxSynthParams), introducing a new BuildVNotes helper and a pitch_slur parameter to improve slur processing and phoneme-note alignment.
  • Added the IsPhonemeNoteCountMatch method to verify phoneme-note alignment, improving parameter generation accuracy.
  • Updated VoicevoxQueryNotes to add vqnindex and slur_index fields, and added an override for ToString() to improve debugging and logging.

Error Handling and Logging:

  • Improved error handling in the synthesis process, distinguishing between Voicevox-specific exceptions and general exceptions, while providing more useful error messages for users.
  • Enhanced logging at synthesis start and in error states to aid debugging and traceability.

Utilities and Code Modularity:

  • Added the HashPhraseGroups utility to generate phrase group hashes (useful for caching and deduplication).

We have enhanced slur (vowel extension note) support and added F0 (pitch) correction processing in the VOICEVOX renderer. We reorganized the logic in `PhraseToVoicevoxSynthParams`, revised the arguments and exception handling in `BuildVNotes`, added slur-length accumulation to `NoteGroupsToVQuery`, implemented the new `AdjustF0ForSlur` method, and made `getBaseSingerID` more robust. The result is more natural and stable slur expression in OpenUtau.

Copilot AI review requested due to automatic review settings January 11, 2026 02:16
Contributor

Copilot AI left a comment

Pull request overview

This pull request enhances the Voicevox renderer integration in OpenUtau by adding support for slur expressions and F0 pitch correction processing. The changes refactor synthesis parameter generation, improve speaker initialization, and enhance error handling.

Changes:

  • Added speaker initialization checking with InitializedSpeaker method to ensure speakers are ready before synthesis
  • Refactored phrase-to-synthesis parameter conversion with new BuildVNotes helper and pitch_slur parameter for improved slur support
  • Enhanced error handling with more specific VoicevoxException messages and better logging

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 24 comments.

| File | Description |
| --- | --- |
| OpenUtau.Core/Voicevox/VoicevoxUtils.cs | Added InitializedSpeaker method for speaker validation, AdjustF0ForSlur for pitch correction, enhanced NoteGroupsToVQuery with pitch_slur support, improved getBaseSingerID robustness, and renamed utility methods for clarity |
| OpenUtau.Core/Voicevox/VoicevoxRenderer.cs | Refactored PhraseToVoicevoxSynthParams logic with new BuildVNotes and IsPhonemeNoteCountMatch methods, improved speaker initialization flow, enhanced error handling, and added HashPhraseGroups utility method |

@stakira
Owner

stakira commented Jan 18, 2026

tests failing

rokujyushi and others added 6 commits January 23, 2026 00:14
Include both phrase.hash and HashPhraseGroups in VoicevoxRenderer's cache file name to prevent conflicts; reorganize parameter generation logic in PhraseToVoicevoxSynthParams, unify variable names, and refine error messages. BuildVNotes improves lyric handling and exception messages; InitializedSpeaker in VoicevoxUtils simplifies JSON parsing and organizes response decisions; NoteGroupsToVQuery clarifies and improves slur handling. Simplified conditional expressions in getBaseSingerID.
…onal branching

Modified the process to be similar to VoicevoxRenderer.BuildVNotes
Corrected function names to refer to functions before modification
The log output when an exception occurs has been changed from the entire exception object to only the exception message (e.Message). This makes the log more concise and suppresses the output of unnecessary information.
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 7 comments.


Comment on lines +258 to +266
if (IsSyllableVowelExtensionNote(vNotes[index].lyric) && pitch_slur) {
    if (index > 0) {
        if (VoicevoxUtils.phoneme_List.kanas.TryGetValue(vNotes[index - 1].lyric, out string str)) {
            lyric = str;
            slur_index++;
        }
    } else {
        slur_index = 0;
    }

Copilot AI Jan 24, 2026

slur_index is only incremented for extension notes and never reset when a non-slur note is processed, so once a slur has been seen, all subsequent notes keep a positive slur_index. This causes AdjustF0ForSlur to treat later non-slur notes as part of the same slur group and apply pitch scaling to them incorrectly, instead of starting a new base note (slur_index = 0). Consider resetting slur_index to 0 whenever you encounter a non-extension note so each slur sequence has a clear base note and scope.

Suggested change
- if (IsSyllableVowelExtensionNote(vNotes[index].lyric) && pitch_slur) {
-     if (index > 0) {
-         if (VoicevoxUtils.phoneme_List.kanas.TryGetValue(vNotes[index - 1].lyric, out string str)) {
-             lyric = str;
-             slur_index++;
-         }
-     } else {
-         slur_index = 0;
-     }
+ bool isExtensionNote = IsSyllableVowelExtensionNote(vNotes[index].lyric);
+ if (isExtensionNote && pitch_slur) {
+     if (index > 0) {
+         if (VoicevoxUtils.phoneme_List.kanas.TryGetValue(vNotes[index - 1].lyric, out string str)) {
+             lyric = str;
+             slur_index++;
+         } else {
+             // Failed to inherit kana from previous note; reset slur index.
+             slur_index = 0;
+         }
+     } else {
+         // First note cannot be an extension base; ensure slur index is reset.
+         slur_index = 0;
+     }
+ } else {
+     // Non-extension note or pitch slur disabled: start a new slur sequence.
+     slur_index = 0;

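The reset behaviour described in this comment can be illustrated in isolation. The following is a minimal sketch, not the PR's code: `SlurIndexSketch`, `IsExtensionNote`, and `ComputeSlurIndices` are hypothetical names, the kana lookup is omitted, and the sketch assumes extension notes are marked by a `+` prefix (the body of the real `IsSyllableVowelExtensionNote` is not shown in this thread).

```csharp
using System;
using System.Collections.Generic;

public static class SlurIndexSketch {
    // Assumption: a lyric starting with "+" marks an extension (slur) note.
    // The actual check in the PR is IsSyllableVowelExtensionNote.
    public static bool IsExtensionNote(string lyric) {
        return lyric.StartsWith("+", StringComparison.Ordinal);
    }

    // Returns one slur index per note: 0 for a base note, then 1, 2, ...
    // for each extension note that follows it.
    public static List<int> ComputeSlurIndices(IReadOnlyList<string> lyrics) {
        var result = new List<int>(lyrics.Count);
        int slurIndex = 0;
        for (int i = 0; i < lyrics.Count; i++) {
            if (i > 0 && IsExtensionNote(lyrics[i])) {
                slurIndex++;   // still inside the current slur group
            } else {
                slurIndex = 0; // base note or leading note: start a new group
            }
            result.Add(slurIndex);
        }
        return result;
    }
}
```

With the reset in place, the lyrics `か, +, +, さ, +` map to the indices `0, 1, 2, 0, 1`. Without the `else` branch, the fourth note would keep index 2 and be treated as part of the first slur group.
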
currentLyric = VoicevoxUtils.dic.Lyrictodic(currentLyric);
} else if (!VoicevoxUtils.IsDicKana(currentLyric)) {
} else {
currentLyric = string.Empty;

Copilot AI Jan 24, 2026

In this mapping logic, the else branch sets currentLyric to string.Empty even when lyricList[^1] is a valid Kana and not in the dictionary, which means plain Kana lyrics are discarded instead of being passed through. This likely should keep the Kana value (as done in BuildVNotes in VoicevoxRenderer) so that non-dictionary Kana notes are still synthesized rather than treated as empty/rest notes.

Suggested change
- currentLyric = string.Empty;
+ currentLyric = lyricList[^1];

Comment on lines +220 to +224
private List<VoicevoxNote> BuildVNotes(RenderPhrase phrase, string baseSingerID) {
    List<VoicevoxNote> vNotes = new List<VoicevoxNote>();
    try {
        for (int i = 0; i < phrase.notes.Length; i++) {
            var durationMs = phrase.notes[i].durationMs;

Copilot AI Jan 24, 2026

The baseSingerID parameter of BuildVNotes is never used inside the method body, which adds noise to the signature and can confuse readers about hidden dependencies. If no future use is planned, consider removing this parameter (and updating call sites) to better reflect the actual inputs the method depends on.

@@ -407,7 +407,7 @@ RenderPitchResult IRenderer.LoadRenderedPitch(RenderPhrase phrase) {
if (singer != null) {

string baseSingerID = VoicevoxUtils.getBaseSingerID(singer);

Copilot AI Jan 24, 2026

The local variable baseSingerID is computed here but never used in the rest of LoadRenderedPitch, which can confuse readers into thinking it affects the pitch-loading logic. Consider removing this unused variable (or using it where intended) to keep the method focused on its actual dependencies.

Suggested change
- string baseSingerID = VoicevoxUtils.getBaseSingerID(singer);

}
//Usually synthesis adds the length of the slur to the previous note.
if (IsSyllableVowelExtensionNote(vNotes[index].lyric) && !pitch_slur) {
    vqMain.notes[index].frame_length += length;

Copilot AI Jan 24, 2026

In this loop, when vNotes[index].lyric is a syllable vowel extension note and pitch_slur is false, continue is hit without incrementing index, which will cause an infinite loop on the first slur note and hang synthesis for phrases containing slurs. You should advance index (or otherwise move to the next note) before continuing so that slur notes correctly extend the previous note’s frame_length without blocking progress through vNotes.

Suggested change
- vqMain.notes[index].frame_length += length;
+ vqMain.notes[index].frame_length += length;
+ index++;

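The loop-progress issue raised here can be shown with a small standalone sketch. This is not the PR's `NoteGroupsToVQuery`: the `Note` type, `MergeExtensions`, and the `+`-prefix test for extension notes are all hypothetical stand-ins, chosen only to show that an index-driven `while` loop must advance before every `continue`.

```csharp
using System;
using System.Collections.Generic;

public static class SlurMergeSketch {
    // Hypothetical stand-in for a query note: a lyric and a length in frames.
    public sealed class Note {
        public string Lyric = "";
        public int FrameLength;
    }

    // Assumption: a lyric starting with "+" marks an extension (slur) note.
    static bool IsExtensionNote(string lyric) {
        return lyric.StartsWith("+", StringComparison.Ordinal);
    }

    // Folds the length of each extension note into the note kept before it.
    public static List<Note> MergeExtensions(IReadOnlyList<Note> notes) {
        var merged = new List<Note>();
        int index = 0;
        while (index < notes.Count) {
            var note = notes[index];
            if (IsExtensionNote(note.Lyric) && merged.Count > 0) {
                merged[merged.Count - 1].FrameLength += note.FrameLength;
                index++;   // advance first; a bare `continue` would loop forever
                continue;
            }
            merged.Add(new Note { Lyric = note.Lyric, FrameLength = note.FrameLength });
            index++;
        }
        return merged;
    }
}
```

For notes of 10, 5, 3, and 7 frames with lyrics `か, +, +, さ`, the sketch returns two notes of 18 and 7 frames. Removing the `index++` before `continue` would leave `index` on the first `+` note indefinitely.
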
Contributor Author

Fixed.

@rokujyushi rokujyushi marked this pull request as draft January 25, 2026 20:07

3 participants