Enhanced Slur Expression and Added F0 Correction Processing in VOICEVOX #1918
rokujyushi wants to merge 7 commits into stakira:master
We have enhanced slur (vowel extension note) support and added F0 (pitch) correction processing in the VOICEVOX renderer. Specifically, we reorganized the logic in `PhraseToVoicevoxSynthParams`, revised the arguments and exception handling in `BuildVNotes`, added processing in `NoteGroupsToVQuery` that adds a slur's length to the preceding note, newly implemented the `AdjustF0ForSlur` method, and made `getBaseSingerID` more robust. This results in more natural and stable slur expression in OpenUtau.
Pull request overview
This pull request enhances the Voicevox renderer integration in OpenUtau by adding support for slur expressions and F0 pitch correction processing. The changes refactor synthesis parameter generation, improve speaker initialization, and enhance error handling.
Changes:
- Added speaker initialization checking with the `InitializedSpeaker` method to ensure speakers are ready before synthesis
- Refactored phrase-to-synthesis parameter conversion with the new `BuildVNotes` helper and `pitch_slur` parameter for improved slur support
- Enhanced error handling with more specific `VoicevoxException` messages and better logging
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 24 comments.
| File | Description |
|---|---|
| OpenUtau.Core/Voicevox/VoicevoxUtils.cs | Added InitializedSpeaker method for speaker validation, AdjustF0ForSlur for pitch correction, enhanced NoteGroupsToVQuery with pitch_slur support, improved getBaseSingerID robustness, and renamed utility methods for clarity |
| OpenUtau.Core/Voicevox/VoicevoxRenderer.cs | Refactored PhraseToVoicevoxSynthParams logic with new BuildVNotes and IsPhonemeNoteCountMatch methods, improved speaker initialization flow, enhanced error handling, and added HashPhraseGroups utility method |
tests failing
Include both phrase.hash and HashPhraseGroups in VoicevoxRenderer's cache file name to prevent conflicts; reorganize parameter generation logic in PhraseToVoicevoxSynthParams, unify variable names, and refine error messages. BuildVNotes improves lyrics handling and exception messages; InitializedSpeaker in VoicevoxUtils simplifies JSON parsing and organizes response decisions; NoteGroupsToVQuery clarifies and improves slur handling. Simplified conditional expressions in getBaseSingerID.
…onal branching. Modified the process to be similar to VoicevoxRenderer.BuildVNotes. Corrected function names to refer to functions before modification.
…/OpenUtau into VOICEVOX-Slur-Support
The log output when an exception occurs has been changed from the entire exception object to only the exception message (e.Message). This makes the log more concise and suppresses the output of unnecessary information.
Pull request overview
Copilot reviewed 3 out of 3 changed files in this pull request and generated 7 comments.

    if (IsSyllableVowelExtensionNote(vNotes[index].lyric) && pitch_slur) {
        if (index > 0) {
            if (VoicevoxUtils.phoneme_List.kanas.TryGetValue(vNotes[index - 1].lyric, out string str)) {
                lyric = str;
                slur_index++;
            }
        } else {
            slur_index = 0;
        }

slur_index is only incremented for extension notes and never reset when a non-slur note is processed, so once a slur has been seen, all subsequent notes keep a positive slur_index. This causes AdjustF0ForSlur to treat later non-slur notes as part of the same slur group and apply pitch scaling to them incorrectly, instead of starting a new base note (slur_index = 0). Consider resetting slur_index to 0 whenever you encounter a non-extension note so each slur sequence has a clear base note and scope.

Suggested change:

    - if (IsSyllableVowelExtensionNote(vNotes[index].lyric) && pitch_slur) {
    -     if (index > 0) {
    -         if (VoicevoxUtils.phoneme_List.kanas.TryGetValue(vNotes[index - 1].lyric, out string str)) {
    -             lyric = str;
    -             slur_index++;
    -         }
    -     } else {
    -         slur_index = 0;
    -     }
    + bool isExtensionNote = IsSyllableVowelExtensionNote(vNotes[index].lyric);
    + if (isExtensionNote && pitch_slur) {
    +     if (index > 0) {
    +         if (VoicevoxUtils.phoneme_List.kanas.TryGetValue(vNotes[index - 1].lyric, out string str)) {
    +             lyric = str;
    +             slur_index++;
    +         } else {
    +             // Failed to inherit kana from previous note; reset slur index.
    +             slur_index = 0;
    +         }
    +     } else {
    +         // First note cannot be an extension base; ensure slur index is reset.
    +         slur_index = 0;
    +     }
    + } else {
    +     // Non-extension note or pitch slur disabled: start a new slur sequence.
    +     slur_index = 0;

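
The effect of the reset can be seen in isolation with a small model. This is only a sketch, not the PR's code: `SlurIndexSketch` and `AssignSlurIndices` are hypothetical names, each note is reduced to a single "is extension note" flag, and the kana lookup on the previous note is assumed to succeed.

```csharp
using System.Collections.Generic;

public static class SlurIndexSketch {
    // Hypothetical helper: returns, for each note, its position inside a slur group.
    // 0 marks a base note; 1, 2, ... mark consecutive extension notes that follow it.
    public static int[] AssignSlurIndices(IReadOnlyList<bool> isExtension) {
        var result = new int[isExtension.Count];
        int slurIndex = 0;
        for (int i = 0; i < isExtension.Count; i++) {
            if (isExtension[i] && i > 0) {
                // Extension note with a predecessor: continue the current slur group.
                slurIndex++;
            } else {
                // Non-extension note (or a leading extension note): start a new group.
                slurIndex = 0;
            }
            result[i] = slurIndex;
        }
        return result;
    }
}
```

For the flags `{ false, true, true, false, false, true }` this yields `0, 1, 2, 0, 0, 1`. Without the reset in the `else` branch, the same input would yield `0, 1, 2, 2, 2, 3`, which is the situation the comment describes: later non-slur notes still carry a positive index.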

        currentLyric = VoicevoxUtils.dic.Lyrictodic(currentLyric);
    } else if (!VoicevoxUtils.IsDicKana(currentLyric)) {
    } else {
        currentLyric = string.Empty;

In this mapping logic, the else branch sets currentLyric to string.Empty even when lyricList[^1] is a valid Kana and not in the dictionary, which means plain Kana lyrics are discarded instead of being passed through. This likely should keep the Kana value (as done in BuildVNotes in VoicevoxRenderer) so that non-dictionary Kana notes are still synthesized rather than treated as empty/rest notes.

Suggested change:

    - currentLyric = string.Empty;
    + currentLyric = lyricList[^1];

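
The intended mapping order can be sketched as follows. This is an illustrative model under stated assumptions, not the renderer's implementation: `LyricMappingSketch`, `ResolveLyric`, and both lookup tables are hypothetical stand-ins for the dictionary and kana checks, and treating an unknown lyric as empty is an assumption made here for the example.

```csharp
using System.Collections.Generic;

public static class LyricMappingSketch {
    // Hypothetical resolver: a dictionary hit is converted, plain kana passes
    // through unchanged, and anything else becomes empty (assumed rest handling).
    public static string ResolveLyric(
        string lyric,
        IReadOnlyDictionary<string, string> dictionary,
        ISet<string> knownKana) {
        if (dictionary.TryGetValue(lyric, out var mapped)) {
            return mapped;          // dictionary entry takes priority
        }
        if (knownKana.Contains(lyric)) {
            return lyric;           // valid kana is kept, not discarded
        }
        return string.Empty;        // assumption: unknown lyrics are treated as empty
    }
}
```

With a dictionary containing `"kyou" -> "きょう"` and a kana set containing `"か"`, the call for `"kyou"` returns `"きょう"`, the call for `"か"` returns `"か"`, and an unknown lyric returns the empty string.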

    private List<VoicevoxNote> BuildVNotes(RenderPhrase phrase, string baseSingerID) {
        List<VoicevoxNote> vNotes = new List<VoicevoxNote>();
        try {
            for (int i = 0; i < phrase.notes.Length; i++) {
                var durationMs = phrase.notes[i].durationMs;

The baseSingerID parameter of BuildVNotes is never used inside the method body, which adds noise to the signature and can confuse readers about hidden dependencies. If no future use is planned, consider removing this parameter (and updating call sites) to better reflect the actual inputs the method depends on.

    @@ -407,7 +407,7 @@ RenderPitchResult IRenderer.LoadRenderedPitch(RenderPhrase phrase) {
    if (singer != null) {
        string baseSingerID = VoicevoxUtils.getBaseSingerID(singer);

The local variable baseSingerID is computed here but never used in the rest of LoadRenderedPitch, which can confuse readers into thinking it affects the pitch-loading logic. Consider removing this unused variable (or using it where intended) to keep the method focused on its actual dependencies.

Suggested change:

    - string baseSingerID = VoicevoxUtils.getBaseSingerID(singer);


    }
    //Usually synthesis adds the length of the slur to the previous note.
    if (IsSyllableVowelExtensionNote(vNotes[index].lyric) && !pitch_slur) {
        vqMain.notes[index].frame_length += length;

In this loop, when vNotes[index].lyric is a syllable vowel extension note and pitch_slur is false, continue is hit without incrementing index, which will cause an infinite loop on the first slur note and hang synthesis for phrases containing slurs. You should advance index (or otherwise move to the next note) before continuing so that slur notes correctly extend the previous note’s frame_length without blocking progress through vNotes.

Suggested change:

    - vqMain.notes[index].frame_length += length;
    + vqMain.notes[index].frame_length += length;
    + index++;

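
The loop pattern at issue can be modeled on its own. The sketch below is hypothetical and simplified: `SlurLengthSketch` and `MergeExtensionLengths` are illustrative names, notes are reduced to an "is extension" flag plus a frame count, and the output list is assumed to hold one entry per non-extension note.

```csharp
using System.Collections.Generic;

public static class SlurLengthSketch {
    // Hypothetical model: folds each extension note's length into the preceding
    // base note, as in the non-pitch_slur path described in the comment.
    public static List<int> MergeExtensionLengths(
        IReadOnlyList<(bool IsExtension, int Frames)> notes) {
        var merged = new List<int>();
        int index = 0;
        while (index < notes.Count) {
            var note = notes[index];
            if (note.IsExtension && merged.Count > 0) {
                merged[merged.Count - 1] += note.Frames;
                index++;    // advance before continue; omitting this never terminates
                continue;
            }
            merged.Add(note.Frames);
            index++;
        }
        return merged;
    }
}
```

For the notes `(false, 10), (true, 5), (true, 5), (false, 8)` the result is `20, 8`. If the `index++` before `continue` is removed, the loop re-reads the same extension note forever, which matches the hang described above.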
This pull request introduces multiple improvements and refactoring to the Voicevox renderer integration in OpenUtau. Changes focus on speaker initialization processing, phoneme and note processing, error handling, and improving code modularity. It also adds new utility methods and parameters to enhance slur support and improve robustness in synthesizer parameter generation.
Voicevox Synthesis and Speaker Handling:
- Added `InitializedSpeaker` to `VoicevoxUtils` to ensure speakers are initialized before synthesis and to support skipping re-initialization. It is now called at the appropriate point in the rendering pipeline.
Improvements to Phoneme and Note Processing:
Error Handling and Logging:
Utilities and Code Modularity: