How do I construct the input for the “Speech-to-Singing Style Transfer” task?

Hi, thank you for your great work! I am very interested in it.

From inference/style_transfer.py, I understand that I need to construct the following input for the "Zero-Shot Style Transfer" task:
- prompt audio path
- prompt ph, note, note_dur, note_type
- target ph, note, note_dur, note_type

For the “Speech-to-Singing Style Transfer” task, the prompt is a speech WAV file, so I assume it has no elements such as ph and note. How should I modify inference/style_transfer.py for this task? I think it only requires the following input; is my understanding correct?
- prompt audio path
- target ph, note, note_dur, note_type
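To make the question concrete, here is a minimal sketch of the two inputs as I understand them. The function `build_input` and the dictionary keys `prompt_audio_path`, `prompt_score`, and `target_score` are my own placeholders, not the actual schema used by inference/style_transfer.py, and the values are toy data. I am also assuming that ph, note, note_dur, and note_type are parallel lists with one entry per phoneme.

```python
from typing import Any, Dict, List, Optional

# Field names taken from the lists above; everything else here is hypothetical.
SCORE_KEYS = ("ph", "note", "note_dur", "note_type")


def check_score(score: Dict[str, List[Any]], name: str) -> None:
    """Check that a score has all four fields and that they are the same length
    (the equal-length requirement is my assumption, not confirmed by the repo)."""
    missing = [k for k in SCORE_KEYS if k not in score]
    if missing:
        raise ValueError(f"{name} is missing fields: {missing}")
    if len({len(score[k]) for k in SCORE_KEYS}) != 1:
        raise ValueError(f"{name} fields must all have the same length")


def build_input(
    prompt_audio_path: str,
    target_score: Dict[str, List[Any]],
    prompt_score: Optional[Dict[str, List[Any]]] = None,
) -> Dict[str, Any]:
    """Assemble a plain dict. With prompt_score it matches my reading of the
    zero-shot input; without it, my guess at the speech-to-singing input."""
    check_score(target_score, "target_score")
    inp: Dict[str, Any] = {
        "prompt_audio_path": prompt_audio_path,
        "target_score": target_score,
    }
    if prompt_score is not None:
        check_score(prompt_score, "prompt_score")
        inp["prompt_score"] = prompt_score
    return inp


# Toy values for illustration only.
target = {
    "ph": ["n", "i", "h", "ao"],
    "note": [60, 60, 62, 62],
    "note_dur": [0.2, 0.3, 0.2, 0.3],
    "note_type": [2, 2, 2, 2],
}
prompt = {
    "ph": ["w", "o"],
    "note": [57, 57],
    "note_dur": [0.25, 0.25],
    "note_type": [2, 2],
}

zero_shot_input = build_input("prompt_singing.wav", target, prompt_score=prompt)
speech_to_singing_input = build_input("prompt_speech.wav", target)
```

If the second form is not enough for the model (for example, if the prompt phonemes are still required for a speech prompt), please let me know which prompt fields I should provide.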

Both the "Zero-Shot Style Transfer" and “Speech-to-Singing Style Transfer” tasks are shown at https://tcsinger.github.io/#parallel-style-control.
<img width="775" alt="image" src="https://github.com/user-attachments/assets/237f6297-71eb-4735-9de7-8eb12c23770f">
<img width="804" alt="image" src="https://github.com/user-attachments/assets/30ab3062-6a96-4a9d-a710-58187ddf16bc">
