Alignments in PyTorch implementation #138

Mainly as coding practice and for learning, I'm currently implementing the aligner model in PyTorch.

The predicted mels seem to improve during training, but I haven't gotten any clear diagonal alignments yet, and therefore no proper results at inference. I have not yet implemented decreasing the reduction factor over the course of training (I'm using r=10 throughout), nor forcing the alignments at given steps. Could this be the reason?
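To make the first missing piece concrete, this is roughly what I understand by a reduction factor that decreases over training: a piecewise-constant lookup by training step. It is only a minimal sketch. The name `reduction_factor_at` and the values in `R_SCHEDULE` are hypothetical placeholders, not taken from the repo's configs.

```python
from bisect import bisect_right

# Hypothetical (start_step, reduction_factor) pairs, sorted by start_step.
# The numbers are illustrative only and are NOT the repo's actual schedule.
R_SCHEDULE = [(0, 10), (80_000, 5), (150_000, 3), (250_000, 1)]


def reduction_factor_at(step, schedule=R_SCHEDULE):
    """Return the reduction factor active at `step` (piecewise constant)."""
    starts = [start for start, _ in schedule]
    # Index of the last schedule entry whose start_step is <= step.
    idx = bisect_right(starts, step) - 1
    # Steps before the first entry fall back to the first entry.
    idx = max(idx, 0)
    return schedule[idx][1]
```

In a training loop this would be queried once per step (or per epoch), and the decoder would then predict that many mel frames per output position.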

I attached the attention weights from the last decoder attention at step 89000. With some imagination you can see a hint of diagonality, but training further (up to ~200k steps) doesn't improve it. I train on LJSpeech with the exact configs from this repo.
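Since judging diagonality by eye is subjective, a rough numeric score could help compare checkpoints. The sketch below is my own assumption of how to measure it, not something from the repo: it computes the share of attention mass that lies within a band around the normalized diagonal. The function name `diagonality` and the `band` parameter are hypothetical.

```python
def diagonality(attn, band=0.1):
    """Share of attention mass close to the diagonal.

    attn: 2D list of non-negative weights, rows = decoder steps,
          columns = encoder steps (a PyTorch tensor can be converted
          with `tensor.tolist()`).
    band: allowed distance from the diagonal, measured on positions
          normalized to [0, 1].
    """
    n_rows = len(attn)
    n_cols = len(attn[0])
    total = 0.0
    near = 0.0
    for i, row in enumerate(attn):
        # Normalized decoder position in [0, 1].
        y = i / max(n_rows - 1, 1)
        for j, weight in enumerate(row):
            # Normalized encoder position in [0, 1].
            x = j / max(n_cols - 1, 1)
            total += weight
            if abs(y - x) <= band:
                near += weight
    return near / total if total > 0 else 0.0
```

A perfectly diagonal map scores 1.0, while a uniform map scores only the fraction of cells that fall inside the band, so a score that stays flat over training would match what the plot suggests.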

![grafik](https://github.com/as-ideas/TransformerTTS/assets/81180604/b7d24dcd-cdb5-4102-8bb9-fd6721b74919)

