Thank you very much for publishing your code. I have successfully reproduced the results reported in the paper. However, during testing, the loss becomes NaN after roughly 80-100 rounds. Why does this happen, and how can I avoid it? I look forward to your reply. Thank you.