Hello, thank you for the great paper and repository.
I have a question regarding the QR Loss implementation in the current GitHub code, which seems to differ from what's described in the paper.
The current code snippet is:
loss = nn.MSELoss()(x_querry, x_querry_ori) * self.qr_loss_weight
However, based on my understanding of the QR Loss as described in the paper, the following commented-out code seems to be the correct implementation:
loss = nn.MSELoss()(aq_k, aq_k_ori) * self.qr_loss_weight
Shouldn't the second version be used instead of the current one?
Also, the paper mentions applying a softmax function, but I couldn't find that in the current code. Is the softmax step intentionally omitted, or is it unnecessary in practice?
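To make the softmax question concrete, here is a minimal pure-Python sketch (not the repository's code). I am assuming `aq_k` and `aq_k_ori` hold per-key query-key scores from the current and original models; the numeric values and the helpers `softmax` and `mse` are made up for illustration.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def mse(a, b):
    """Mean squared error between two equal-length lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

# Hypothetical scores for one sample; the second list is the first shifted by +1.
aq_k = [2.0, 1.0, 0.0]
aq_k_ori = [3.0, 2.0, 1.0]

# MSE on the raw scores penalizes the constant shift.
raw_loss = mse(aq_k, aq_k_ori)

# MSE after softmax ignores the shift, because softmax is shift-invariant.
soft_loss = mse(softmax(aq_k), softmax(aq_k_ori))

print(raw_loss, soft_loss)
```

In this toy case the two losses disagree: the raw-score version is nonzero while the softmax version is zero, so whether softmax is applied could change what the loss actually constrains.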