Open
Description
Hi!
I read the original paper and have a question about the MLM metric, which is described in Section 4.2.1.
In Section 4.2.1, this metric seems to be the sum of the log-likelihoods of the tokens in the response, each computed with that token masked. But in Figure 3, the MLM score of each token looks like the probability after softmax rather than log(probability), since all of the values are larger than zero, whereas the log of a probability can never be positive. Am I misunderstanding something?
My other question is: why is a negative sign added to the sum of the likelihoods (e.g., -sum_{i=1}^{|r|} l_i)? Intuitively, a response with a higher likelihood should get a higher score, but the negative sign reverses this ordering.
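To make both questions concrete, here is a minimal sketch of how I read the metric. The probabilities are made up for illustration, and the function names `token_log_likelihoods` and `negated_sum` are my own, not taken from the paper or the repository.

```python
import math


def token_log_likelihoods(probs):
    """Convert per-token probabilities (post-softmax) into log-likelihoods l_i."""
    return [math.log(p) for p in probs]


def negated_sum(log_likelihoods):
    """Negative sum of the per-token log-likelihoods, i.e. -sum_i l_i."""
    return -sum(log_likelihoods)


# Made-up probabilities of the true token at each masked position.
likely_response = [0.9, 0.8, 0.7]
unlikely_response = [0.2, 0.1, 0.3]

for name, probs in [("likely", likely_response), ("unlikely", unlikely_response)]:
    lls = token_log_likelihoods(probs)
    # Each log-likelihood is <= 0 because each probability is <= 1.
    print(name, [round(x, 3) for x in lls], round(negated_sum(lls), 3))
```

Under this reading, every per-token value is zero or negative, which is why the positive values in Figure 3 look like probabilities to me. The negated sum also comes out smaller for the more likely response, which is the reversal I am asking about.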
Thank you!