How to understand the results

Hi @Seth-Park,

I'm struggling to understand the evaluation metrics. In the paper you've got Table 2:

![Screenshot from 2020-10-21 16-21-06](https://user-images.githubusercontent.com/9368849/96733210-7c497b00-13b9-11eb-9555-7c9caddb094f.png)

But after downloading and evaluating your pretrained model I got the following numbers:

```
=========Results Summary==========
------------semantic change best result-------------
CIDEr: 1.00742455128 (test)
Bleu_4: 0.511085051903 (test)
Bleu_3: 0.612453337061 (test)
Bleu_2: 0.712512983841 (test)
Bleu_1: 0.80904675167 (test)
ROUGE_L: 0.654282229769 (test)
METEOR: 0.334430665011 (test)
SPICE: 0.2793739702 (test)
------------non-semantic change best result-------------
CIDEr: 1.14646062504 (test)
Bleu_4: 0.618167729466 (test)
Bleu_3: 0.64995045894 (test)
Bleu_2: 0.715953303178 (test)
Bleu_1: 0.783191698339 (test)
ROUGE_L: 0.763303090909 (test)
METEOR: 0.50608216891 (test)
SPICE: 0.346267623357 (test)
------------total best result-------------
CIDEr: 1.14955152668 (test)
Bleu_4: 0.535546570013 (test)
Bleu_3: 0.621429742545 (test)
Bleu_2: 0.71323722181 (test)
Bleu_1: 0.801726535202 (test)
ROUGE_L: 0.708792660339 (test)
METEOR: 0.37936030774 (test)
SPICE: 0.312820796779 (test)
```

So I believe I should multiply these metrics by 100 to put them on the same scale as the paper, right? But then they come out better than the numbers in the paper, e.g. in the total section:
```
Bleu_4 pretrained: 53.6 > Bleu_4 reported: 47.3
CIDEr pretrained: 115.0 > CIDEr reported: 112.3
METEOR pretrained: 37.9 > METEOR reported: 33.9
SPICE pretrained: 31.3 > SPICE reported: 24.5
```
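
For reference, this is roughly how the comparison above can be reproduced. It is only a minimal sketch: the names `pretrained_total`, `reported_total`, `to_percent`, and `compare` are made up for illustration and are not part of the repository. It also assumes that the paper's values are simply the raw scores multiplied by 100, which is exactly what I'm unsure about.

```python
# Raw scores copied from the "total best result" block of the evaluation output.
pretrained_total = {
    "Bleu_4": 0.535546570013,
    "CIDEr": 1.14955152668,
    "METEOR": 0.37936030774,
    "SPICE": 0.312820796779,
}

# Values read off Table 2 in the paper (assumed to be on a 0-100 scale).
reported_total = {"Bleu_4": 47.3, "CIDEr": 112.3, "METEOR": 33.9, "SPICE": 24.5}


def to_percent(scores):
    """Scale raw scores by 100 and round to one decimal place."""
    return {name: round(value * 100, 1) for name, value in scores.items()}


def compare(pretrained, reported):
    """Return {metric: (pretrained_pct, reported, difference)}."""
    scaled = to_percent(pretrained)
    return {
        name: (scaled[name], reported[name], round(scaled[name] - reported[name], 1))
        for name in reported
    }


if __name__ == "__main__":
    for name, (mine, paper, diff) in compare(pretrained_total, reported_total).items():
        print(f"{name}: pretrained {mine} vs reported {paper} (diff {diff:+})")
```

Under that assumption, every metric in the total section comes out higher for the pretrained model than the reported value.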

Is there any particular reason why you reported smaller numbers in the paper?

