Hi there!
Thanks for sharing this data, and for your work at nextml!
I wanted to mention a few discrepancies I found with this data, in case they are helpful to know about:
- I think contest 644 has the wrong image associated with it, because the captions in https://github.com/nextml/caption-contest-data/blob/master/contests/info/644/644_captions.csv do not match the image at https://github.com/nextml/caption-contest-data/blob/master/contests/info/644/644.jpg
- 665 and 666, while matching the shark picture, appear to be near duplicates
- I think https://github.com/nextml/caption-contest-data/blob/master/contests/info/607/607.jpg shouldn't be a duplicate of 605.jpg; I think it should be a picture of death in a coffin.
- https://github.com/nextml/caption-contest-data/blob/master/contests/summaries/688_summary_KLUCB.csv contains the wrong contest id in each row.
- I've encountered some odd behavior within the CSVs when the captions themselves contain line breaks. I don't have an example on hand (I can find one if it comes up again), but while this should be escapable with a reasonable CSV parser, pandas' parser was breaking for me.
- contest 656 (cowboy therapy) has captions for contest 655.
- contest 657 has the wrong captions.
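
Two of the points above (the wrong contest id in each row, and captions containing line breaks) can be checked mechanically. Below is a minimal sketch, assuming hypothetical column names (`contest`, `caption`, `score`) and made-up sample rows rather than the repository's actual files. It uses Python's standard `csv` module instead of pandas, only to illustrate that a quoted field with a line break should parse as a single row:

```python
import csv
import io

# Hypothetical sample data: the column names and rows are invented for
# illustration and are not copied from the repository's files.
SAMPLE = (
    "contest,caption,score\n"
    '688,"First line\nsecond line",1.5\n'
    '687,"Plain caption",2.0\n'
)


def read_rows(text):
    """Parse CSV text; quoted fields may contain line breaks."""
    # newline="" leaves line endings untouched so the csv module can
    # handle line breaks inside quoted fields itself.
    return list(csv.DictReader(io.StringIO(text, newline="")))


def rows_with_wrong_contest(rows, expected_id, column="contest"):
    """Return the rows whose contest id differs from the expected one."""
    return [row for row in rows if row[column] != str(expected_id)]


rows = read_rows(SAMPLE)
wrong = rows_with_wrong_contest(rows, 688)
```

Here `rows` holds two records even though the sample text spans more than two data lines, and `wrong` lists any record whose id disagrees with the one expected from the file name. The helper names and the `column` default are assumptions and would need adjusting to the real headers.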
I may encounter more as I play around with the data, and can keep sharing them if that's helpful.
Thanks!
Jack