NNS-500

An acceptability judgment task dataset based on sentences written by non-native English speakers

Acceptability judgment task (AJT)

The AJT is a common method in empirical linguistics for gathering information about the internal grammar of speakers of a language. It is also considered a promising way to evaluate the linguistic knowledge of neural language models. The creators of the Corpus of Linguistic Acceptability (CoLA) consider Boolean judgements sufficient; similarly, some non-English resources cast acceptability as a binary classification task.

Dataset

The NNS-500 dataset is intended for testing pre-trained neural networks. It is based on sentences written by non-native speakers, which matters for the source of the unacceptable sentences, and it was labelled by a university English teacher. It contains 350 acceptable and 150 unacceptable sentences, so 70% of the sentences are acceptable (compared with 69.2% in the CoLA out-of-domain set).
Dataset: https://github.com/yualeks63/NNS-500/blob/main/NNS-500_dataset.csv
More information: https://github.com/yualeks63/NNS-500/blob/main/NNS-500_dataset_description.pdf
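
The class balance quoted above can be checked with a few lines of Python. This is a minimal sketch, not part of the repository: the function name `class_balance` is hypothetical, and the labels are synthetic, built only from the stated counts (350 acceptable, 150 unacceptable). The column layout of `NNS-500_dataset.csv` is not described here, so the sketch does not read the file; the encoding 1 = acceptable, 0 = unacceptable is an assumption.

```python
from collections import Counter


def class_balance(labels):
    """Return (share of acceptable items, majority-class baseline accuracy).

    Assumes binary labels: 1 = acceptable, 0 = unacceptable (an assumed encoding).
    """
    counts = Counter(labels)
    total = sum(counts.values())
    acceptable_share = counts[1] / total
    # Accuracy of a trivial classifier that always predicts the most frequent label.
    majority_baseline = max(counts.values()) / total
    return acceptable_share, majority_baseline


# Synthetic labels reconstructed from the counts stated in this README.
labels = [1] * 350 + [0] * 150
share, baseline = class_balance(labels)
print(f"acceptable share: {share:.1%}")
print(f"majority-class baseline accuracy: {baseline:.1%}")
```

Because the majority class is also the acceptable class, a model that labels every sentence acceptable would already reach the same accuracy as the acceptable share, which may be worth keeping in mind when reading accuracy figures on this dataset.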
