yoonmioh/RobustMorphComp

Towards robust complexity indices in linguistic typology: a corpus-based assessment

Supplementary Information

Yoon Mi Oh and François Pellegrino

This repository contains the Supplementary Information for the paper "Towards robust complexity indices in linguistic typology: a corpus-based assessment".

  • data.txt: the txt file providing most of the information for each language
  • dataMC.txt: the values of four morphological complexity metrics (WID, TTR, MTLD, H) obtained under different corpus sampling configurations (Whole, 5, 10, 20, 40, and 60 subsets)
  • allWID.txt: the average WID estimated from three different corpus configurations (WID_FP, WID_PP, and WID_NP)
  • Figure 1.png: the image file for Figure 1, which is not generated by the R code
  • Figure 12.png: the image file for Figure 12, which is not generated by the R code
  • surprisal.txt: English surprisal estimated at the verse level with the lm-scorer package downloaded from https://github.com/simonepri/lm-scorer using the GPT-2 model
  • WID_PP_NP.txt: WID_PP and WID_NP calculated with randomized English surprisal
  • rWID_PP_NP.txt: Spearman's correlation coefficient between WID_PP and WID_NP
  • SupplementaryInfo.Rmd: the RMarkdown file incorporating the analysis code, the main results detailed in the paper and results from additional analyses
  • SupplementaryInfo.html: the resulting HTML file
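
For readers unfamiliar with the metrics listed for dataMC.txt, the following is a minimal Python sketch of two of them. It is not the authors' code: the analysis in this repository is written in R Markdown. The sketch assumes that TTR is the plain type-token ratio and that H is the Shannon entropy (in bits) of the word unigram distribution; the definitions used in the paper may differ. The function names `type_token_ratio` and `unigram_entropy` are hypothetical.

```python
import math
from collections import Counter


def type_token_ratio(tokens):
    """Number of distinct word types divided by the number of tokens."""
    if not tokens:
        raise ValueError("tokens must not be empty")
    return len(set(tokens)) / len(tokens)


def unigram_entropy(tokens):
    """Shannon entropy in bits of the word unigram distribution.

    Assumption: this is one common reading of H; it is not taken
    from the paper's own code.
    """
    if not tokens:
        raise ValueError("tokens must not be empty")
    counts = Counter(tokens)
    total = len(tokens)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Toy example: four tokens, three types.
tokens = ["a", "b", "a", "c"]
ttr = type_token_ratio(tokens)
h = unigram_entropy(tokens)
```

Both values depend on text length, which is one reason to compare them across subsets of different sizes, as dataMC.txt does.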
