Quantifying Phonosemantic Iconicity Distributionally in 6 Languages

This repository contains the code needed to replicate the experiments reported in the paper of the same title.

Quickstart

A Conda environment is recommended, since package installation can otherwise be unreliable, especially on macOS.

conda create -n quanticon_env python=3.11.13 -y
conda activate quanticon_env
conda install -c conda-forge fasttext -y
conda install pip -y
pip install -r requirements.txt
conda install ipykernel -y
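
After the environment is created, a quick check that the main dependencies resolve can save time before opening the notebook. The following is a minimal sketch, not part of the repository. The module names in ASSUMED_MODULES are guesses based on the "Used external assets" list, and the import names the notebook actually uses may differ (the G2P entry in particular).

```python
import importlib.util

# Assumed import names for the external assets listed in this README.
# These are illustrative guesses; adjust them to match requirements.txt.
ASSUMED_MODULES = ["fasttext", "wordfreq", "epitran", "g2p"]


def missing_modules(names):
    """Return the names that cannot be located in the active environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(ASSUMED_MODULES)
    if missing:
        print("Not importable:", ", ".join(missing))
    else:
        print("All assumed modules are importable.")
```

Run it with the quanticon_env environment active. If any name is reported as not importable, check it against requirements.txt before rerunning the install commands.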

To replicate an individual process reported in the paper, run the corresponding cells in the notebook, which is annotated to guide you.

For a full replication, set an OpenAI API key in .env and click "Run All" in the scripts/experiments.ipynb notebook.

Please note that API use at the scale reported in the paper costs roughly $15 USD. This cost can be reduced by changing top_n_to_decompose in the "Setup" section of the notebook. You can set your API key with:

echo "OPENAI_API_KEY=your_api_key_here" > .env
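
Because a full run incurs API costs, it can help to confirm that .env holds a real key rather than the placeholder before clicking "Run All". The sketch below is an illustration using only the standard library. The helper names parse_env and has_real_key are hypothetical, and the notebook itself may load the key differently.

```python
from pathlib import Path


def parse_env(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def has_real_key(env, name="OPENAI_API_KEY", placeholder="your_api_key_here"):
    """True if the key is present, non-empty, and not the README placeholder."""
    value = env.get(name, "")
    return bool(value) and value != placeholder


if __name__ == "__main__":
    env_path = Path(".env")
    text = env_path.read_text(encoding="utf-8") if env_path.exists() else ""
    # Only report presence; never print the key itself.
    print("API key found:", has_real_key(parse_env(text)))
```

Run it from the directory containing .env. It reports only whether a key is present and does not echo its value.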

The words.csv and roots.csv files were too large to include in the repository, but I can share them via a file-sharing service with anyone interested. To request them, or with any other questions, comments, or concerns, contact (redacted for anonymity).

Used external assets

G2P (Apache 2.0 License)
Wordfreq (Apache 2.0 License)
fastText (MIT License)
Epitran (MIT License)
