Simplifying Translations for Children: Iterative Simplification Considering Age of Acquisition with LLMs
This is the official repo for the ACL Findings 2024 paper Simplifying Translations for Children: Iterative Simplification Considering Age of Acquisition with LLMs.
Our research uses Simple-English-Wikipedia to construct a simplification model. The model simplifies translations for children of a specific age by taking Age of Acquisition (AoA) into account during simplification.
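To make the AoA idea concrete, here is a minimal sketch of how words that a child of a given age is unlikely to know could be flagged. It is not the code in this repository: the function name `words_above_age`, the `aoa_table` dictionary, and the ratings in `toy_aoa` are all hypothetical and invented for illustration only. A real setup would presumably load ratings from a published AoA word list.

```python
import re


def words_above_age(sentence, aoa_table, target_age):
    """Return words whose AoA rating exceeds target_age, in order of appearance.

    aoa_table maps a lowercase word to the age (in years) at which it is
    typically acquired. Words missing from the table are skipped.
    """
    tokens = re.findall(r"[A-Za-z']+", sentence.lower())
    flagged = []
    for tok in tokens:
        aoa = aoa_table.get(tok)
        if aoa is not None and aoa > target_age and tok not in flagged:
            flagged.append(tok)
    return flagged


# Toy ratings, made up for this example; they are NOT real AoA data.
toy_aoa = {"dog": 3.0, "ran": 4.0, "ubiquitous": 13.5, "ephemeral": 14.0}

print(words_above_age("The ubiquitous dog ran.", toy_aoa, target_age=10))
# -> ['ubiquitous']
```

Words flagged this way would be the candidates to replace with simpler alternatives for the target age.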
git clone https://github.com/nttcslab-nlp/simple-wiki.git
cd simple-wiki
conda create -n simplifyingmt python=3.10 -y
conda activate simplifyingmt
pip install -r requirements.txt
python3 -m spacy download en_core_web_sm
python src/prepare_test.py --target_age 10 --output_dir_path "/path/to/data_dir"
Specify the base model (--model_name_or_path) and the directory where the fine-tuned model will be written (--output_dir).
accelerate launch src/model/finetune.py --model_name_or_path "" --output_dir ""
Specify the directory paths in generate.sh.
sh generate.sh
Specify the directory paths in evaluate.sh.
python -m nltk.downloader punkt
sh evaluate.sh
This software is released under the NTT License; see LICENSE.txt. The license does not permit pull requests, but issues are welcome.
Our dataset is publicly available on HuggingFace under the CC BY-SA 3.0 license.