ling573

Repository for our CLMS LING 573 group project. Evaluated on the BillSum corpus, our system:

takes in a plaintext legislative bill document
segments it into semantically coherent chunks
applies neural syntactic simplification to each chunk
generates a summary of the document with improved readability metrics than human-written summaries and SOTA baseline models.

Conda first-time setup

Clone repository if it does not already exist

git clone git@github.com:AnanthaR20/ling573.git

Download miniconda for your OS
Create a new environment:

conda create -n 573-env

Activate the environment to start developing! Yay!

conda activate 573-env

Install all the required packages:

pip install -r requirements.txt

Download the optimized spaCy English language model for evaluation

python -m spacy download en_core_web_sm

Note that Conda environments have a slightly different setup if you run the wugwATSS system on Hyak. Official documentation can be found here

Virtual environment usage

Virtual environment invocation is managed by our controller shell script, generate_run.sh by specifying the correct config file, patas.config or hyak.config. To explain each parameter provided in the config:

# Specify the platform that is running the augmented system; the only valid options are "patas" or "hyak"
PLATFORM="patas" 
# any Huggingface Seq2Seq model for summarization can be used here
CHECKPOINT="google/pegasus-billsum" 
# For Deliverable 3, we only need to generate model predictions on the preprocessed data
MODE="predict" 
# Filepath relative to the root folder of this repository
TESTFILE="preprocess/data/billsum_clean_test_se3-t5-512-512.csv" 
# For Deliverable 3, we only need to reconstruct full summaries after summarizing chunks
CONCAT="post"
# Patas and Hyak can handle different batch sizes due to memory constraints
BATCH_SIZE=4

Augmented system

Note that the test data is directly preloaded in the preprocess directory of this repository so no extra data downloads are necessary to run the system. Additionally, all ROUGE and readability formulas are evaluated at test time and handled by the controller script at the last stage.

Patas (NVIDIA Quadro 8000)

SSH into Patas

ssh <UW NetID>@patas.ling.washington.edu
cd ling573
conda activate 573-env

Clean

python preprocess/clean.py

This script takes approximately 1-2 minutes, and can be run directly on the Patas head node.

Chunk

To create the semantic self-segmented chunks from BillSum documents, we directly call the Se3 submodule:

cd preprocess/se3/
git submodule init
git submodule update
git pull
condor_submit augmented_segment.cmd

This stage takes approximately 2-3 hours to run as a Condor job on the entire test split. This also requires running Se3's metric learning script, which we describe briefly in the ATS section.

Simplify

To simplify the document segments, we navigate back to the preprocess module:

cd ..
condor_submit augmented_simplify.cmd

Alternatively, this can be run on Hyak:

cd ..
sbatch

On Hyak, this step takes approximately 2 hours to run.

Zero-shot summary generation

Run the controller script which will generate and submit a Condor job on your behalf:

cd ling573
./generate_run.sh patas
# Attend for successful Condor job submission message

If running on Hyak, the controller script will generate a SLURM batch job, build an Apptainer with the Conda environment and submit the job on your behalf:

cd /gscratch/scrubbed/jcmw614/ling573
./generate_run.sh hyak
# Attend for successful SLURM job submission message

Note: this takes approximately 40 min for the first 15 documents of the test set, which were segmented into 130 text chunks.

Model finetuning

Run the controller script which will generate and submit a Condor job on your behalf:

cd ling573
./generate_finetune.sh wugwatts-led_unsimp

This will generate the requisite Condor log, error, and output files with the template finetune_model.<cluster number> where you can best monitor runtime and issues with the .err file.

Baseline system

Note that test data is directly loaded with the datasets library so no extra arguments are needed on the command line.

Patas (NVIDIA Quadro 8000)

SSH into Patas

ssh <UW NetID>@patas.ling.washington.edu

Submit condor job

condor_submit run_baseline/run_baseline.cmd

Wait...
Find your output in run_baseline.out

M1 Apple Silicon

Activate virtual environment (see above)
Run system from terminal

python backup_run.py

Wait...but hopefully not as long!
In our ad-hoc backup run, output was directly printed to console and manually copied into a text file, baseline_console.txt. This console output can be aligned with the provided title column from BillSum using the align() function defined in backup_run.py and written to a CSV, baseline_test.csv. This CSV can be used for baseline evaluation.

Baseline evaluation

Extract confidence intervals on ROUGE scores with eval_metrics.py

cd eval
python eval_metrics.py

Extract readability scores in eval_readability.ipynb as we are still testing out different eval resources

jupyter notebook

Use the provided readability scores to evaluate t-tests on each readability score

ATS development

After testing that a pre-existing ATS code repo behaves as expected when tested independently, we add the repository in the preprocess child directory.

Se3

A prerequisite to run the wugwATSs system is to have access to adjusted Legal-BERT sentence embeddings as instantiated by the Se3 repository. This can be handled by a Condor job in approximately 7 hours:

cd preprocess/se3
condor_submit learning.cmd

Name		Name	Last commit message	Last commit date
Latest commit History 488 Commits
configs		configs
eval		eval
output		output
posthoc		posthoc
predictions		predictions
preprocess		preprocess
run_baseline		run_baseline
spring_archive		spring_archive
.gitignore		.gitignore
.gitmodules		.gitmodules
Binary_Evaluate.cmd		Binary_Evaluate.cmd
Binary_Finetune.cmd		Binary_Finetune.cmd
NLLP_Phase1_Evaluate.cmd		NLLP_Phase1_Evaluate.cmd
NLLP_Phase1_Finetune.cmd		NLLP_Phase1_Finetune.cmd
NLLP_Phase2_Evaluate.cmd		NLLP_Phase2_Evaluate.cmd
NLLP_Phase2_Finetune.cmd		NLLP_Phase2_Finetune.cmd
NLLP_Phase3_Postprocess.cmd		NLLP_Phase3_Postprocess.cmd
README.md		README.md
nllp_evaluate_model.py		nllp_evaluate_model.py
nllp_evaluate_model.sh		nllp_evaluate_model.sh
nllp_finetune_model.py		nllp_finetune_model.py
nllp_finetune_model.sh		nllp_finetune_model.sh
nllp_generate_evaluate.sh		nllp_generate_evaluate.sh
nllp_generate_finetune.sh		nllp_generate_finetune.sh
nllp_postprocess_evaluate.py		nllp_postprocess_evaluate.py
nllp_postprocess_evaluate.sh		nllp_postprocess_evaluate.sh
patas_requirements.txt		patas_requirements.txt
requirements.txt		requirements.txt
sanity_check.py		sanity_check.py
utils.py		utils.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

ling573

Conda first-time setup

Virtual environment usage

Augmented system

Patas (NVIDIA Quadro 8000)

Clean

Chunk

Simplify

Zero-shot summary generation

Model finetuning

Baseline system

Patas (NVIDIA Quadro 8000)

M1 Apple Silicon

Baseline evaluation

ATS development

Se3

About

Uh oh!

Releases

Packages

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

ling573

Conda first-time setup

Virtual environment usage

Augmented system

Patas (NVIDIA Quadro 8000)

Clean

Chunk

Simplify

Zero-shot summary generation

Model finetuning

Baseline system

Patas (NVIDIA Quadro 8000)

M1 Apple Silicon

Baseline evaluation

ATS development

Se3

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Packages