LegalDigest is a project for legal text understanding, built using parameter-efficient fine-tuning (LoRA). It focuses on Indian Penal Code (IPC) data and supports both sequence-to-sequence and causal language models. This repository contains a smaller-scale, reproducible version of the project.
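To make the "parameter-efficient" part concrete, the sketch below counts the trainable parameters LoRA adds to a single weight matrix compared with fine-tuning that matrix in full. The layer dimensions (512 x 512) and the rank (8) are illustrative assumptions, not values read from this project's training.yaml or from the model, and the helper names are hypothetical.

```python
def full_params(d_in: int, d_out: int) -> int:
    """Trainable parameters when the whole d_out x d_in weight matrix is updated."""
    return d_in * d_out


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters in the two LoRA factors.

    LoRA freezes the original matrix W and learns an update B @ A,
    where A has shape (rank, d_in) and B has shape (d_out, rank).
    """
    return rank * d_in + d_out * rank


# Assumed, illustrative values (not taken from the repository's config).
d_in, d_out, rank = 512, 512, 8

full = full_params(d_in, d_out)        # 262144
lora = lora_params(d_in, d_out, rank)  # 8192
ratio = lora / full                    # 0.03125

print(f"full fine-tuning: {full} parameters")
print(f"LoRA (rank {rank}): {lora} parameters ({ratio:.3%} of full)")
```

Under these assumptions the adapter trains about 3% of the parameters of the matrix it wraps, which is what makes a smaller-scale, reproducible run practical.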
Fine-tuned FLAN-T5-small Hugging Face Model Card
Training Loss – Seq2Seq (FLAN-T5)

Model logs are captured in MLflow.
Request logs are monitored through Prometheus.
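As a rough illustration of what request monitoring through Prometheus involves, the sketch below keeps a per-endpoint request counter and renders it in the Prometheus text exposition format that a metrics endpoint serves for scraping. It uses only the standard library; the repository most likely relies on a client library instead, which the source does not confirm. The class name, the metric name legal_digest_requests_total, and the endpoint labels are all hypothetical.

```python
from collections import Counter


class RequestCounter:
    """Minimal per-endpoint request counter (hypothetical helper)."""

    def __init__(self, name: str, help_text: str) -> None:
        self.name = name
        self.help_text = help_text
        self._counts: Counter = Counter()

    def inc(self, endpoint: str) -> None:
        """Record one request for the given endpoint label."""
        self._counts[endpoint] += 1

    def render(self) -> str:
        """Render the counter in the Prometheus text exposition format."""
        lines = [
            f"# HELP {self.name} {self.help_text}",
            f"# TYPE {self.name} counter",
        ]
        for endpoint, count in sorted(self._counts.items()):
            lines.append(f'{self.name}{{endpoint="{endpoint}"}} {count}')
        return "\n".join(lines) + "\n"


# Hypothetical metric name and endpoint labels, for illustration only.
requests_total = RequestCounter(
    "legal_digest_requests_total", "Total requests received per endpoint."
)
requests_total.inc("/inference")
requests_total.inc("/inference")
requests_total.inc("/train")

print(requests_total.render())
```

Prometheus scrapes text like this at a fixed interval from the targets listed in prometheus.yml and stores each sample as a time series.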
LegalDigest/
├── legal_digest/
│ ├── api/
│ │ ├── inference.py (Endpoint for inferencing the finetuned model)
│ │ ├── metrics.py (Endpoint for metrics logging for Prometheus)
│ │ └── train.py (Endpoint for finetuning the model)
│ │
│ ├── config/
│ │ ├── __init__.py
│ │ └── training.yaml (Config for model finetuning)
│ │
│ ├── data/
│ │ └── constitution_qa.json
│ │
│ ├── pipelines/
│ │ ├── __init__.py
│ │ ├── inference.py (inferencing the finetuned model)
│ │ ├── metrics.py (metrics for finetuned model: ROUGE, F1)
│ │ ├── preprocess.py (preprocessing of the data)
│ │ └── train.py (Fine tuning process)
│ │
│ ├── utils/
│ │ ├── __init__.py
│ │ ├── load_yaml_config.py (loads configs from training.yaml)
│ │ ├── logger.py
│ │ └── monitoring.py (monitoring utils)
│ │
│ ├── __init__.py
│ ├── app.py (streamlit demo app)
│ └── main.py (Entry point)
│
├── prometheus.yml
├── .env
├── .gitignore
├── LICENSE
├── pyproject.toml
└── README.md
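The tree above lists F1 among the evaluation metrics in pipelines/metrics.py. The source does not show how it is computed, so the following is only a sketch of one common choice for question answering, token-level F1 between a predicted answer and a reference answer. The function name and the whitespace tokenisation are assumptions.

```python
from collections import Counter


def token_f1(prediction: str, reference: str) -> float:
    """Token-level F1 between two strings, using lowercase whitespace tokens."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()

    # If either side is empty, score 1.0 only when both are empty.
    if not pred_tokens or not ref_tokens:
        return float(pred_tokens == ref_tokens)

    # Multiset intersection counts each shared token at most as often
    # as it appears in both strings.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0

    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


# Made-up strings for illustration; not taken from the project's dataset.
print(token_f1("punishment for theft", "punishment for theft"))  # 1.0
print(token_f1("fine only", "imprisonment for life"))            # 0.0
```

A prediction that contains the whole reference plus extra tokens keeps full recall but loses precision, so its score falls below 1.0.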
git clone https://github.com/ayush9h/LegalDigest.git
cd LegalDigest
uv venv .venv
source .venv/bin/activate # Linux / macOS
.\.venv\Scripts\Activate # Windows
pip install -e .
In separate terminals, run the two commands below:
uvicorn main:app --reload
mlflow server
Run Prometheus on localhost:9090:
docker run -p 9090:9090 -v <prometheus_yml_file_path>:/etc/prometheus/prometheus.yml prom/prometheus
