LLM Tuning Lab

A LoRA fine-tuning setup for LLMs. It converts agent session data into training examples and injects the tool catalog into their system prompts.

Structure

lab/                    # training code
  ├── config.py         # pydantic config
  ├── data_processor.py # session data → training examples
  ├── train_lora.py     # main training script
  └── check_data.py     # preview data before training
infra/                  # terraform for AWS GPU instances
data/                   # your training data (gitignored)

Quick Start

Using Docker (recommended):

docker-compose up -d
docker exec llm-tuning-dev pipenv install --dev
docker exec llm-tuning-dev pipenv run pip install -e .

# Preview your data
docker exec llm-tuning-dev pipenv run python lab/check_data.py --num-examples=3

# Train
docker exec llm-tuning-dev pipenv run python lab/train_lora.py

Or locally with pipenv (Python 3.12):

pipenv install --dev
pipenv run pip install -e .
pipenv run python lab/check_data.py --num-examples=3
pipenv run python lab/train_lora.py

Configuration

Set environment variables with the LLM_ prefix. A double underscore separates the config section from the field name:

export LLM_MODEL__BASE_MODEL="meta-llama/Meta-Llama-3-8B-Instruct"
export LLM_DATA__MAX_LENGTH="4096"
export LLM_LORA__R="16"
export LLM_TRAINING__NUM_TRAIN_EPOCHS="2"
export LLM_TRAINING__LEARNING_RATE="2e-4"

Or edit defaults in lab/config.py.
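
To illustrate the naming convention, here is a minimal stdlib-only sketch of how LLM_SECTION__FIELD variables could be grouped into nested settings. It is not the code in lab/config.py, which uses pydantic. The function name load_llm_settings and the lowercase section and field keys are assumptions made for this example.

```python
def load_llm_settings(
    environ: dict[str, str],
    prefix: str = "LLM_",
    delimiter: str = "__",
) -> dict[str, dict[str, str]]:
    """Group PREFIX + SECTION + delimiter + FIELD variables into nested dicts.

    Hypothetical helper: it only mirrors the naming pattern used in the
    export lines above and leaves all values as strings.
    """
    settings: dict[str, dict[str, str]] = {}
    for name, value in environ.items():
        if not name.startswith(prefix):
            continue  # unrelated variable such as PATH
        section, sep, field = name[len(prefix):].partition(delimiter)
        if not sep or not field:
            continue  # not of the form LLM_SECTION__FIELD
        settings.setdefault(section.lower(), {})[field.lower()] = value
    return settings


# A plain dict stands in for os.environ so the example is deterministic.
example_env = {
    "LLM_MODEL__BASE_MODEL": "meta-llama/Meta-Llama-3-8B-Instruct",
    "LLM_LORA__R": "16",
    "LLM_TRAINING__LEARNING_RATE": "2e-4",
    "PATH": "/usr/bin",
}
settings = load_llm_settings(example_env)

# Environment values are always strings, so numeric fields need conversion.
lora_rank = int(settings["lora"]["r"])
learning_rate = float(settings["training"]["learning_rate"])
```

Passing os.environ instead of example_env would read the real environment. A pydantic-based config would typically handle the type conversion itself.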

Data Format

JSONL files with one session record per line (pretty-printed here for readability):

{
  "request": {
    "system": "system prompt",
    "messages": [{"role": "user", "content": "..."}]
  },
  "response": {
    "role": "assistant",
    "content": "expected output"
  }
}
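
A quick way to sanity-check a line against the shape above is sketched below. This is an illustration only, not what lab/check_data.py does. The function name parse_session_line is hypothetical, and treating every field shown in the example as required is an assumption.

```python
import json


def parse_session_line(line: str) -> dict:
    """Parse one JSONL line and check it has the fields shown above.

    Hypothetical helper: assumes every field in the example is required.
    """
    record = json.loads(line)
    request = record.get("request")
    response = record.get("response")
    if not isinstance(request, dict) or not isinstance(response, dict):
        raise ValueError("record needs 'request' and 'response' objects")
    if not isinstance(request.get("system"), str):
        raise ValueError("request.system must be a string")
    messages = request.get("messages")
    if not isinstance(messages, list) or not messages:
        raise ValueError("request.messages must be a non-empty list")
    for message in messages:
        if not isinstance(message, dict) or not {"role", "content"} <= message.keys():
            raise ValueError("each message needs 'role' and 'content'")
    if response.get("role") != "assistant":
        raise ValueError("response.role must be 'assistant'")
    if not isinstance(response.get("content"), str):
        raise ValueError("response.content must be a string")
    return record


# One record serialized onto a single line, as JSONL requires.
good_line = json.dumps({
    "request": {
        "system": "system prompt",
        "messages": [{"role": "user", "content": "..."}],
    },
    "response": {"role": "assistant", "content": "expected output"},
})
record = parse_session_line(good_line)
```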

The tool catalog from data/tool_catalogue.json gets injected into system prompts automatically.
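
The injection step might look roughly like the sketch below. The real logic lives in lab/data_processor.py and may differ. The catalog shape (a list of objects with "name" and "description"), the "Available tools:" heading, and the function name inject_tool_catalog are all assumptions made for illustration. The tool entry is invented sample data.

```python
import copy


def inject_tool_catalog(record: dict, catalog: list[dict]) -> dict:
    """Return a copy of record with a tool list appended to the system prompt.

    Hypothetical helper: the catalog schema and prompt layout are assumed,
    not taken from the project.
    """
    tool_lines = [f"- {tool['name']}: {tool['description']}" for tool in catalog]
    updated = copy.deepcopy(record)  # leave the caller's record untouched
    system = updated["request"].get("system", "")
    updated["request"]["system"] = "\n".join(
        [system, "", "Available tools:", *tool_lines]
    ).strip()
    return updated


# Invented sample data standing in for a session record and the catalog file.
sample_record = {
    "request": {
        "system": "system prompt",
        "messages": [{"role": "user", "content": "..."}],
    },
    "response": {"role": "assistant", "content": "expected output"},
}
sample_catalog = [{"name": "search", "description": "Search the web"}]

injected = inject_tool_catalog(sample_record, sample_catalog)
```

Returning a copy rather than mutating the input keeps the raw session data reusable, for example when comparing prompts before and after injection.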

Infrastructure

See infra/ for the Terraform setup that runs training on AWS GPU instances (g5.xlarge spot instances at roughly $0.30/hr).

cd infra/envs/training
terraform apply -var="create_instance=true"

More details in infra/README.md and DEPLOYMENT.md.

Testing

pipenv run pytest
pipenv run pytest --cov=lab
