Setup · Quickstart · Dataset Formats · Tool Calling · Resuming Training · Benchmarks · Advanced Config
LEAP-Finetune is a minimal fine-tuning repo for LFM2, fully built on open source. It handles multi-GPU orchestration, dataset formatting and validation, and model checkpointing. It supports several acceleration backends: your own GPU nodes (e.g. 8xH100 80GB, single- or multi-node) as well as Modal serverless GPUs (H100, H200, B200, ..) in case you don't have your own hardware.
For feature requests or if you have a different setup, reach out to support@liquid.ai and tell us about your specific configuration.
## Setup

```bash
curl -LsSf https://astral.sh/uv/install.sh | sh   # install uv
git clone <repository-url>
cd leap_finetune
uv sync
```

## Quickstart

Create a YAML config file (or copy one from `job_configs/`):
```yaml
project_name: "my_sft_project"
model_name: "LFM2-1.2B"
training_type: "sft"

dataset:
  path: "HuggingFaceTB/smoltalk"
  type: "sft"
  limit: 1000
  test_size: 0.2
  subset: "all"

training_config:
  extends: "DEFAULT_SFT"
  num_train_epochs: 3
  per_device_train_batch_size: 2
  learning_rate: 2e-5

peft_config:
  extends: "DEFAULT_LORA"
  use_peft: true
```

- `training_config.extends` inherits from a base config (e.g. `DEFAULT_SFT`, `DEFAULT_DPO`, `DEFAULT_VLM_SFT`); any fields you specify override the base
- `peft_config.extends` works the same way (e.g. `DEFAULT_LORA`, `DEFAULT_VLM_LORA`)
- See `job_configs/` for more examples (DPO, MoE, VLM, SLURM)
Run locally:

```bash
uv run leap-finetune <path_to_config.yaml>
```

Under the hood, it uses Ray Train + Accelerate for distributed training.
Unless you override `output_dir`, results are stored in `outputs/{project_name}/{run_name}/`. Each run gets its own directory with a unique name based on model, dataset, LR, and timestamp.
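For example, the layout might look like this (the run name shown is purely illustrative; the real name is derived from model, dataset, LR, and timestamp):

```text
outputs/
└── my_sft_project/
    └── LFM2-1.2B_smoltalk_2e-5_2025-01-01_12-00-00/   # illustrative run name
        ├── checkpoint-step-500/
        └── checkpoint-step-1000/
```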
### Run on Modal

You can run training jobs on Modal's serverless GPUs directly from your Mac or laptop; no local GPU required.
One-time setup:
```bash
huggingface-cli login   # required: used for model downloads and trackio
modal setup             # configure Modal credentials
```

Add a `modal:` section to any config:
```yaml
modal:
  gpu: "H100:4"
  timeout: 86400
  output_volume: "leap-finetune"
  output_dir: "/outputs"
  detach: false
```

Run:
```bash
uv run leap-finetune job_configs/sft_example_modal.yaml
```

That's it. The CLI will:

- Build the container image (~5 min on first run, cached after that)
- Auto-create a `huggingface-secret` on Modal from your local HF token
- Stream build and training logs to your terminal in real time
- Save checkpoints to a Modal Volume
Retrieving checkpoints:
```bash
modal volume ls leap-finetune                                      # list saved checkpoints
modal volume get leap-finetune <checkpoint-name> ./local-outputs   # download to local
```

Detached mode: Set `detach: true` in the `modal` config to submit and disconnect. Monitor with `modal app logs leap-finetune`.
See `job_configs/sft_example_modal.yaml` for all available options.
### SLURM

If your config includes a `slurm` section, running `leap-finetune` will auto-generate and submit a SLURM script. You can also generate SLURM scripts without submitting:

```bash
uv run leap-finetune slurm <path_to_config.yaml>
```

To monitor your SLURM jobs in a TUI:

```bash
uv run turm --me
```

### Experiment tracking

Add `tracker` to your `training_config`:
```yaml
training_config:
  tracker: "trackio"  # or "wandb"
```

#### Trackio

Trackio is a free experiment tracker that logs to a HuggingFace Space.

```yaml
training_config:
  tracker: "trackio"
  trackio_space_id: "username/my-dashboard"  # auto-created if it doesn't exist
```

Requires an HF token (via `huggingface-cli login`). On Modal, the token is auto-injected; no extra setup needed. View your dashboard at `https://huggingface.co/spaces/<trackio_space_id>`.
Weights & Biases is a popular experiment tracking platform.
training_config:
tracker: "wandb"Set your API key locally with export WANDB_API_KEY=your_key. On Modal, add a secret:
modal secret create wandb-secret WANDB_API_KEY=your_keyThen add it to your Modal config:
modal:
secrets:
- "wandb-secret"When training is done, you can bundle your output checkpoint with leap-bundle to use it directly within LEAP. Checkout our Quick Start guide.
## Dataset Formats

### SFT

```json
{
  "messages": [
    { "role": "user", "content": "What is the capital of France?" },
    { "role": "assistant", "content": "The capital of France is Paris." }
  ]
}
```

### DPO

```json
{
  "prompt": "What is the capital of France?",
  "chosen": "The capital of France is Paris.",
  "rejected": "The capital of France is London."
}
```

### VLM

```json
{
  "messages": [
    {
      "role": "system",
      "content": [
        {
          "type": "text",
          "text": "You are an image-based assistant. Answer questions based on the provided image."
        }
      ]
    },
    {
      "role": "user",
      "content": [
        { "type": "image", "image": "/path/to/image.jpg" },
        { "type": "text", "text": "What do you see in this image?" }
      ]
    },
    {
      "role": "assistant",
      "content": [{ "type": "text", "text": "I see a car in the image." }]
    }
  ]
}
```

Note: VLM datasets commonly store images in a separate column that is referenced from the messages column. If your image URLs or paths live in a separate column from your messages, you'll need to merge the images into the `messages` structure as shown above (see the sketch below).
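A minimal sketch of such a merge using the `datasets` library. The column names (`images`, `messages`), the one-image-per-row assumption, and the dataset ID are all hypothetical; adapt them to your data:

```python
from datasets import load_dataset

def merge_images(example):
    # Hypothetical schema: one image path per row in an "images" column,
    # plain-text user content in "messages". Adjust to your dataset.
    for msg in example["messages"]:
        if msg["role"] == "user":
            text = msg["content"] if isinstance(msg["content"], str) else msg["content"][0]["text"]
            # Rewrite the user turn as typed content parts: image first, then text.
            msg["content"] = [
                {"type": "image", "image": example["images"][0]},
                {"type": "text", "text": text},
            ]
            break
    return example

ds = load_dataset("your/vlm-dataset", split="train")  # hypothetical dataset ID
ds = ds.map(merge_images, remove_columns=["images"])
```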
## Tool Calling

Tool calls use LFM bracket notation, pre-baked in the assistant `content` field. Tool definitions go in the system prompt, and tool responses use `role: "tool"`.
```json
{
  "messages": [
    {
      "role": "system",
      "content": "List of tools: [{\"type\":\"function\",\"function\":{\"name\":\"get_weather\",\"description\":\"Get weather for a city\",\"parameters\":{\"type\":\"object\",\"properties\":{\"location\":{\"type\":\"string\"}},\"required\":[\"location\"]}}},{\"type\":\"function\",\"function\":{\"name\":\"search_web\",\"description\":\"Search the web\",\"parameters\":{\"type\":\"object\",\"properties\":{\"query\":{\"type\":\"string\"}},\"required\":[\"query\"]}}},{\"type\":\"function\",\"function\":{\"name\":\"send_email\",\"description\":\"Send an email\",\"parameters\":{\"type\":\"object\",\"properties\":{\"to\":{\"type\":\"string\"},\"body\":{\"type\":\"string\"}},\"required\":[\"to\",\"body\"]}}}]"
    },
    { "role": "user", "content": "What's the weather in Boston?" },
    {
      "role": "assistant",
      "content": "<|tool_call_start|>[get_weather(location=\"Boston\")]<|tool_call_end|>"
    },
    {
      "role": "tool",
      "content": "{\"temperature\": 72, \"condition\": \"sunny\"}"
    },
    { "role": "assistant", "content": "It's 72°F and sunny in Boston." }
  ]
}
```

- Tool calls must be pre-baked in `content` using `<|tool_call_start|>[func(args)]<|tool_call_end|>` bracket notation
- Structured `tool_calls` fields (OpenAI format) are auto-converted if present
- Foreign formats (e.g. `<tool_call>` XML) are rejected with an actionable error
- Do not include `<|tool_response_start|>`/`<|tool_response_end|>` markers in `role: "tool"` messages; the LFM2 chat template adds these automatically during tokenization
- LFM2 models additionally expect `<|tool_list_start|>`/`<|tool_list_end|>` around tool definitions in the system prompt. Include these in your data if training an LFM2 model; omit them for LFM2.5. The pipeline warns on mismatches and auto-strips `<|tool_list_start|>` when training LFM2.5. (A sketch of an LFM2-style system message follows this list.)
## Resuming Training

If a run is interrupted (GPU timeout, crash, SLURM preemption, etc.), you can resume from the last checkpoint with full optimizer state, LR schedule, and wandb continuity.
Add `resume_from_checkpoint` to your `training_config`:

```yaml
training_config:
  resume_from_checkpoint: "latest"  # resumes from the most recent checkpoint
```

This finds the most recent run directory under `outputs/{project_name}/` and resumes from its latest checkpoint. To resume from a specific checkpoint instead:
```yaml
training_config:
  resume_from_checkpoint: "/path/to/outputs/my_project/run_name/checkpoint-step-8000"
```

What gets restored: model weights, optimizer states, LR scheduler position, training step counter, and RNG states. To resume a run, `save_only_model` needs to be set to `false`.
Wandb continuity: The wandb run ID is saved to `<run_dir>/.wandb_run_id` automatically. On resume, it restores the same wandb run. Fresh runs always get a new wandb run.
## Benchmarks

Run benchmarks automatically during training at every `eval_steps`. Add a `benchmarks` section to your YAML config:
```yaml
benchmarks:
  max_new_tokens: 128
  benchmarks:
    - name: "mmmu_val"
      path: "/data/mmmu_val.jsonl"
      metric: "short_answer"
    - name: "imagenette"
      path: "/data/imagenette_eval.jsonl"
      metric: "logprob_zero_shot"
```

Benchmark data uses the same format as training data (HF messages schema). Available metrics: `short_answer`, `grounding_iou`, `mcq_gen`, `logprob_zero_shot`. Results are logged to wandb at `benchmark/{name}/score`.
See the Evaluation Guide for data format examples, YAML reference, and how to add custom metrics.
## Advanced Config

Default base configs live in `src/leap_finetune/training_configs/` and are auto-discovered: new configs added to these files are immediately available via `extends` in YAML.
Liger Kernel is pre-installed. Enable it with `use_liger_kernel: true` in your `training_config`.
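For example, combined with a base config (both fields appear earlier in this README):

```yaml
training_config:
  extends: "DEFAULT_SFT"
  use_liger_kernel: true
```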
The `dataset.path` field in your YAML config accepts local files, HuggingFace Hub IDs, and cloud storage URIs:

| Source | Example path |
|---|---|
| Local file | `/path/to/data.jsonl`, `/path/to/data.parquet` |
| HuggingFace Hub | `HuggingFaceTB/smoltalk` |
| S3 | `s3://bucket/path/to/data.parquet` |
| GCS | `gs://bucket/path/to/data.parquet` |
| Azure | `az://container/path/to/data.parquet` |
Cloud storage requires appropriate credentials (AWS, GCP, or Azure). Use `subset` for HuggingFace datasets with multiple configs, and `limit` to cap the number of samples for quick testing.
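For instance, a `dataset` section pointing at Parquet on S3 (the bucket and key are illustrative):

```yaml
dataset:
  path: "s3://my-bucket/data/train.parquet"  # illustrative S3 URI
  type: "sft"
  limit: 500  # cap samples for a quick test run
```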
## Contributing

- Hook `pre-commit` to git: `uv run pre-commit install`
- Open a PR with your changes

Pre-commit will now run automatically on commits, or run manually:

```bash
uv run pre-commit run --all-files
```

Please include a thorough description of changes and additions in your PR.
