JEPA-based clip mining for driving data. The repo is organized around one product question:
rank clips by likely human review value.
The training objective remains JEPA embedding prediction. The primary experiment outcome is ranking quality on a human-labeled benchmark. Cosine similarity remains a secondary model-health metric.
The public entrypoints are:
- train.py
- score.py
- evaluate.py
- run_experiment.py
The normal sequence is:
- Build or migrate a clip manifest with explicit clip_id, split, and scene_id.
- Train JEPA on unlabeled train clips.
- Score held-out clips with a clip-level review-value score.
- Evaluate ranking quality against human review-value labels.
- Compare experiment quality and efficiency from the run summaries.
Each run writes:
- config_resolved.yaml
- training/summary.json
- training/checkpoints/
- scoring/scores.jsonl
- scoring/summary.json
- evaluation/summary.json
- summary.json
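These per-run artifacts make cross-run comparison a small scripting task. A minimal sketch, assuming each run directory holds a top-level summary.json and that it carries a quality metric such as average_precision (the field name is an assumption, not a documented key):

```python
import json
from pathlib import Path

def collect_run_summaries(runs_root):
    """Gather the top-level summary.json from each run directory."""
    rows = []
    for summary_path in sorted(Path(runs_root).glob("*/summary.json")):
        with open(summary_path) as f:
            payload = json.load(f)
        payload["run_name"] = summary_path.parent.name
        rows.append(payload)
    return rows

def rank_runs(rows, metric="average_precision"):
    """Sort runs by a quality metric, best first; runs missing the metric sort last."""
    return sorted(rows, key=lambda r: r.get(metric, float("-inf")), reverse=True)
```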
src/jepa/
config.py Config loading and runtime-profile resolution
pipeline.py train -> score -> evaluate stage runners
data/ Manifest loading, dataset classes, transforms
models/ VJEPA encoder and JEPA predictor model
training/ Training loop and embedding loss
evaluation/ Ranking metrics, cosine similarity, telemetry helpers
experiments/ Factorial design utilities
scripts/
build_manifest_from_frames.py
migrate_manifest.py
run_factorial.py
analyze_factorial.py
train.py
score.py
evaluate.py
run_experiment.py
Use Python 3.10-3.12 for this project. Python 3.13+ is not supported by the pinned PyTorch stack, and older Linux clusters commonly require the manylinux2014 wheels provided by the 2.5.x line.
uv sync

Or with pip:
pip install -e .
pip install -e ".[dev]"

The pipeline consumes JSONL clip manifests, not raw videos directly.
Expected v1 clip record:
{
"clip_id": "scene_001__CAM_FRONT__000000__000015",
"split": "train",
"scene_id": "scene_001",
"camera": "CAM_FRONT",
"frame_paths": [
"scene_001/CAM_FRONT/000000.jpg",
"scene_001/CAM_FRONT/000001.jpg"
],
"timestamps": ["000000", "000001"],
"metadata": {}
}

Frame directory layout:
data/raw/my_dataset/
scene_001/
CAM_FRONT/
000000.jpg
000001.jpg
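Before training, it can help to sanity-check manifest records against the v1 schema above. A minimal validation sketch with plain checks and no external schema library (validate_clip_record and validate_manifest are hypothetical helpers, not part of the repo):

```python
import json

# Field names taken from the v1 clip record example.
REQUIRED_FIELDS = {"clip_id", "split", "scene_id", "camera",
                   "frame_paths", "timestamps", "metadata"}

def validate_clip_record(record):
    """Return a list of problems with one v1 clip record (empty list if clean)."""
    problems = []
    missing = REQUIRED_FIELDS - record.keys()
    if missing:
        problems.append(f"missing fields: {sorted(missing)}")
    if not record.get("frame_paths"):
        problems.append("frame_paths is empty")
    if len(record.get("frame_paths", [])) != len(record.get("timestamps", [])):
        problems.append("frame_paths and timestamps lengths differ")
    return problems

def validate_manifest(path):
    """Validate every JSONL line; return {line_number: problems} for bad records."""
    bad = {}
    with open(path) as f:
        for line_number, line in enumerate(f, start=1):
            problems = validate_clip_record(json.loads(line))
            if problems:
                bad[line_number] = problems
    return bad
```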
Build a manifest:
uv run python scripts/build_manifest_from_frames.py \
--frames-root data/raw/my_dataset \
--output data/manifests/my_manifest.jsonl \
--clip-length 16 \
--stride 16 \
--train-ratio 0.7 \
--val-ratio 0.15 \
--seed 42 \
--camera CAM_FRONT

To migrate an existing manifest to the v1 schema:

uv run python scripts/migrate_manifest.py \
--input data/manifests/clips_manifest.jsonl \
--output data/manifests/clips_manifest_v1.jsonl \
--train-ratio 0.7 \
--score-ratio 0.15 \
--seed 42

The default config is configs/default.yaml.
Top-level sections:
- dataset
- model
- train
- score
- evaluation
- runtime
- experiment
Important controls:
- model.init_mode: pretrained, resume, scratch
- model.encoder_mode: frozen, finetune
- runtime.profile: cpu, gpu, ddp
- dataset.training_manifest, dataset.validation_manifest, dataset.scoring_manifest, dataset.evaluation_manifest
- dataset.training_split, dataset.validation_split, dataset.scoring_split, dataset.evaluation_split
- dataset.evaluation_labels
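Inline overrides of these controls behave like a nested dictionary merge where the override wins on conflicts. A minimal sketch of that merge semantics (a simplified stand-in, not the actual config.py code):

```python
def deep_merge(base, override):
    """Recursively merge override into base; override values win on conflicts.

    Nested dicts are merged key by key; any other value type is replaced outright.
    """
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged
```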
Inline overrides are supported with --set as JSON:
python3 run_experiment.py \
--config configs/default.yaml \
--run-dir experiments/runs/smoke_cpu \
--set '{"runtime":{"profile":"cpu","batch_size_overrides":{"train":1,"score":1,"evaluation":1}}}' \
--set '{"train":{"epochs":1}}'

Train only:
uv run python train.py \
--config configs/default.yaml \
--run-dir experiments/runs/baseline

Score only:
uv run python score.py \
--config configs/default.yaml \
--run-dir experiments/runs/baseline

Evaluate only:
uv run python evaluate.py \
--config configs/default.yaml \
--run-dir experiments/runs/baseline

For manual binary labeling of an evaluation-labels JSONL, use the local review app:
uv run python scripts/review_labels.py \
--labels-path data/manifests/baseline_manifest_evaluation_labels.jsonl \
--manifest-path data/manifests/baseline_manifest.jsonl \
--data-root data/raw/fileThen open http://127.0.0.1:8765 in a browser. The app saves label edits back into the JSONL in place. Keys: 1 positive, 0 negative, u clear, j next, k previous.
It renders each clip as an in-browser looping animation and also shows sampled still frames underneath.
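The in-place label write can be sketched as reading the whole JSONL, updating one record, and rewriting the file (set_label is a hypothetical helper illustrating the idea; the app's actual write path is not shown here):

```python
import json

def set_label(labels_path, clip_id, review_value):
    """Rewrite the labels JSONL in place, updating or clearing one clip's label.

    review_value of None clears the label — a simplified stand-in for what
    the review app does on the 1 / 0 / u keys.
    """
    with open(labels_path) as f:
        records = [json.loads(line) for line in f if line.strip()]
    for record in records:
        if record["clip_id"] == clip_id:
            if review_value is None:
                record.pop("review_value", None)
            else:
                record["review_value"] = review_value
    with open(labels_path, "w") as f:
        for record in records:
            f.write(json.dumps(record) + "\n")
```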
Full experiment:
uv run python run_experiment.py \
--config configs/default.yaml \
--run-dir experiments/runs/baseline

score.py does not require human labels. It produces one record per clip with the following fields:
- review_value_score
- mean_cosine_similarity
- tubelet_score_mean
- tubelet_score_std
- tubelet_count
In the current implementation:
review_value_score = 1 - cosine_similarity
This is a novelty proxy, not ground truth.
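Under that definition, the per-tubelet similarities reported alongside the score aggregate into the clip score roughly as follows (a sketch; the exact aggregation in score.py may differ):

```python
def review_value_score(tubelet_similarities):
    """Clip-level novelty proxy from per-tubelet cosine similarities.

    Mirrors the documented relation review_value_score = 1 - cosine_similarity,
    aggregating tubelets by their mean: the worse the predictor matches the
    encoder embeddings, the higher the review value.
    """
    mean_cos = sum(tubelet_similarities) / len(tubelet_similarities)
    return 1.0 - mean_cos
```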
evaluate.py requires human labels keyed by clip_id. It joins model scores to benchmark labels and computes:
- Precision@K
- Recall@K
- Average Precision
- PR-AUC
- NDCG
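Two of these metrics can be sketched directly, assuming clips are already sorted by descending model score and labels are binary 0/1 relevance:

```python
def precision_at_k(ranked_labels, k):
    """Fraction of positives among the top-k ranked clips (labels are 0/1)."""
    return sum(ranked_labels[:k]) / k

def average_precision(ranked_labels):
    """Mean of precision-at-k taken at the rank of each positive clip."""
    hits, total = 0, 0.0
    for k, label in enumerate(ranked_labels, start=1):
        if label:
            hits += 1
            total += hits / k
    return total / hits if hits else 0.0
```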
Human label schema:
{
"clip_id": "scene_001__CAM_FRONT__000080__000095",
"review_value": "high_value",
"review_value_grade": 2,
"reason_codes": ["safety_critical_interaction"],
"reviewer_id": "rater_01",
"adjudicated_label": "high_value",
"agreement": 1.0
}

The intended review-value classes are:
- high_value
- medium_value
- low_value
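Ranking metrics such as Precision@K need binary relevance, so the graded classes have to be collapsed. One illustrative policy, treating only high_value as positive (an example choice, not the repo's documented one):

```python
def binarize_labels(label_records, positive_classes=("high_value",)):
    """Map graded review-value labels to 0/1 relevance keyed by clip_id.

    Prefers adjudicated_label when present, falling back to review_value.
    """
    return {
        record["clip_id"]: int(
            record.get("adjudicated_label", record["review_value"]) in positive_classes
        )
        for record in label_records
    }
```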
Training resource stats are stored in:
RUN_DIR/training/summary.json
Scoring / inference resource stats are stored in:
RUN_DIR/scoring/summary.json
These summaries include:
- wall-clock time
- samples or clips per second
- latency mean / p50 / p95
- peak memory
- effective batch size
- device and runtime profile
- estimated energy for scoring
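The latency mean / p50 / p95 fields can be derived from per-clip latencies with the standard library alone. A sketch (the repo's telemetry helpers may compute percentiles differently):

```python
import statistics

def latency_stats(latencies_ms):
    """Mean / p50 / p95 summary of per-clip latencies, in milliseconds.

    Uses inclusive quantiles so small samples behave predictably.
    """
    q = statistics.quantiles(latencies_ms, n=100, method="inclusive")
    return {
        "mean_ms": statistics.fmean(latencies_ms),
        "p50_ms": statistics.median(latencies_ms),
        "p95_ms": q[94],  # the 95th of 99 cut points
    }
```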
Use configs/factorial.yaml to sweep config factors across the new pipeline.
Run a batch:
uv run python scripts/run_factorial.py --config configs/factorial.yaml

Analyze a completed batch:
uv run python scripts/analyze_factorial.py \
--results experiments/factorial_runs/<date>/batch_<time>/results.jsonl \
--factorial-config configs/factorial.yaml

Each batch writes:
- design_matrix.jsonl
- results.jsonl
- batch_summary.json
- runs/<run_name>/...
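The design matrix is a full-factorial crossing of the configured factors. A minimal sketch with itertools.product (the dotted factor names are illustrative, not taken from factorial.yaml):

```python
import itertools

def design_matrix(factors):
    """Full-factorial crossing of config factors.

    factors: e.g. {"model.encoder_mode": ["frozen", "finetune"], ...}.
    Returns one dict per run, mapping each factor name to one of its levels.
    """
    names = sorted(factors)
    return [
        dict(zip(names, values))
        for values in itertools.product(*(factors[name] for name in names))
    ]
```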
Run the test suite:

uv run pytest tests/