1 change: 1 addition & 0 deletions .gitignore
@@ -1,2 +1,3 @@
.idea
.DS_Store
.venv/
42 changes: 42 additions & 0 deletions AGENTS.md
@@ -0,0 +1,42 @@
# Repository Guidelines

## Project Structure & Modules
- `NAVI Dataset Tutorial.ipynb` is the main walkthrough for downloading and exploring the dataset.
- Root Python helpers: `data_util.py` (annotation/data loading), `mesh_util.py` (mesh IO and coordinate transforms), `transformations.py` (camera/math helpers), `misc_util.py` (shared utilities), `visualization.py` (rendering/display helpers for notebooks).
- Rendering stack lives in `gl/` (`scene_renderer.py`, `egl_context.py`, `camera_util.py`, plus GLSL shaders under `gl/shaders/`).
- Dataset artifacts live outside the repo; point scripts to your local `navi_v1.x` extract (e.g., `/path/to/navi_v1.5/`).

## Environment & Setup
- Python 3 with NumPy, PyTorch, matplotlib, PIL, trimesh, and EGL/GL drivers for rendering.
- Create an isolated env, then install deps:
- `conda create --name navi python=3`
- `conda activate navi`
- `python -m pip install -r requirements.txt`
- Download data (v1.5 default): `wget https://storage.googleapis.com/gresearch/navi-dataset/navi_v1.5.tar.gz` then `tar -xzf navi_v1.5.tar.gz`.

## Build, Test, and Run
- No compile step; run scripts directly after installing deps.
- Quick import check:
```bash
python - <<'PY'
import data_util, visualization, gl.scene_renderer
print("imports ok")
PY
```
- Notebook workflow: `jupyter notebook "NAVI Dataset Tutorial.ipynb"` (quote the filename, which contains spaces), set your dataset root in the first cell, and run cells in order to validate loaders and rendering (see the first-cell sketch below).
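- First-cell sketch (the variable name and path are placeholders, not taken from the notebook):
```python
# Hypothetical first cell: point the tutorial at your local NAVI extract.
import os

navi_release_root = "/path/to/navi_v1.5"  # placeholder; use your local path
assert os.path.isdir(navi_release_root), "Set navi_release_root to your NAVI extract."
```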

## Coding Style & Naming Conventions
- Match existing Python style: 2-space indentation, type hints where practical, and concise helper functions.
- Use `snake_case` for variables/functions, `CapWords` for classes, and `UPPER_SNAKE` for constants.
- Add docstrings for new public helpers; keep logs via the stdlib `logging` module instead of prints (see the sketch below).
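- Style sketch (hypothetical helper, not an existing repo function):
```python
import logging
import os

MAX_SAMPLES = 4096  # UPPER_SNAKE for constants


def load_scene_ids(dataset_root: str) -> list:
  """Returns sorted scene ids found under `dataset_root`."""
  logging.info("Scanning %s", dataset_root)  # stdlib logging, not print
  return sorted(os.listdir(dataset_root))  # snake_case names, 2-space indent
```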

## Testing Guidelines
- No formal test suite yet; add targeted checks near the code you touch (e.g., minimal render call in `visualization.py`, dataset loader round-trip).
- Keep tests/data-paths configurable via environment variables or function args so they run without bundling data.
- Name new tests `test_<feature>.py` in the repo root or a `gl/tests/` subfolder if you add one; see the sketch below.
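- Test sketch (the file name and `NAVI_ROOT` environment variable are illustrative assumptions):
```python
# test_data_util.py (illustrative name; the NAVI_ROOT env var is an assumption)
import os
import unittest

import data_util


class DataUtilSmokeTest(unittest.TestCase):

  def test_module_imports(self):
    self.assertTrue(hasattr(data_util, "__name__"))

  @unittest.skipUnless(os.environ.get("NAVI_ROOT"), "NAVI_ROOT not set")
  def test_dataset_root_exists(self):
    self.assertTrue(os.path.isdir(os.environ["NAVI_ROOT"]))


if __name__ == "__main__":
  unittest.main()
```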

## Commit & Pull Request Guidelines
- Ensure the Google CLA is signed (see `CONTRIBUTING.md`); all changes go through PR review.
- Write imperative, scoped commit messages (e.g., `Add v1.5 split loader`, `Fix EGL fallback`).
- PRs should summarize intent, note dataset path assumptions, and describe how you validated (import check, notebook cell references, screenshots for visual changes).
- Do not commit dataset artifacts; add local paths to `.gitignore` if needed.
21 changes: 21 additions & 0 deletions Makefile
@@ -0,0 +1,21 @@
SHELL := /bin/bash


YELLOW := "\e[1;33m"
NC := "\e[0m"

# Logger function
INFO := @bash -c '\
printf $(YELLOW); \
echo "=> $$1"; \
printf $(NC)' SOME_VALUE

.venv: # creates the .venv folder if it does not exist
python3.10 -m venv .venv


.venv/bin/uv: .venv # installs the latest uv
.venv/bin/pip install -U uv

install: .venv/bin/uv
.venv/bin/python3 -m uv pip install -r requirements.txt
6 changes: 5 additions & 1 deletion README.md
@@ -100,15 +100,19 @@ Each of the `annotations.json` contains the following information.
## Download the dataset.

```bash

# Download (v1.5)
wget https://storage.googleapis.com/gresearch/navi-dataset/navi_v1.5.tar.gz

## Links for previous versions.
# v1.0
# wget https://storage.googleapis.com/gresearch/navi-dataset/navi_v1.0.tar.gz

# Change the version if you downloaded a previous one
VERSION=v1.5

# Extract
tar -xzf navi_v1.tar.gz
tar -xzf navi_${VERSION}.tar.gz
```

## Clone the code and use the dataset.
218 changes: 218 additions & 0 deletions compute_correspondences.py
@@ -0,0 +1,218 @@
from __future__ import annotations

import json
import os
import random
from typing import Optional, Tuple, Dict, List

import data_util
import visualization
from tqdm import tqdm


def compute_correspondences_for_image_pairs(
navi_release_root: str,
pairs_txt_path: str,
num_samples_per_scene: int,
output_path: str,
random_subsample_size: Optional[int] = None,
range_indices: Optional[Tuple[int, int]] = None,
seed: int = 0,
max_items_to_store: Optional[int] = None
) -> None:
"""Computes 2D-2D correspondences for image pairs listed in a text file and writes them to JSON.

The input text file must contain one pair per line, with three whitespace-separated fields:
<view_1_image_path> <view_2_image_path> <angular_rot>

Example line:
3d_dollhouse_sink/multiview-00-pixel_5/images/000.jpg
3d_dollhouse_sink/multiview-00-pixel_5/images/012.jpg
177.80534415031153

For each line, the function:
1) Parses object_id, scene folder, and the two image filenames.
2) Builds a query string of the form "{object_id}-{scene_type}-{scene_idx}-{camera_model}".
3) Loads the mesh and the two images using `data_util.load_pair_data_for_scene`.
4) Samples 3D points on the mesh and projects them into both images using
`data_util.sample_and_project_on_image_pair`.
5) Intersects visible samples across both views.
6) Writes one JSON entry per correspondence.

Args:
navi_release_root: Root directory of the NAVI release.
pairs_txt_path: Path to the .txt file listing image pairs and angular rotation.
num_samples_per_scene: Number of 3D samples drawn per scene.
output_path: Path to the output directory where the `correspondences.json` file is written. If `range_indices` is used, output is written under `output_path/range=<start_idx>_<end_idx>`.
random_subsample_size: If provided, randomly samples this many rows from the txt file
before processing.
range_indices: Optional (start_idx, end_idx) inclusive range of rows (0-based, after
filtering comments/blanks) to process. When provided:
- `random_subsample_size` must be None (range works on the full table).
- Output is written under `output_path/range=<start_idx>_<end_idx>`.
seed: Random seed used for subsampling.
max_items_to_store: Maximum number of items to store in the output file per scene. If None, stores all.

Output JSON format:
A list of dictionaries, each with the following keys:
- view_1_image_path (str)
- view_2_image_path (str)
- angular_rot (float)
- view_1_corresp_x (int)
- view_1_corresp_y (int)
- view_2_corresp_x (int)
- view_2_corresp_y (int)

Each input row can generate zero or many output entries depending on
how many 3D samples are visible in both views.
"""

if max_items_to_store is not None and max_items_to_store <= 1:
raise ValueError("`max_items_to_store` must be greater than 1 or None.")
if not os.path.isfile(pairs_txt_path):
raise FileNotFoundError(f"pairs_txt_path not found: {pairs_txt_path}")

# ---- Read and parse txt rows ----
rows: List[Tuple[str, str, float]] = []
with open(pairs_txt_path, "r") as f:
for line_no, raw in enumerate(f, start=1):
line = raw.strip()
if not line or line.startswith("#"):
continue

parts = line.split()
if len(parts) != 3:
raise ValueError(
f"Invalid line {line_no}: expected 3 fields, got {len(parts)}"
)

p1, p2, rot_str = parts
try:
rot = float(rot_str)
except ValueError as e:
raise ValueError(
f"Invalid rotation value on line {line_no}: {rot_str}"
) from e

rows.append((p1, p2, rot))

# ---- Range selection ----
if range_indices is not None and random_subsample_size is not None:
raise ValueError("`range_indices` and `random_subsample_size` are mutually exclusive.")

if range_indices is not None:
start_idx, end_idx = range_indices
if start_idx < 0 or end_idx < start_idx or end_idx >= len(rows):
raise ValueError(
f"Invalid range_indices ({start_idx}, {end_idx}) for {len(rows)} rows."
)
rows = rows[start_idx: end_idx + 1]
else:
start_idx = end_idx = None

# ---- Optional subsampling ----
if random_subsample_size is not None and random_subsample_size > 0:
rng = random.Random(seed)
rows = rng.sample(rows, min(random_subsample_size, len(rows)))

# ---- Helpers ----
def _parse_rel_image_path(rel_path: str) -> Tuple[str, str, str]:
"""Returns (object_id, scene_folder, filename)."""
parts = rel_path.replace("\\", "/").split("/")
if len(parts) < 4 or parts[-2] != "images":
raise ValueError(
f"Unexpected image path format: {rel_path}"
)
return parts[0], parts[1], parts[-1]

def _scene_folder_to_query(object_id: str, scene_folder: str) -> str:
"""Builds NAVI query string from scene folder."""
segs = scene_folder.split("-")
if len(segs) != 3:
raise ValueError(
f"Unexpected scene folder format: {scene_folder}"
)
scene_type, scene_idx, camera_model = segs
return f"{object_id}-{scene_type}-{scene_idx}-{camera_model}"

# ---- Compute correspondences ----
output_records: List[Dict[str, object]] = []
last_completed_index = -1

final_output_path = (
os.path.join(output_path, f"range={start_idx}_{end_idx}", "correspondences.json")
if start_idx is not None and end_idx is not None
else os.path.join(output_path, "correspondences.json")
)
output_dir = os.path.dirname(final_output_path) or "."
error_file_path = os.path.join(output_dir, "error_file.txt")

try:
base_offset = start_idx or 0
for idx, (view1_path, view2_path, angular_rot) in enumerate(
tqdm(rows, desc="Processing image pairs")
):
obj1, scene1, fname1 = _parse_rel_image_path(view1_path)
obj2, scene2, fname2 = _parse_rel_image_path(view2_path)

if obj1 != obj2 or scene1 != scene2:
raise ValueError(
"Image pair must belong to the same object and scene:\n"
f" {view1_path}\n {view2_path}"
)

query = _scene_folder_to_query(obj1, scene1)

annotations, mesh, images = data_util.load_pair_data_for_scene(
query=query,
navi_release_root=navi_release_root,
image_pair=(fname1, fname2),
)

triangles, _, _ = visualization.prepare_mesh_rendering_info(mesh)

samples_visible_1, samples_visible_2 = (
data_util.sample_and_project_on_image_pair(
triangles=triangles,
annotations=(annotations[0], annotations[1]),
images=(images[0], images[1]),
num_samples=num_samples_per_scene,
)
)

intersected = data_util.intersect_visible_samples(
(samples_visible_1, samples_visible_2)
)

cnt = 0
for (_, ((x1, y1), (x2, y2))) in intersected.items():
if max_items_to_store is not None and cnt >= max_items_to_store:
break
output_records.append({
"view_1_image_path": view1_path,
"view_2_image_path": view2_path,
"angular_rot": float(angular_rot),
"view_1_corresp_x": int(x1),
"view_1_corresp_y": int(y1),
"view_2_corresp_x": int(x2),
"view_2_corresp_y": int(y2),
})
cnt += 1

last_completed_index = base_offset + idx

except Exception:
# Persist partial progress and index of last successful row.
os.makedirs(output_dir, exist_ok=True)
with open(final_output_path, "w") as f:
json.dump(output_records, f, indent=2)
with open(error_file_path, "w") as ef:
ef.write(str(last_completed_index))
raise
else:
# Clean successful run output.
os.makedirs(output_dir, exist_ok=True)
if os.path.exists(error_file_path):
os.remove(error_file_path)
with open(final_output_path, "w") as f:
json.dump(output_records, f, indent=2)