paper_convergence_plot: 3-method support, variable-length paths by efahnestock · Pull Request #628 · ewfuentes/robot

efahnestock · 2026-05-11T18:20:24Z

Summary

Refactor paper_convergence_plot.py so the figure-B convergence-curves plot and its companion summary table can render the current paper figure (NY → SF → Boston → Boston Night → Middletown → Norway → Fort Myers → Framingham, three methods compared, Seattle dropped).

N-method generalization. The figure and table previously supported only a 2-method paired comparison; both now generalize to N methods. A third method (osm) is wired in alongside the existing safa/ours slots, with subset coverage allowed (envs without that method's results are silently skipped, and the table shows -- in those cells). Per-row bolding picks the minimum across methods.
Variable-length paths. The previous code asserted every path under a given (env, method) had the same step count; that broke once the eval started writing paths of different lengths. Each path's prob_mass is now interpolated onto a common distance grid; aggregation uses nanmean / nanstd, and grid points where fewer than min_paths_for_ci (=10) paths contributed are truncated from the rendered curve.
Env list updated for the current figure. Drops Seattle, adds NewYork, reorders to NY → SF → Boston Snowy → Boston Night → Middletown → Norway → Fort Myers → Framingham. 4×2 grid fits cleanly.
CLI labels swappable. New --osm_label and --baseline_label flags so the displayed names (WAG, WAG+OSM, Ours, etc.) can be set without code edits. Existing --osm_dir / --osm_method already cover swapping the third method's source.
Defaults point at the current v6 results dirs. No-arg invocation produces a sane figure.
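The variable-length handling described above can be sketched roughly as follows. This is a minimal illustration, not the code in paper_convergence_plot.py: the function name `aggregate_paths`, the `grid_step_m` parameter, the input layout, and the normal-approximation 95% CI are all assumptions made for the example. Only the interpolate-then-nanmean/nanstd approach and the `min_paths_for_ci` threshold come from the description.

```python
import numpy as np


def aggregate_paths(paths, grid_step_m=10.0, min_paths_for_ci=10):
    """Interpolate variable-length paths onto one distance grid and aggregate.

    `paths` is a list of (distance_m, prob_mass) pairs of 1-D arrays with
    increasing distance. Names and the grid step are hypothetical.
    """
    max_dist = max(dist[-1] for dist, _ in paths)
    grid = np.arange(0.0, max_dist + grid_step_m, grid_step_m)

    # One row per path. Points outside a path's range become NaN so a short
    # path stops contributing instead of being extrapolated.
    stacked = np.full((len(paths), grid.size), np.nan)
    for i, (dist, mass) in enumerate(paths):
        stacked[i] = np.interp(grid, dist, mass, left=np.nan, right=np.nan)

    # Drop grid points supported by too few paths before reducing.
    n_valid = np.sum(~np.isnan(stacked), axis=0)
    keep = n_valid >= min_paths_for_ci

    mean = np.nanmean(stacked[:, keep], axis=0)
    std = np.nanstd(stacked[:, keep], axis=0)
    # Assumed CI definition: normal approximation on the standard error.
    ci95 = 1.96 * std / np.sqrt(n_valid[keep])
    return grid[keep], mean, ci95
```

Filtering on the per-point path count before calling the nan-aware reductions also avoids taking a mean over an all-NaN column at the tail of the grid.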

Test plan

Render with three current-paper methods: --osm_dir .../260504_160045_osm_baseline --osm_method safa_plus_osm_safa_raw --osm_label WAG+OSM — produces 4×2 grid, 8 envs, all three methods plotted with shaded 95% CIs.
LaTeX summary table has the expected per-row bolding and -- cells where a method lacks an env.
No-arg invocation still works.
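For reference, the table behavior checked in the test plan (bold the per-row minimum, print -- where a method has no result for an env) could look something like the sketch below. The helper name `format_row`, its arguments, and the one-decimal formatting are hypothetical and chosen only for illustration; the actual script may structure this differently.

```python
def format_row(metric_name, values_by_method, method_order):
    """Build one LaTeX table row, bolding the minimum (lower is better).

    `values_by_method` maps a method key to a float. Methods with no result
    for this env are absent from the dict and render as "--".
    """
    present = [values_by_method[m] for m in method_order if m in values_by_method]
    best = min(present) if present else None

    cells = []
    for m in method_order:
        if m not in values_by_method:
            cells.append("--")
            continue
        value = values_by_method[m]
        text = f"{value:.1f}"
        # Compare unrounded values; exact ties are all bolded.
        cells.append(rf"\textbf{{{text}}}" if value == best else text)
    return " & ".join([metric_name] + cells) + r" \\"
```

Calling `format_row("Final Error", {"safa": 42.0, "ours": 12.34}, ["safa", "osm", "ours"])` yields a row with a -- cell in the osm column and the ours value in bold.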

Update (commit `1478642`)

Follow-up on top of the 3-method refactor with everything needed to produce the current paper figure:

4-method support. New --early_dir / --early_method / --early_label CLI flags wire a fourth method (early, green #4CAF50) into method_dirs / method_colors / method_labels with the same subset-coverage semantics as osm. Skipped silently when --early_dir is not provided, so the 3-method invocation still works.
Table polish. \begin{table*} for two-column page span; $\downarrow$ markers on metric headers (all metrics are "lower is better"); rename "Error" → "Final Error"; drop CC₅₀ row (CC₁₀₀ + Final Error is the canonical reporting set); CIs to 1 decimal place so very tight bands don't render as the misleading $\pm$ 0.
post_hurricane_ian_sw. Replaces post_hurricane_ian in ENVIRONMENTS, with a matching DISPLAY_NAMES entry so the visible label stays "Fort Myers". Reflects the newer Fort Myers subset (1073 panos / 471 822 sat patches, slightly shifted bbox).
y-axis label "P(mass within r)" → "Probability mass within radius".
dataset_statistics.py (2-line addition): PANO_LANDMARK_DIR_MAP gains NewYork and mapillary/post_hurricane_ian_sw so per-pano landmark counts populate for the paper table's NY row and the swapped Fort Myers row.
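The "skipped silently when --early_dir is not provided" behavior amounts to registering the optional slot only when a directory is given. A rough sketch under stated assumptions: the helper `register_optional_method` is hypothetical, and only the dict names (method_dirs, method_colors, method_labels) and the green #4CAF50 come from the description above.

```python
def register_optional_method(method_dirs, method_colors, method_labels,
                             key, directory, label, color):
    """Add an optional method slot, or leave the dicts untouched.

    Returns True if the method was registered. `directory` is expected to be
    None when the corresponding CLI flag was not provided.
    """
    if directory is None:
        # No results dir given: the slot is skipped and the existing
        # methods are plotted as before.
        return False
    method_dirs[key] = directory
    method_colors[key] = color
    method_labels[key] = label
    return True
```

With this shape, the 3-method invocation is just the case where the `early` call returns False; for example, `register_optional_method(dirs, colors, labels, "early", None, "Early Fusion", "#4CAF50")` leaves all three dicts unchanged.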

Example invocation for the final 4-method figure:

bazel run //experimental/overhead_matching/swag/scripts:paper_convergence_plot -- \
  --output_dir=/data/overhead_matching/evaluation \
  --osm_dir=/data/overhead_matching/evaluation/results/260504_160045_osm_baseline \
  --osm_method=safa_plus_osm_safa_raw --osm_label="WAG+OSM" \
  --early_dir=/data/overhead_matching/evaluation/results/260513_early_fusion_no_positions_v1 \
  --early_method=early_fusion_no_positions_v1 --early_label="Early Fusion"

…pable labels

- Generalize the figure and table to N methods (was paired-2). Adds an OSM third method that can cover a subset of envs and skips silently for envs it's missing. Bolding picks the per-row minimum.
- Replace fixed-length-path assumption with per-path interpolation onto a common distance grid; mean/CI use nan-aware reductions. Resolves the eval format change to variable-length paths.
- Env list updated for the current paper figure: drop Seattle, add NewYork, order NY → SF → Boston → Boston Night → Middletown → Norway → Fort Myers → Framingham.
- CLI: --osm_dir, --osm_method, --osm_label, --baseline_label for swapping the third-method source and the displayed names. Defaults: WAG baseline, DINOv3+OSM third method (override per call as needed).
- Defaults point at the current v6 results dirs.

…e_ian_sw

Builds on the 3-method refactor at the head of this branch.

- 4th method 'early' wired into method_dirs/method_colors (green #4CAF50)/method_labels with subset-coverage semantics matching 'osm'; new --early_dir/--early_method/--early_label CLI flags. The 'early' slot is skipped when --early_dir is not provided so the no-arg / 3-method invocation still works.
- Table polish:
  - \begin{table*} so the table spans both columns in a 2-column paper layout
  - down-arrow markers on metric headers (all metrics are "lower is better")
  - rename "Error" → "Final Error"
  - drop CC50 row; CC100 + Final Error is the canonical reporting set
  - CI to 1 decimal place (e.g. 9±0.1 instead of 9±0)
- y-axis label "P(mass within r)" → "Probability mass within radius"
- ENVIRONMENTS: swap post_hurricane_ian → post_hurricane_ian_sw to match the newer Fort Myers subset (1073 panos / 471 822 sat patches, slightly shifted bbox); DISPLAY_NAMES gains a post_hurricane_ian_sw → "Fort Myers" entry so the visible label is unchanged.
- dataset_statistics.py: add NewYork and mapillary/post_hurricane_ian_sw to PANO_LANDMARK_DIR_MAP so per-pano landmark counts populate for the paper table's NY row and the swapped Fort Myers row.

efahnestock added 5 commits May 11, 2026 13:55

paper_convergence_plot: CIs to 2 decimals, (m) units in metric headers

b90a3aa

paper_convergence_plot: 2-decimal mean to match CI precision

d4c2988

paper_convergence_plot: mean and CI to 1 decimal

5cb3dd1

efahnestock commented May 20, 2026

Comment thread experimental/overhead_matching/swag/scripts/paper_convergence_plot.py Outdated

paper_convergence_plot: throw on missing data, opt-in partial coverage flag

3d0e45a

efahnestock merged commit 870a336 into main May 20, 2026
3 checks passed

efahnestock deleted the paper-plot-multi-method branch May 20, 2026 18:05

efahnestock merged 6 commits into main from paper-plot-multi-method
