paper_convergence_plot: 3-method support, variable-length paths #628
Merged
…pable labels
- Generalize the figure and table to N methods (was paired-2). Adds an OSM third method that can cover a subset of envs and is skipped silently for envs it is missing from. Bolding picks the per-row minimum.
- Replace the fixed-length-path assumption with per-path interpolation onto a common distance grid; mean/CI use nan-aware reductions. Resolves the eval format change to variable-length paths.
- Env list updated for the current paper figure: drop Seattle, add NewYork, order NY → SF → Boston → Boston Night → Middletown → Norway → Fort Myers → Framingham.
- CLI: --osm_dir, --osm_method, --osm_label, --baseline_label for swapping the third-method source and the displayed names. Defaults: WAG baseline, DINOv3+OSM third method (override per call as needed).
- Defaults point at the current v6 results dirs.
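The per-path interpolation and nan-aware aggregation described in this commit could look roughly like the sketch below. It is not the code from the PR: the function name `aggregate_paths`, the `(distance, prob_mass)` pair layout, and the normal-approximation 95% CI are assumptions made for illustration; only `min_paths_for_ci` (=10) and the use of `nanmean`/`nanstd` come from the PR text.

```python
import numpy as np


def aggregate_paths(paths, grid, min_paths_for_ci=10):
    """Aggregate variable-length paths on a shared distance grid.

    paths: list of (distance, prob_mass) pairs of 1-D arrays; each
           distance array is increasing, lengths may differ per path.
    grid:  1-D array of distances to resample every path onto.
    """
    stacked = np.full((len(paths), len(grid)), np.nan)
    for i, (dist, prob) in enumerate(paths):
        # left/right=NaN marks grid points outside this path's range
        # instead of silently extending the first/last sample.
        stacked[i] = np.interp(grid, dist, prob, left=np.nan, right=np.nan)

    # Number of paths that actually contributed at each grid point.
    counts = np.sum(~np.isnan(stacked), axis=0)
    keep = counts >= min_paths_for_ci

    mean = np.nanmean(stacked[:, keep], axis=0)
    std = np.nanstd(stacked[:, keep], axis=0)
    # Assumption: normal-approximation 95% interval on the mean.
    ci95 = 1.96 * std / np.sqrt(counts[keep])
    return grid[keep], mean, ci95
```

Grid points supported by too few paths are dropped rather than plotted with an unreliable band, which matches the truncation behaviour named in the PR summary.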
…e_ian_sw
Builds on the 3-method refactor at the head of this branch.
- 4th method 'early' wired into method_dirs/method_colors (green #4CAF50)
/method_labels with subset-coverage semantics matching 'osm'; new
--early_dir/--early_method/--early_label CLI flags. The 'early' slot is
skipped when --early_dir is not provided so the no-arg / 3-method
invocation still works.
- Table polish:
- \begin{table*} so the table spans both columns in a 2-column paper
layout
- down-arrow markers on metric headers (all metrics are "lower is better")
- rename "Error" → "Final Error"
- drop CC50 row; CC100 + Final Error is the canonical reporting set
- CI to 1 decimal place (e.g. 9±0.1 instead of 9±0)
- y-axis label "P(mass within r)" → "Probability mass within radius"
- ENVIRONMENTS: swap post_hurricane_ian → post_hurricane_ian_sw to match
the newer Fort Myers subset (1073 panos / 471 822 sat patches, slightly
shifted bbox); DISPLAY_NAMES gains a post_hurricane_ian_sw → "Fort Myers"
entry so the visible label is unchanged.
- dataset_statistics.py: add NewYork and mapillary/post_hurricane_ian_sw
to PANO_LANDMARK_DIR_MAP so per-pano landmark counts populate for the
paper table's NY row and the swapped Fort Myers row.
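The table conventions listed in these commits (down-arrow header markers, `--` for a method with no results, one-decimal CIs, per-row minimum in bold) can be illustrated with a small row formatter. This is a hedged sketch only: `format_row`, its argument layout, and the integer formatting of the mean are assumptions, not the script's actual helpers.

```python
def format_row(metric, cells, methods):
    """Build one LaTeX table row.

    metric:  row label, e.g. "Final Error".
    cells:   dict mapping method name -> (mean, ci); methods without
             results for this env are simply absent from the dict.
    methods: column order, e.g. ["safa", "osm", "ours"].
    """
    present = {m: cells[m] for m in methods if m in cells}
    # All metrics are "lower is better", so the row minimum is bolded.
    best = min(present, key=lambda m: present[m][0]) if present else None

    parts = [f"{metric} $\\downarrow$"]
    for m in methods:
        if m not in present:
            parts.append("--")  # method has no results for this env
            continue
        mean, ci = present[m]
        # One decimal on the CI so tight bands do not render as "±0".
        text = f"{mean:.0f}$\\pm${ci:.1f}"
        parts.append(f"\\textbf{{{text}}}" if m == best else text)
    return " & ".join(parts) + " \\\\"
```

For example, `format_row("Final Error", {"safa": (12.0, 0.34), "ours": (9.0, 0.06)}, ["safa", "osm", "ours"])` yields a row whose `osm` cell is `--` and whose `ours` cell is bolded.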
efahnestock
commented
May 20, 2026
Summary
Refactor `paper_convergence_plot.py` so the figure-B convergence-curves plot and its companion summary table can render the current paper figure (NY → SF → Boston → Boston Night → Middletown → Norway → Fort Myers → Framingham, three methods compared, drop Seattle).
- A third method (`osm`) is wired in alongside the existing `safa` / `ours` slots, with subset coverage allowed (envs without that method's results are silently skipped, table shows `--`). Per-row bolding picks the minimum across methods.
- Per-path `prob_mass` is now interpolated onto a common distance grid; aggregation uses `nanmean` / `nanstd`, and grid points where fewer than `min_paths_for_ci` (=10) paths contributed are truncated from the rendered curve.
- New `--osm_label` and `--baseline_label` flags let the displayed names (WAG, WAG+OSM, Ours, etc.) be set without code edits. The existing `--osm_dir` / `--osm_method` flags already cover swapping the third method's source.
Test plan
- `--osm_dir .../260504_160045_osm_baseline --osm_method safa_plus_osm_safa_raw --osm_label WAG+OSM`: produces a 4×2 grid, 8 envs, all three methods plotted with shaded 95% CIs.
- Summary table shows `--` in cells where a method lacks an env.
Related
Built on top of the matrix-row-assert fix in #627, which is what allowed the v6 results for Boston / Boston Night to be regenerated and finally show non-saturated curves on this plot.
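The subset-coverage behaviour described in the Summary (a method with no results for an env is skipped instead of raising) might be implemented along these lines. The helper name `methods_for_env` and the `<results_dir>/<env>/` directory layout are assumptions for illustration; the real script may locate results differently.

```python
from pathlib import Path


def methods_for_env(method_dirs, env):
    """Return {method: results_path} for methods that cover `env`.

    method_dirs: dict mapping method name -> results root, or None
                 when that slot was not configured on the command line.
    env:         environment name, e.g. "Boston".
    """
    available = {}
    for method, root in method_dirs.items():
        if root is None:
            continue  # optional slot not provided: skip silently
        env_dir = Path(root) / env  # assumed layout: <results_dir>/<env>/
        if env_dir.is_dir():
            available[method] = env_dir
    return available
```

Plotting and table code can then iterate over the returned dict, so an env covered by only two methods still renders with the remaining columns left empty.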
Update (commit 1478642)
Follow-up on top of the 3-method refactor with everything needed to produce the current paper figure:
- `--early_dir` / `--early_method` / `--early_label` CLI flags wire a fourth method (`early`, green `#4CAF50`) into `method_dirs` / `method_colors` / `method_labels` with the same subset-coverage semantics as `osm`. The slot is skipped silently when `--early_dir` is not provided, so the 3-method invocation still works.
- Table polish: `\begin{table*}` for two-column page span; `$\downarrow$` markers on metric headers (all metrics are "lower is better"); rename "Error" → "Final Error"; drop the CC₅₀ row (CC₁₀₀ + Final Error is the canonical reporting set); CIs to 1 decimal place so very tight bands don't render as the misleading `$\pm$ 0`.
- `post_hurricane_ian` → `post_hurricane_ian_sw`: swapped in `ENVIRONMENTS`, with a matching `DISPLAY_NAMES` entry so the visible label stays "Fort Myers". Reflects the newer Fort Myers subset (1073 panos / 471 822 sat patches, slightly shifted bbox).
- `dataset_statistics.py` (2-line addition): `PANO_LANDMARK_DIR_MAP` gains `NewYork` and `mapillary/post_hurricane_ian_sw` so per-pano landmark counts populate for the paper table's NY row and the swapped Fort Myers row.
Example invocation for the final 4-method figure: