Sentinel-1 vessels: tile scenes into detector windows to cut materialize memory #323

Draft
rpw1134 wants to merge 2 commits into master from ryan/sentinel1-tile-windows
rpw1134 commented Jun 22, 2026

Motivation

The S1 vessels sidecar intermittently OOMs on large S1C/S1D scenes. The detector runs on a single full-scene Window, so materialize_dataset builds a full-scene composite (FirstValidCompositor reads the whole scene). Measured anon peak is ~27 GiB, above the ~25.9 GiB allocatable on an n1-standard-8 node, so the node OOMs.

Change

  • Split each scene into a grid of disjoint DETECTOR_TILE_SIZE (2048px) detector windows; each tile materializes and runs the detector independently, so peak memory scales with the tile, not the scene.
  • Add cross-tile suppression (_merge_tile_detections): a vessel near a tile boundary can be detected in two tiles, so dedup by distance NMS per scene. Reuses distance_nms, factored out of NMSDistanceMerger._apply_nms (behavior unchanged; merge() still delegates to it).
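
A minimal sketch of the tiling step described above, assuming a scene is addressed by integer pixel bounds `(x0, y0, x1, y1)` and that edge tiles are clipped to the scene. The helper name `iter_tile_bounds` and the bounds representation are illustrative assumptions, not the PR's actual code.

```python
from typing import Iterator

DETECTOR_TILE_SIZE = 2048  # value taken from this PR


def iter_tile_bounds(
    bounds: tuple[int, int, int, int], tile_size: int = DETECTOR_TILE_SIZE
) -> Iterator[tuple[int, int, int, int]]:
    """Yield disjoint (x0, y0, x1, y1) pixel bounds that cover `bounds`.

    Edge tiles are clipped to the scene, so they may be smaller than tile_size.
    """
    x_min, y_min, x_max, y_max = bounds
    for y0 in range(y_min, y_max, tile_size):
        for x0 in range(x_min, x_max, tile_size):
            yield (x0, y0, min(x0 + tile_size, x_max), min(y0 + tile_size, y_max))


# Hypothetical 5000 x 3000 px scene: 3 columns x 2 rows of tiles.
tiles = list(iter_tile_bounds((0, 0, 5000, 3000)))
print(len(tiles))
```

Because the tiles are disjoint and clipped, their areas sum to the scene area, so no pixel is materialized twice.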

Draft: detection-equivalence validation against the full-scene path is pending. See inline notes.

Split each scene into a grid of 2048px detector windows instead of one full-scene
window, so materialize peak scales with the tile size rather than the scene. Add
cross-tile distance NMS (reusing a distance_nms helper factored out of
NMSDistanceMerger) to suppress vessels detected in more than one tile.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
# Side length in pixels of the tiles each scene is split into for detection. Each tile is
# materialized and run through the detector independently, so peak memory scales with the
# tile size rather than the full scene.
DETECTOR_TILE_SIZE = 2048

Per your note — tiles are disjoint here (no overlap). At 2048² a vessel landing on a seam is uncommon, so I skipped overlap/halo entirely and rely on the cross-tile NMS below to catch the few that do split.

  detections.append(detection)

- return detections
+ return _merge_tile_detections(detections)

This is the across-window NMS you flagged as still needed even without overlap: a vessel straddling a tile boundary can be detected in both tiles, so we dedup per scene here. Centers are in the shared scene projection (all tiles use the same Projection), so they're directly comparable across tiles — no offset bookkeeping.

Comment thread rslp/utils/nms.py
DEFAULT_DISTANCE_THRESHOLD = 10


def distance_nms(

On reusing NMSDistanceMerger: the class is a per-window CropPredictionMerger hook (merge(window, outputs, layer_config)), so it's only ever invoked within a single window and can't dedup across windows. The NMS algorithm is coordinate-agnostic, though, so I factored it into this distance_nms; _apply_nms now delegates to it (behavior unchanged) and the cross-tile pass calls it directly. So: reused the algorithm, no new merger class.
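
To illustrate the shape of that refactor, here is a rough sketch of a greedy distance NMS as a free function, with a per-window class delegating to it. The `Detection` record, `greedy_distance_nms`, and `WindowMerger` are hypothetical stand-ins; the real `distance_nms` signature and the `NMSDistanceMerger` internals are not shown in this thread and may differ (for example by using a spatial index rather than a pairwise scan).

```python
import math
from dataclasses import dataclass


@dataclass
class Detection:
    # Hypothetical record; the real detection type in rslp is not shown here.
    x: float  # center x in the shared scene projection
    y: float  # center y in the shared scene projection
    score: float


def greedy_distance_nms(detections: list[Detection], threshold: float) -> list[Detection]:
    """Keep the highest-scoring detection among any that lie within `threshold`."""
    kept: list[Detection] = []
    for det in sorted(detections, key=lambda d: d.score, reverse=True):
        # Keep this detection only if it is farther than threshold from all kept ones.
        if all(math.hypot(det.x - k.x, det.y - k.y) > threshold for k in kept):
            kept.append(det)
    return kept


class WindowMerger:
    """Hypothetical per-window merger that delegates to the shared helper."""

    def __init__(self, threshold: float = 10) -> None:
        self.threshold = threshold

    def merge(self, outputs: list[Detection]) -> list[Detection]:
        return greedy_distance_nms(outputs, self.threshold)


# One vessel seen from two adjacent tiles near a seam, plus one unrelated vessel.
dets = [
    Detection(2047.0, 500.0, 0.9),
    Detection(2049.0, 501.0, 0.7),
    Detection(3000.0, 800.0, 0.8),
]
merged = greedy_distance_nms(dets, threshold=10)
```

The function only looks at coordinates and scores, so the same call works for detections from one window or from many tiles, provided all centers share one projection.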

Tiles over the nodata margin outside the SAR swath (and outside the historicals'
overlap) never materialize all layers; skip them instead of raising, so the
detector only runs on tiles with the target and both historical images.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
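
The skip rule in the commit message above could look roughly like the following. The layer names and the `completed_layers_by_tile` mapping are assumptions made for illustration; how the PR actually checks which layers materialized is not shown here.

```python
# Hypothetical layer names: one target image and two historical images.
REQUIRED_LAYERS = ("target", "historical1", "historical2")


def tiles_ready_for_detection(completed_layers_by_tile: dict[str, set[str]]) -> list[str]:
    """Return the names of tiles whose required layers all materialized."""
    return [
        name
        for name, layers in completed_layers_by_tile.items()
        if all(layer in layers for layer in REQUIRED_LAYERS)
    ]


completed = {
    "tile_0_0": {"target", "historical1", "historical2"},
    "tile_0_1": {"target"},  # outside the historicals' overlap
    "tile_0_2": set(),  # nodata margin outside the SAR swath
}
ready = tiles_ready_for_detection(completed)
print(ready)
```

Incomplete tiles are filtered out rather than raising, so a scene with a nodata margin still runs end to end.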
rpw1134 commented Jun 22, 2026

Benchmark (ryan-dev, T4; high-peak S1D …75D5; rslearn pinned to deployed #667, so this isolates the tiling effect)

| Metric | Tiled (this PR) | Full-scene baseline |
| --- | --- | --- |
| Materialize peak | 13.3 GiB (anon 10.2) | ~20.4 GiB |
| Wall-clock | 27 min | ~30 min |
| Box stability | clean | wedges a 30 GB box (swap) |
| Detections | 111 | 114 |

Memory goal met — peak well under an n1-standard-8's ~25.9 GiB, runs end-to-end without the OOM. Tiling confirmed (scene → ~180 tiles); empty-tile drop + cross-tile NMS execute.

⚠️ Open / blocking before ready: detection equivalence is 106/114 matched within 50 m (8 missing, 5 extra), so the tiled path is not yet equivalent to the full-scene path. Need to localize the 13 discrepant detections:

  • missing likely from dropped historical-edge tiles and/or seam cuts → fix: materialize a nodata-filled historical for partial tiles instead of dropping them;
  • extra likely seam-split duplicates the cross-tile NMS didn't merge → revisit seam handling / NMS threshold.
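
For reference, one way to compute matched / missing / extra counts like the ones above is a greedy one-to-one match within a distance cutoff. This is only a sketch: it assumes detection centers are `(x, y)` in meters in a common projected CRS, and the function name and matching strategy are illustrative, not necessarily how the numbers in this comment were produced.

```python
import math


def match_detections(
    baseline: list[tuple[float, float]],
    candidate: list[tuple[float, float]],
    max_dist_m: float = 50.0,
) -> tuple[int, int, int]:
    """Greedy one-to-one matching; returns (matched, missing, extra)."""
    # All baseline/candidate pairs within range, closest first.
    pairs = sorted(
        (math.hypot(bx - cx, by - cy), i, j)
        for i, (bx, by) in enumerate(baseline)
        for j, (cx, cy) in enumerate(candidate)
        if math.hypot(bx - cx, by - cy) <= max_dist_m
    )
    used_b: set[int] = set()
    used_c: set[int] = set()
    for _, i, j in pairs:
        # Each detection may take part in at most one match.
        if i not in used_b and j not in used_c:
            used_b.add(i)
            used_c.add(j)
    matched = len(used_b)
    return matched, len(baseline) - matched, len(candidate) - matched


# Toy data: two close pairs, one unmatched baseline point, two unmatched candidates.
baseline = [(0.0, 0.0), (1000.0, 0.0), (5000.0, 5000.0)]
candidate = [(10.0, 5.0), (1000.0, 30.0), (9000.0, 9000.0), (9100.0, 9000.0)]
result = match_detections(baseline, candidate)
print(result)
```

With this definition, matched + missing equals the baseline count and matched + extra equals the tiled count, which is consistent with the reported 106 + 8 = 114 and 106 + 5 = 111.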

Memory peak is also driven by workers=32 materializing 32 tiles at once — lowering that pushes the peak toward the single-tile floor.
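
A back-of-the-envelope sketch of how the raw input footprint scales with the worker count. The band count, sample size, and number of images per tile below are assumptions chosen for illustration (2 bands, 4-byte samples, 1 target plus 2 historicals); the estimate covers raw pixel arrays only and ignores decode buffers, compositing copies, and interpreter overhead, so it is a lower bound rather than a prediction of the measured peak.

```python
GIB = 2**30


def raw_tile_bytes(tile_size: int, bands: int, bytes_per_sample: int, images: int) -> int:
    """Bytes needed to hold the raw pixel arrays for one tile."""
    return tile_size * tile_size * bands * bytes_per_sample * images


# Assumed parameters, not taken from the PR: 2 bands, float32, 3 images per tile.
tile_bytes = raw_tile_bytes(tile_size=2048, bands=2, bytes_per_sample=4, images=3)

for workers in (32, 8, 1):
    # Concurrent footprint if every worker holds one tile's raw arrays at once.
    print(workers, round(workers * tile_bytes / GIB, 2), "GiB")
```

Under these assumptions the raw arrays scale linearly with the number of concurrent workers, which is the effect the note above describes.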
