Sentinel-1 vessels: tile scenes into detector windows to cut materialize memory #323
rpw1134 wants to merge 2 commits into
Split each scene into a grid of 2048px detector windows instead of one full-scene window, so materialize peak scales with the tile size rather than the scene. Add cross-tile distance NMS (reusing a distance_nms helper factored out of NMSDistanceMerger) to suppress vessels detected in more than one tile.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
    # Side length in pixels of the tiles each scene is split into for detection. Each tile is
    # materialized and run through the detector independently, so peak memory scales with the
    # tile size rather than the full scene.
    DETECTOR_TILE_SIZE = 2048
Per your note — tiles are disjoint here (no overlap). At 2048² a vessel landing on a seam is uncommon, so I skipped overlap/halo entirely and rely on the cross-tile NMS below to catch the few that do split.
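A minimal sketch of the disjoint tiling described above, assuming scene bounds are pixel coordinates `(x0, y0, x1, y1)` with an exclusive end. `tile_bounds` and the `Bounds` alias are hypothetical names for illustration, not the helper in this PR; edge tiles are clipped to the scene so they may be smaller than 2048px.

```python
from typing import Iterator

DETECTOR_TILE_SIZE = 2048

# (x0, y0, x1, y1) in pixels, end-exclusive. Hypothetical alias for this sketch.
Bounds = tuple[int, int, int, int]


def tile_bounds(scene: Bounds, tile_size: int = DETECTOR_TILE_SIZE) -> Iterator[Bounds]:
    """Yield disjoint (no overlap, no halo) tiles covering `scene` in row-major order."""
    x0, y0, x1, y1 = scene
    for ty in range(y0, y1, tile_size):
        for tx in range(x0, x1, tile_size):
            # Clip the last row/column of tiles to the scene bounds.
            yield (tx, ty, min(tx + tile_size, x1), min(ty + tile_size, y1))


# Example with made-up scene dimensions: 3 columns x 2 rows of tiles.
tiles = list(tile_bounds((0, 0, 5000, 3000)))
```

Because the tiles are disjoint, their areas sum to the scene area, and any duplicate detection can only come from a vessel that straddles a seam.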
    detections.append(detection)

    - return detections
    + return _merge_tile_detections(detections)
This is the across-window NMS you flagged as still needed even without overlap: a vessel straddling a tile boundary can be detected in both tiles, so we dedup per scene here. Centers are in the shared scene projection (all tiles use the same Projection), so they're directly comparable across tiles — no offset bookkeeping.
    DEFAULT_DISTANCE_THRESHOLD = 10


    def distance_nms(
On reusing NMSDistanceMerger: the class is a per-window CropPredictionMerger hook (merge(window, outputs, layer_config)), so it's only ever invoked within a single window and can't dedup across windows. The NMS algorithm is coordinate-agnostic, though, so I factored it into this distance_nms; _apply_nms now delegates to it (behavior unchanged) and the cross-tile pass calls it directly. So: reused the algorithm, no new merger class.
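To illustrate why the algorithm is coordinate-agnostic, here is a sketch of a greedy distance NMS over detection centers. It is an assumption-laden stand-in, not the actual `distance_nms` signature or implementation from this PR: `Detection`, `greedy_distance_nms`, the "suppress when distance is at most the threshold" rule, and the O(n²) scan are all choices made for this example. Only the default threshold of 10 comes from the diff above.

```python
import math
from typing import NamedTuple


class Detection(NamedTuple):
    """Hypothetical detection record; centers are in the shared scene projection."""
    x: float
    y: float
    score: float


def greedy_distance_nms(detections: list[Detection], threshold: float = 10.0) -> list[Detection]:
    """Keep a detection only if no higher-scoring kept detection lies within `threshold`."""
    kept: list[Detection] = []
    # Visit detections from highest to lowest score so the best one in a cluster wins.
    for det in sorted(detections, key=lambda d: d.score, reverse=True):
        if all(math.hypot(det.x - k.x, det.y - k.y) > threshold for k in kept):
            kept.append(det)
    return kept


# A vessel on the x=2048 seam is reported by both neighboring tiles.
scene_detections = [
    Detection(2046.0, 500.0, 0.9),   # from the left tile
    Detection(2050.0, 502.0, 0.7),   # same vessel, from the right tile
    Detection(3000.0, 3000.0, 0.8),  # unrelated vessel
]
merged = greedy_distance_nms(scene_detections)
```

Nothing in the function refers to windows or tile offsets, which is why the same routine can serve both the per-window merger and a per-scene cross-tile pass.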
Tiles over the nodata margin outside the SAR swath (and outside the historicals' overlap) never materialize all layers; skip them instead of raising, so the detector only runs on tiles with the target and both historical images.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
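The skip rule in this commit can be sketched as a filter over per-tile materialization results. The layer names, the `REQUIRED_LAYERS` constant, and the `tiles_ready_for_detection` helper are hypothetical and only illustrate the idea; the PR's actual check is not shown here.

```python
# Hypothetical layer names: the target image plus two historical images.
REQUIRED_LAYERS = frozenset({"target", "historical1", "historical2"})


def tiles_ready_for_detection(materialized: dict[str, set[str]]) -> list[str]:
    """Return the tiles whose materialized layers include every required layer.

    Tiles missing any layer are skipped silently instead of raising.
    """
    return [name for name, layers in materialized.items() if REQUIRED_LAYERS <= layers]


# Example: one interior tile, one outside the swath, one outside the historicals' overlap.
status = {
    "tile_0_0": {"target", "historical1", "historical2"},
    "tile_0_1": set(),
    "tile_1_0": {"target", "historical1"},
}
ready = tiles_ready_for_detection(status)
```

A filter like this drops partially covered tiles entirely, so any vessel inside one is never seen by the detector.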
Benchmark (ryan-dev, T4; high-peak S1D
| Metric | Tiled (this PR) | Full-scene baseline |
|---|---|---|
| Materialize peak | 13.3 GiB (anon 10.2) | ~20.4 GiB |
| Wall-clock | 27 min | ~30 min |
| Box stability | clean | wedges a 30 GB box (swap) |
| Detections | 111 | 114 |
Memory goal met — peak well under an n1-standard-8's ~25.9 GiB, runs end-to-end without the OOM. Tiling confirmed (scene → ~180 tiles); empty-tile drop + cross-tile NMS execute.
- Missing detections (in the full-scene baseline but not in the tiled run) likely come from dropped historical-edge tiles and/or seam cuts. Fix: materialize a nodata-filled historical for partial tiles instead of dropping them.
- Extra detections are likely seam-split duplicates that the cross-tile NMS didn't merge. Follow-up: revisit seam handling / NMS threshold.
Memory peak is also driven by workers=32 materializing 32 tiles at once — lowering that pushes the peak toward the single-tile floor.
Motivation
The S1 vessels sidecar OOMs intermittently on large S1C/S1D scenes. The detector runs on a single full-scene Window, so materialize_dataset builds a full-scene composite (FirstValidCompositor → whole-scene read); measured anon peak is ~27 GiB, above an n1-standard-8 node's ~25.9 GiB allocatable → node OOM.

Change
- Tile each scene into DETECTOR_TILE_SIZE (2048px) detector windows; each tile materializes and runs the detector independently, so peak memory scales with the tile, not the scene.
- Cross-tile NMS (_merge_tile_detections): a vessel near a tile boundary can be detected in two tiles, so dedup by distance NMS per scene. Reuses distance_nms, factored out of NMSDistanceMerger._apply_nms (behavior unchanged; merge() still delegates to it).

Draft: detection-equivalence validation against the full-scene path is pending. See inline notes.