76 changes: 58 additions & 18 deletions README.md
@@ -69,23 +69,74 @@ Try our minimal demo - take a picture with your phone in any city and find its e
<em>OrienterNet positions any image within a large area - try it with your own images!</em>
</p>

## Mapillary Geo-Localization (MGL) dataset

To train and evaluate OrienterNet, we introduce a large crowd-sourced dataset of images captured across multiple cities through the [Mapillary platform](https://www.mapillary.com/app/). To obtain the dataset:

1. Create a developer account at [mapillary.com](https://www.mapillary.com/dashboard/developers) and obtain a free access token.
2. Run the following script to download the data from Mapillary and prepare it:

```bash
python -m maploc.data.mapillary.prepare --token $YOUR_ACCESS_TOKEN
```

By default, the data is written to the directory `./datasets/MGL/` and requires about 80 GB of free disk space.

#### Using different OpenStreetMap data

<details>
<summary>[Click to expand]</summary>

Multiple sources of OpenStreetMap (OSM) data can be selected for the dataset scripts `maploc.data.[mapillary,kitti].prepare` using the `--osm_source` option, which can take the following values:
- `PRECOMPUTED` (default): download pre-computed raster tiles that are hosted [here](https://cvg-data.inf.ethz.ch/OrienterNet_CVPR2023/tiles/).
- `CACHED`: compute the raster tiles from raw OSM data downloaded from [Geofabrik](https://download.geofabrik.de/) in November 2021 and hosted [here](https://cvg-data.inf.ethz.ch/OrienterNet_CVPR2023/osm/). This is useful if you wish to use different OSM classes but want to compare the results to the pre-computed tiles.
- `LATEST`: fetch the latest OSM data from [Geofabrik](https://download.geofabrik.de/). This requires the [Osmium tool](https://osmcode.org/osmium-tool/) to be available on your system; it can be installed via `apt-get install osmium-tool` on Ubuntu or `brew install osmium-tool` on macOS.

</details>
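On the command line, the `--osm_source` choice maps naturally onto an enum. The sketch below is a hypothetical mirror of how such an `OSMDataSource` enum can be wired into `argparse` — the enum member names follow the documented choices, but the actual class in `maploc` may differ:

```python
import argparse
from enum import Enum, auto

# Hypothetical mirror of the OSMDataSource enum consumed by the prepare
# scripts; the names follow the README's documented choices.
class OSMDataSource(Enum):
    PRECOMPUTED = auto()  # download pre-computed raster tiles
    CACHED = auto()  # rasterize the archived raw OSM data (Nov. 2021)
    LATEST = auto()  # fetch and rasterize the latest Geofabrik extract

parser = argparse.ArgumentParser()
parser.add_argument(
    "--osm_source",
    default=OSMDataSource.PRECOMPUTED.name,
    choices=[e.name for e in OSMDataSource],
)
# Simulate passing --osm_source CACHED on the command line.
args = parser.parse_args(["--osm_source", "CACHED"])
source = OSMDataSource[args.osm_source]  # back to an enum member
print(source)  # → OSMDataSource.CACHED
```

Exposing the enum names as `choices` means an invalid value fails fast at argument parsing rather than deep inside the data pipeline.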

#### Extending the dataset

<details>
<summary>[Click to expand]</summary>

By default, the dataset script fetches data that was queried in early 2022 from 13 locations. The dataset can be extended by adding cities or by querying images recently uploaded to Mapillary. To proceed, follow these steps:
1. For each new location, add an entry to `maploc.data.mapillary.config.location_to_params` following the format:
```python
"location_name": {
"bbox": BoundaryBox((lat_min, long_min), (lat_max, long_max)),
"filters": {"is_pano": True},
# or other filters like creator_username, model, etc.
# all described at https://www.mapillary.com/developer/api-documentation#image
}
```
The bounding box can easily be selected using [this tool](https://boundingbox.klokantech.com/). We recommend searching for cities with a high density of 360 panoramic images on the [Mapillary platform](https://www.mapillary.com/app/).

2. Query the corresponding images and split them into training and validation subsets with:
```bash
python -m maploc.data.mapillary.split --token $YOUR_ACCESS_TOKEN --output_filename splits_MGL_v2_{scene}.json --data_dir datasets/MGL_v2
```
Note that, for the 13 default locations, running this script will produce splits slightly different from the default split file `splits_MGL_13loc.json`, since new images have been uploaded since 2022 and others have been taken down.

3. Fetch and prepare the resulting data:
```bash
python -m maploc.data.mapillary.prepare --token $YOUR_ACCESS_TOKEN --split_filename splits_MGL_v2_{scene}.json --data_dir datasets/MGL_v2
```
4. To train or evaluate with this new version of the dataset, add the following CLI flags:
```bash
python -m maploc.[train,evaluation...] [...] data.data_dir=datasets/MGL_v2 data.split=splits_MGL_v2_{scene}.json
```

</details>
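To sanity-check a bounding box before adding a new location, a minimal stand-in for `BoundaryBox` with a containment test can help. This is a simplified sketch, not the repository's class — the real one also supports margin arithmetic (e.g. `bbox + 512`, as used in the prepare scripts) and projection:

```python
from dataclasses import dataclass

# Simplified stand-in for maploc's BoundaryBox: corners are stored as
# (latitude, longitude) pairs, minimum first.
@dataclass
class BoundaryBox:
    min_corner: tuple  # (lat_min, lon_min)
    max_corner: tuple  # (lat_max, lon_max)

    def contains(self, lat: float, lon: float) -> bool:
        return (
            self.min_corner[0] <= lat <= self.max_corner[0]
            and self.min_corner[1] <= lon <= self.max_corner[1]
        )

# The Berlin box from the dataset configuration:
berlin = BoundaryBox((52.459656, 13.416271), (52.499195, 13.469829))
print(berlin.contains(52.48, 13.44))  # → True
```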
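The `data.data_dir=...` arguments in step 4 above are dotted-key configuration overrides in the OmegaConf/Hydra style. The following plain-Python sketch only illustrates the syntax — it is not the project's actual parsing code, and values are kept as strings for simplicity:

```python
def apply_override(cfg: dict, override: str) -> None:
    """Apply a single 'a.b.c=value' override in place (values kept as strings)."""
    dotted, value = override.split("=", 1)
    *parents, leaf = dotted.split(".")
    node = cfg
    for key in parents:
        node = node.setdefault(key, {})  # descend, creating nodes as needed
    node[leaf] = value

cfg = {"data": {"data_dir": "datasets/MGL", "split": "splits_MGL_13loc.json"}}
for ov in ("data.data_dir=datasets/MGL_v2", "data.split=splits_MGL_v2_{scene}.json"):
    apply_override(cfg, ov)
print(cfg["data"]["data_dir"])  # → datasets/MGL_v2
```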


## Evaluation

#### MGL dataset

<details>
<summary>[Click to expand]</summary>

Download the dataset [as described previously](#mapillary-geo-localization-mgl-dataset) and run the evaluation with the pre-trained model:

```bash
python -m maploc.evaluation.mapillary --experiment OrienterNet_MGL model.num_rotations=256
```

@@ -218,17 +269,6 @@ We provide several visualization notebooks:
- [Visualize predictions on the KITTI dataset](./notebooks/visualize_predictions_kitti.ipynb)
- [Visualize sequential predictions](./notebooks/visualize_predictions_sequences.ipynb)


## License

The MGL dataset is made available under the [CC-BY-SA](https://creativecommons.org/licenses/by-sa/4.0/) license following the data available on the Mapillary platform. The model implementation and the pre-trained weights follow a [CC-BY-NC](https://creativecommons.org/licenses/by-nc/2.0/) license. [OpenStreetMap data](https://www.openstreetmap.org/copyright) is licensed under the [Open Data Commons Open Database License](https://opendatacommons.org/licenses/odbl/).
61 changes: 27 additions & 34 deletions maploc/data/kitti/prepare.py
@@ -9,22 +9,21 @@
from tqdm.auto import tqdm

from ... import logger
from ...osm.prepare import OSMDataSource, download_and_prepare_osm
from ...osm.viz import GeoPlotter
from ...utils.geo import BoundaryBox, Projection
from ...utils.io import download_file
from .dataset import KittiDataModule
from .utils import parse_gps_file

split_files = ["test1_files.txt", "test2_files.txt", "train_files.txt"]


def prepare_osm(
data_dir: Path,
osm_source: OSMDataSource,
ppm=2,
tile_margin=512,
):
all_latlon = []
for gps_path in data_dir.glob("2011_*/*/oxts/data/*.txt"):
@@ -34,21 +33,26 @@ def prepare_osm(
all_latlon = np.stack(all_latlon)
projection = Projection.from_points(all_latlon)
all_xy = projection.project(all_latlon)
bbox_tiling = BoundaryBox(all_xy.min(0), all_xy.max(0)) + tile_margin

tiles_path = data_dir / KittiDataModule.default_cfg["tiles_filename"]
osm_path = data_dir / "karlsruhe.osm"
tile_manager = download_and_prepare_osm(
osm_source,
"kitti",
tiles_path,
bbox_tiling,
projection,
osm_path,
ppm=ppm,
)

plotter = GeoPlotter()
plotter.points(all_latlon, "red", name="GPS")

plotter.bbox(
projection.unproject(tile_manager.bbox), "black", "tiling bounding box"
)
plotter.fig.write_html(data_dir / "viz_kitti.html")


def download(data_dir: Path):
@@ -99,24 +103,13 @@
"--data_dir", type=Path, default=Path(KittiDataModule.default_cfg["data_dir"])
)
parser.add_argument("--pixel_per_meter", type=int, default=2)
parser.add_argument(
"--osm_source",
default=OSMDataSource.PRECOMPUTED.name,
choices=[e.name for e in OSMDataSource],
)
args = parser.parse_args()

args.data_dir.mkdir(exist_ok=True, parents=True)
download(args.data_dir)

(args.data_dir / ".downloaded").touch()
prepare_osm(args.data_dir, OSMDataSource[args.osm_source], ppm=args.pixel_per_meter)
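The script above marks a completed download with an empty `.downloaded` file so that reruns can skip the expensive fetch. A self-contained sketch of that marker pattern — `ensure_downloaded` and `fetch` are illustrative names, not functions from the repository:

```python
import tempfile
from pathlib import Path

def ensure_downloaded(data_dir: Path, fetch) -> bool:
    """Run `fetch` once, then leave a marker so later calls are no-ops."""
    marker = data_dir / ".downloaded"
    if marker.exists():
        return False  # already downloaded, nothing to do
    data_dir.mkdir(exist_ok=True, parents=True)
    fetch(data_dir)
    marker.touch()  # same idea as `(args.data_dir / ".downloaded").touch()`
    return True

calls = []
with tempfile.TemporaryDirectory() as tmp:
    d = Path(tmp) / "kitti"
    ensure_downloaded(d, calls.append)  # performs the (fake) download
    ensure_downloaded(d, calls.append)  # skipped: marker already exists
print(len(calls))  # → 1
```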
154 changes: 154 additions & 0 deletions maploc/data/mapillary/config.py
@@ -0,0 +1,154 @@
from omegaconf import OmegaConf

from ...utils.geo import BoundaryBox

location_to_params = {
"sanfrancisco_soma": {
"bbox": BoundaryBox((37.770364, -122.410307), (37.795545, -122.388772)),
"bbox_val": BoundaryBox(
(37.788123419925945, -122.40053535863909),
(37.78897443253716, -122.3994618718349),
),
"filters": {"model": "GoPro Max"},
"osm_file": "sanfrancisco.osm",
},
"sanfrancisco_hayes": {
"bbox": BoundaryBox((37.768634, -122.438415), (37.783894, -122.410605)),
"bbox_val": BoundaryBox(
(37.77682908567614, -122.42439593370665),
(37.7776996640339, -122.42329849537967),
),
"filters": {"model": "GoPro Max"},
"osm_file": "sanfrancisco.osm",
},
"montrouge": {
"bbox": BoundaryBox((48.80874, 2.298958), (48.825276, 2.332989)),
"bbox_val": BoundaryBox(
(48.81554465300679, 2.315590378986898),
(48.816228935240346, 2.3166087395920103),
),
"filters": {"model": "LG-R105"},
"osm_file": "paris.osm",
},
"amsterdam": {
"bbox": BoundaryBox((52.340679, 4.845284), (52.386299, 4.926147)),
"bbox_val": BoundaryBox(
(52.358275965541495, 4.876867175817335),
(52.35920971624303, 4.878370977965195),
),
"filters": {"model": "GoPro Max"},
"osm_file": "amsterdam.osm",
},
"lemans": {
"bbox": BoundaryBox((47.995125, 0.185752), (48.014209, 0.224088)),
"bbox_val": BoundaryBox(
(48.00468200256593, 0.20130905922712253),
(48.00555356009431, 0.20251886369476968),
),
"filters": {"creator_username": "sogefi"},
"osm_file": "lemans.osm",
},
"berlin": {
"bbox": BoundaryBox((52.459656, 13.416271), (52.499195, 13.469829)),
"bbox_val": BoundaryBox(
(52.47478263625299, 13.436060761632277),
(52.47610554128314, 13.438407628895831),
),
"filters": {"is_pano": True},
"osm_file": "berlin.osm",
},
"nantes": {
"bbox": BoundaryBox((47.198289, -1.585839), (47.236161, -1.51318)),
"bbox_val": BoundaryBox(
(47.212224982547106, -1.555772859366718),
(47.213374064189956, -1.554270622470525),
),
"filters": {"is_pano": True},
"osm_file": "nantes.osm",
},
"toulouse": {
"bbox": BoundaryBox((43.591434, 1.429457), (43.61343, 1.456653)),
"bbox_val": BoundaryBox(
(43.60314813839066, 1.4431497839062253),
(43.604433961018984, 1.4448508228862122),
),
"filters": {"is_pano": True},
"osm_file": "toulouse.osm",
},
"vilnius": {
"bbox": BoundaryBox((54.672956, 25.258633), (54.696755, 25.296094)),
"bbox_val": BoundaryBox(
(54.68292611300143, 25.276979025529165),
(54.68349008447563, 25.27798847871685),
),
"filters": {"is_pano": True},
"osm_file": "vilnius.osm",
},
"helsinki": {
"bbox": BoundaryBox(
(60.1449128318, 24.8975480117), (60.1770977471, 24.9816543235)
),
"bbox_val": BoundaryBox(
(60.163825618884275, 24.930182541064955),
(60.16518598734065, 24.93274647451007),
),
"filters": {"is_pano": True},
"osm_file": "helsinki.osm",
},
"milan": {
"bbox": BoundaryBox(
(45.4810977947, 9.1732723899), (45.5284238563, 9.2255987917)
),
"bbox_val": BoundaryBox(
(45.502686834500466, 9.189078329923374),
(45.50329294217317, 9.189881944589828),
),
"filters": {"is_pano": True},
"osm_file": "milan.osm",
},
"avignon": {
"bbox": BoundaryBox(
(43.9416178156, 4.7887045302), (43.9584848909, 4.8227015622)
),
"bbox_val": BoundaryBox(
(43.94768786305171, 4.809099008430249),
(43.94827840894793, 4.809954737764413),
),
"filters": {"is_pano": True},
"osm_file": "avignon.osm",
},
"paris": {
"bbox": BoundaryBox((48.833827, 2.306823), (48.889335, 2.39067)),
"bbox_val": BoundaryBox(
(48.85558288211851, 2.3427920762801526),
(48.85703370256603, 2.3449544861818654),
),
"filters": {"is_pano": True},
"osm_file": "paris.osm",
},
# Add any new region/city here:
# "location_name": {
# "bbox": BoundaryBox((lat_min, long_min), (lat_max, long_max)),
# "filters": {"is_pano": True},
# # or other filters like creator_username, model, etc.
# # all described at https://www.mapillary.com/developer/api-documentation#image
# }
# Other fields (bbox_val, osm_file) will be deduced automatically.
}

default_cfg = OmegaConf.create(
{
"downsampling_resolution_meters": 3,
"target_num_val_images": 100,
"val_train_margin_meters": 25,
"max_num_train_images": 50_000,
"max_image_size": 512,
"do_legacy_pano_offset": True,
"min_dist_between_keyframes": 4,
"tiling": {
"tile_size": 128,
"margin": 128,
"ppm": 2,
},
}
)
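Among these defaults, `min_dist_between_keyframes` suggests distance-based subsampling of image sequences. A plausible sketch of such a selection is a greedy threshold on inter-frame distance — an assumption for illustration; the actual maploc implementation may differ:

```python
import numpy as np

def select_keyframes(xy: np.ndarray, min_dist: float = 4.0) -> list:
    """Greedily keep frames at least `min_dist` meters from the last kept one."""
    keep = [0]
    for i in range(1, len(xy)):
        if np.linalg.norm(xy[i] - xy[keep[-1]]) >= min_dist:
            keep.append(i)
    return keep

# Positions in meters along a straight line, one frame every 2 m.
xy = np.stack([np.arange(0, 20, 2), np.zeros(10)], axis=1)
print(select_keyframes(xy))  # → [0, 2, 4, 6, 8]
```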
28 changes: 19 additions & 9 deletions maploc/data/mapillary/dataset.py
@@ -253,18 +263,28 @@ def parse_splits(self, split_arg, names):
"val": [n for n in names if n[0] in scenes_val],
}
elif isinstance(split_arg, str):
if (self.root / split_arg).exists():
# Common split file.
with (self.root / split_arg).open("r") as fp:
splits = json.load(fp)
else:
# Per-scene split file.
splits = defaultdict(dict)
for scene in self.cfg.scenes:
with (self.root / split_arg.format(scene=scene)).open("r") as fp:
scene_splits = json.load(fp)
for split_name in scene_splits:
splits[split_name][scene] = scene_splits[split_name]
splits = {
split_name: {scene: set(ids) for scene, ids in split.items()}
for split_name, split in splits.items()
}
self.splits = {}
for split_name, split in splits.items():
self.splits[split_name] = [
(scene, *arg, name)
for scene, *arg, name in names
if scene in split and int(name.rsplit("_", 1)[0]) in split[scene]
]
else:
raise ValueError(split_arg)
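The per-scene filtering in `parse_splits` can be illustrated on toy data — the scenes and ids below are made up, but the comprehension mirrors the one in the code:

```python
# Toy illustration of the split filtering: names are
# (scene, ..., "frameid_camid") tuples, and a name is kept when its scene
# appears in the split and its integer frame id is in that scene's id set.
split = {"berlin": {10, 11}, "paris": {7}}
names = [
    ("berlin", "seq0", "10_0"),
    ("berlin", "seq0", "12_0"),  # frame 12 missing from the berlin set
    ("paris", "seq1", "7_2"),
    ("helsinki", "seq2", "7_0"),  # scene missing from the split
]
kept = [
    (scene, *rest, name)
    for scene, *rest, name in names
    if scene in split and int(name.rsplit("_", 1)[0]) in split[scene]
]
print(kept)  # → [('berlin', 'seq0', '10_0'), ('paris', 'seq1', '7_2')]
```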