QUASAR: QUAntile-based neural network reconstruction for near-Surface Atmospheric vaRiables
A traditional geostatistical spatial reconstruction problem consists of inferring the value of a variable at an unobserved location from observations at neighboring locations. Due to the inherent complexity of the task, including the filtering properties of the predictor and the uncertainties associated with the observations, it is often desirable to estimate not only a deterministic prediction but also its associated uncertainty.
This repository explores the problem using the flexibility of machine learning models. Specifically, given the 20 nearest neighboring observations of a geophysical field (e.g., temperature, precipitation, or other meteorological variables), a multilayer perceptron (MLP) is trained to reconstruct the full probability distribution of the field at a target location. The model outputs a discretized quantile function, that is, the predicted values at a fixed set of quantile levels, which provides both a prediction and an estimate of its uncertainty.
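To make the idea of a discretized quantile function concrete, the sketch below shows how a point prediction and an uncertainty interval could be read from such an output, and how the output could be scored against an observation. It is a minimal illustration, not the repository's code: the function names `quantile_at` and `pinball_loss`, the quantile levels, and the temperature values are all hypothetical, and the pinball (quantile) loss is used here only because it is a common choice for quantile regression. The loss actually used by QUASAR is not stated in this README.

```python
from bisect import bisect_left


def quantile_at(levels, values, p):
    """Linearly interpolate a discretized quantile function at level p.

    `levels` must be sorted in increasing order and `values` holds the
    predicted quantile for each level (both hypothetical model outputs).
    """
    if not levels[0] <= p <= levels[-1]:
        raise ValueError("p lies outside the discretized levels")
    i = bisect_left(levels, p)
    if levels[i] == p:
        return values[i]
    # Weight of the upper node between levels[i - 1] and levels[i].
    w = (p - levels[i - 1]) / (levels[i] - levels[i - 1])
    return values[i - 1] + w * (values[i] - values[i - 1])


def pinball_loss(levels, values, observed):
    """Mean pinball (quantile) loss of the predicted quantiles for one observation."""
    total = 0.0
    for tau, q in zip(levels, values):
        diff = observed - q
        # Under-prediction is weighted by tau, over-prediction by (1 - tau).
        total += max(tau * diff, (tau - 1.0) * diff)
    return total / len(levels)


# Hypothetical output for one target location (temperature in degrees Celsius).
levels = [0.05, 0.25, 0.50, 0.75, 0.95]
values = [11.0, 12.5, 13.4, 14.1, 15.6]

median = quantile_at(levels, values, 0.50)           # point prediction
interval_90 = (quantile_at(levels, values, 0.05),    # lower bound
               quantile_at(levels, values, 0.95))    # upper bound
loss = pinball_loss(levels, values, observed=13.4)
```

Here the median serves as the deterministic prediction and the spread between the 5% and 95% quantiles as the uncertainty estimate; a wider interval indicates a less certain reconstruction.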
The model and training settings are defined in a configuration file.
See example_config.toml for the available settings.
To train a model, run:
uv run quasar training config.toml
A new experiment directory will be created in the configured output location. This directory contains the trained model checkpoints, the best-performing model, and training metadata.
Suppose that the experiment directory created during training is located at output/XXXX.
To run inference on the test dataset, execute:
uv run quasar inference output/XXXX/config.toml
The configuration file stored in the experiment directory ensures that inference is performed using the same settings employed during training.
To perform hyperparameter optimization using Optuna's Tree-structured Parzen Estimator (TPE) sampler, run:
uv run quasar optimize config.toml
This command launches an Optuna study and stores the optimization results in the configured output directory.