11 changes: 8 additions & 3 deletions .gitignore
@@ -189,19 +189,24 @@ output/
# Pykan model state
model/

-# config classes
+# config classes (track templates and hydra settings)
config/
!config/example_config.yaml
!config/templates/
!config/hydra/

# eval example data
examples/eval/*.zarr

# scratch notebooks
examples/scratch

-# claude
-Claude.md
+# claude (keep .claude/ private, commit project-level CLAUDE.md)
+.claude/

# built docs
site/

# misc
.qodo/
.ruff_cache/
170 changes: 170 additions & 0 deletions CLAUDE.md
@@ -0,0 +1,170 @@
# DDR — Distributed Differentiable Routing

AI-agent context file. Committed to version control so every coding assistant
(Claude, Copilot, Cursor, etc.) gets the same codebase orientation.

---

## Architecture

DDR couples a **Kolmogorov-Arnold Network (KAN)** with **differentiable
Muskingum-Cunge (MC) routing** to learn spatially varying river-routing
parameters end-to-end via PyTorch autograd.

1. **KAN** ingests catchment attributes and predicts three spatial parameters
per reach: Manning's *n*, *q_spatial*, and *p_spatial*.
2. **Leopold & Maddock power law** converts depth to channel geometry:
   `top_width = p_spatial * depth^(q_spatial + 1e-6)` — see the sketch after this list.
3. **MC routing** solves the linearized Saint-Venant equations on a trapezoidal
channel cross-section using a sparse-matrix solve (CSR format).
4. Gradients flow from the loss (KGE, NSE, etc.) back through the routing
physics into the KAN weights.
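
A minimal sketch of the geometry step (item 2), assuming plain tensors — the function name and tensor shapes are illustrative, not the actual `mmc.py` API:

```python
import torch

def top_width(depth: torch.Tensor, p_spatial: torch.Tensor, q_spatial: torch.Tensor) -> torch.Tensor:
    """Leopold & Maddock power law: width = p * depth^(q + 1e-6)."""
    # The 1e-6 offset keeps the exponent strictly positive as q_spatial -> 0.
    return p_spatial * depth ** (q_spatial + 1e-6)

# Three reaches with KAN-predicted parameters (illustrative values).
depth = torch.tensor([0.5, 1.2, 2.0], requires_grad=True)
p = torch.tensor([3.0, 4.5, 2.8])
q = torch.tensor([0.4, 0.35, 0.5])

width = top_width(depth, p, q)
width.sum().backward()  # gradients flow back through the power law (item 4)
```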

---

## Module Map

| Path | Role |
|---|---|
| `src/ddr/routing/mmc.py` | Core `MuskingumCunge` engine — sparse matrix solve, trapezoid velocity, parameter denormalization |
| `src/ddr/routing/torch_mc.py` | PyTorch `nn.Module` wrapper (`dmc` class) |
| `src/ddr/nn/kan.py` | KAN neural network for spatial parameter prediction |
| `src/ddr/io/readers.py` | `StreamflowReader` for loading lateral inflows |
| `src/ddr/io/functions.py` | Utility functions (downsampling, etc.) |
| `src/ddr/validation/configs.py` | Pydantic config models (`Config`, `DataSources`, `Params`, `Kan`, `ExperimentConfig`) |
| `src/ddr/validation/enums.py` | `GeoDataset` and `Mode` enums |
| `src/ddr/validation/metrics.py` | Evaluation metrics |
| `src/ddr/validation/plots.py` | Plotting utilities |
| `src/ddr/geodatazoo/` | Dataset abstraction layer — `BaseGeoDataset`, `Merit`, `LynkerHydrofabric` |
| `src/ddr/scripts_utils.py` | Shared helpers used by the scripts below |

## Public API (`src/ddr/__init__.py`)

```python
from .routing.torch_mc import dmc # Differentiable routing model
from .nn import kan # KAN neural network
from .io.readers import StreamflowReader as streamflow # Data reader
from .io import functions as ddr_functions # Utilities
from . import validation # Config, Metrics, plotting
```
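
The top-level names are thin aliases, so downstream code can import from either place. A quick, runnable check of that equivalence:

```python
import ddr
from ddr.routing.torch_mc import dmc

assert ddr.dmc is dmc  # same object, re-exported at the package root
```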

---

## Config Flow

```
Hydra YAML (config/) → OmegaConf DictConfig → validate_config() → Pydantic Config
```

- Hydra parses YAML from the `config/` directory.
- `validate_config()` in `src/ddr/validation/configs.py` converts the
`DictConfig` to a typed Pydantic model.
- Config supports `${oc.env:VAR_NAME,default}` interpolation for portable
paths across machines.
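
A sketch of an entry point that follows this flow — `hydra.main` and `OmegaConf.resolve` are real APIs, while the decorator arguments below are illustrative:

```python
import hydra
from omegaconf import DictConfig, OmegaConf

from ddr.validation.configs import validate_config  # see the module map above

@hydra.main(version_base=None, config_path="../config", config_name="example_config")
def main(cfg: DictConfig) -> None:
    OmegaConf.resolve(cfg)  # resolves ${oc.env:VAR_NAME,default} interpolations
    config = validate_config(cfg)  # DictConfig -> typed Pydantic Config
    print(config)

if __name__ == "__main__":
    main()
```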

---

## Scripts

| Script | Purpose |
|---|---|
| `scripts/train.py` | Training loop (`python scripts/train.py --config-name=<config>`) |
| `scripts/test.py` | Evaluation |
| `scripts/train_and_test.py` | Combined train + test |
| `scripts/router.py` | Forward routing with a trained model |
| `scripts/summed_q_prime.py` | Baseline — unrouted sum of lateral inflows |

---

## Downstream Call-Site Checklist

When modifying `src/ddr/` interfaces (constructor signatures, `forward()`
return types, config fields), **always check and update these downstream
consumers**:

1. **`examples/`** — Example notebooks that instantiate `kan()`, `dmc()`, and
load configs.
2. **`benchmarks/scripts/benchmark.py`** and
**`benchmarks/src/ddr_benchmarks/benchmark.py`** — Own `kan()`/`dmc()`
instantiation and evaluation loops that must stay in sync with the core
scripts.
3. **`scripts/`** — All training/testing scripts.
4. **`config/`** — YAML files may reference field names that changed.

Quick grep to find all `kan()` constructor call sites:

```bash
grep -r "kan(" examples/ benchmarks/ scripts/
```

---

## Testing

```bash
uv run pytest # Unit tests (no data dependencies)
uv run pytest -m integration # Integration tests (requires HPC data)
```

- Unit tests live in `tests/`.
- Integration tests are marked with `@pytest.mark.integration` and deselected
by default (`addopts = "-m 'not integration'"`).
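
For reference, what the marker split looks like in a test file (the test bodies are hypothetical; the marker and deselect behavior come from the pytest config above):

```python
import pytest

def test_unit_example() -> None:
    """Runs in a plain `uv run pytest` invocation."""
    assert 1 + 1 == 2

@pytest.mark.integration
def test_integration_example() -> None:
    """Deselected by default; opt in with `uv run pytest -m integration`."""
    assert True
```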

---

## Code Quality

| Tool | Config |
|---|---|
| **Linter** | ruff — rules: F, E, W, I, D, B, Q, TID, C4, BLE, UP, RUF100 |
| **Formatter** | ruff format |
| **Type checker** | mypy (strict: `disallow_untyped_defs = true`) |
| **Docstrings** | NumPy convention (`tool.ruff.lint.pydocstyle`) |
| **Line length** | 110 |
| **Pre-commit** | ruff check+format, mypy, nbstripout, trailing-whitespace, end-of-file-fixer, check-yaml |

All config lives in `pyproject.toml`. Pre-commit hooks are defined in
`.pre-commit-config.yaml`.
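
For reference, the NumPy docstring shape the pydocstyle rules expect — an illustrative function, not taken from the codebase:

```python
def denormalize(value: float, bounds: tuple[float, float]) -> float:
    """Map a normalized value back to its physical range.

    Parameters
    ----------
    value : float
        Normalized value in [0, 1].
    bounds : tuple[float, float]
        Lower and upper physical bounds.

    Returns
    -------
    float
        The denormalized value.
    """
    lower, upper = bounds
    return lower + value * (upper - lower)
```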

---

## Datasets (GeoDataset enum)

Two supported geodatasets (see `src/ddr/validation/enums.py`):

| Enum value | Dataset | Attributes | Geometry |
|---|---|---|---|
| `merit` | MERIT-Hydro global river network | `.nc` | `.shp` |
| `lynker_hydrofabric` | Lynker Hydrofabric v2.2 (CONUS) | icechunk store | `.gpkg` |

Key attribute-name difference: MERIT exposes upstream area as `log10_uparea`,
while Lynker uses `log_uparea`.
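
A hedged sketch of dispatching on the enum — `get_dataset_class()` is referenced in CONTRIBUTING.md, but whether it is a member method or a classmethod is an assumption here:

```python
from ddr.validation.enums import GeoDataset

kind = GeoDataset("merit")  # or GeoDataset("lynker_hydrofabric")
dataset_cls = kind.get_dataset_class()  # assumed to return Merit / LynkerHydrofabric
print(kind, dataset_cls)
```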

---

## Workspace (monorepo)

Three packages managed by `uv`:

| Package | Directory | Description |
|---|---|---|
| `ddr` | `.` (root) | Core routing library |
| `ddr-engine` | `engine/` | Geospatial data preparation |
| `ddr-benchmarks` | `benchmarks/` | Comparison framework (vs DiffRoute, etc.) |

Install everything:

```bash
uv sync --all-packages
```

---

## Key Conventions

- Python **>=3.11, <3.14**.
- PyTorch with CUDA 13.0 index by default (configurable in `pyproject.toml`).
- Sparse CSR tensors are used for the routing matrix solve — expect beta
warnings from PyTorch (already suppressed in pytest config).
- `__init__.py` files use `F401` ignore so re-exports don't trigger unused-import
lint errors.
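
For orientation, a toy sparse CSR system of the kind the routing solve manipulates — the matrix content is made up and unrelated to the real routing matrix:

```python
import torch

# 3x3 lower-bidiagonal system in CSR form: row pointers, column indices, values.
crow_indices = torch.tensor([0, 1, 3, 5])
col_indices = torch.tensor([0, 0, 1, 1, 2])
values = torch.tensor([1.0, -0.5, 1.0, -0.5, 1.0])
A = torch.sparse_csr_tensor(crow_indices, col_indices, values, size=(3, 3))

# Constructing/using CSR tensors emits PyTorch's "beta state" UserWarning,
# which the pytest config suppresses.
x = torch.tensor([[1.0], [2.0], [3.0]])
print(A @ x)  # sparse-dense matmul is supported for CSR
```
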
66 changes: 66 additions & 0 deletions CONTRIBUTING.md
@@ -0,0 +1,66 @@
# Contributing to DDR

## Development Setup

```bash
git clone https://github.com/DeepGroundwater/ddr.git
cd ddr
uv sync --all-packages
pre-commit install
```

## Code Style

- **Formatter/Linter:** ruff (line length 110)
- **Type checking:** mypy (strict mode)
- **Docstrings:** NumPy convention
- Pre-commit hooks enforce all of the above on every commit.

## Running Tests

```bash
uv run pytest # Unit tests
uv run pytest -m integration # Integration tests (requires local data)
uv run pytest tests/routing/ -v # Run specific test directory
```

## Interface Change Checklist

When modifying `src/ddr/` interfaces (constructor signatures, `forward()` return types, config fields), check these downstream consumers:

1. **`scripts/`** — `train.py`, `test.py`, `train_and_test.py`, `router.py` instantiate `kan()` and `dmc()`
2. **`examples/`** — Notebooks load configs and instantiate models
3. **`benchmarks/`** — `benchmarks/scripts/benchmark.py` and `benchmarks/src/ddr_benchmarks/benchmark.py` have their own model instantiation
4. **`config/`** — YAML files may reference renamed or removed fields

Quick check: `grep -r "kan(" examples/ benchmarks/ scripts/` to find all call sites.

## Pull Request Process

1. Create a feature branch from `master`
2. Make your changes with tests
3. Ensure CI passes: `uv run pytest && uv run ruff check . && uv run mypy src/`
4. Open a PR with a clear description of what and why

## Adding a New GeoDataset

1. Add enum value to `src/ddr/validation/enums.py` (`GeoDataset`)
2. Create dataset class in `src/ddr/geodatazoo/` extending `BaseGeoDataset` — see the sketch after this list
3. Register in `GeoDataset.get_dataset_class()`
4. Add config template in `config/templates/`
5. Add tests in `tests/geodatazoo/`
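
A skeleton for steps 1–3 — the import path and hook names are placeholders, since `BaseGeoDataset`'s abstract interface isn't documented here:

```python
from ddr.geodatazoo import BaseGeoDataset  # import path assumed from the module map

class MyDataset(BaseGeoDataset):
    """Hypothetical dataset backing a new `my_dataset` enum value."""

    def load_attributes(self):  # placeholder hook — check BaseGeoDataset for real names
        raise NotImplementedError

    def load_geometry(self):  # placeholder hook
        raise NotImplementedError
```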

## Project Structure

```
ddr/
├── src/ddr/ # Core library
├── engine/ # Data preparation (ddr-engine package)
├── benchmarks/ # Comparison framework (ddr-benchmarks package)
├── scripts/ # Training/testing entry points
├── config/ # Hydra YAML configs
│ └── templates/ # Portable config templates (version controlled)
├── examples/ # Example notebooks + trained weights
├── tests/ # Test suite
└── docs/ # Zensical documentation
```
63 changes: 42 additions & 21 deletions README.md
@@ -8,9 +8,9 @@ An implementation of differentiable river routing methods for the NextGen Framew

[![Ruff](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/ruff/main/assets/badge/v2.json)](https://github.com/astral-sh/ruff)

-### Dependencies
+### Installation

-The following commands will allow you to install all required dependencies for DDR
+DDR uses [uv](https://docs.astral.sh/uv/) for dependency management. From the repo root:

```sh
# Full workspace — CPU (includes ddr, ddr-engine, and ddr-benchmarks)
@@ -26,39 +26,60 @@
uv sync --all-packages --group cuda12
uv sync --all-packages --group cuda13
```

This installs the `ddr` CLI entry point. Verify with `ddr --help`.

### Quick Start

1. **Copy a config template** from `config/templates/` and customize paths:
```sh
cp config/templates/merit_training.yaml config/my_training.yaml
# Edit config/my_training.yaml — set data_sources paths to your data
```

Templates use `${oc.env:DDR_DATA_DIR,./data}` so you can set the
`DDR_DATA_DIR` environment variable or edit the paths directly.

2. **Train a model** using the `ddr` CLI:
```sh
ddr train --config-name=my_training
```

3. **Evaluate** the trained model:
```sh
ddr test --config-name=my_testing
```

Available subcommands: `train`, `test`, `route`, `train-and-test`, `summed-q-prime`.

You can also call scripts directly if preferred:
```sh
uv run python scripts/train.py --config-name=my_training
```

### Data Engine

-Next, you need to create the necessary data files for running a routing across your domain.
-- The example below is for the NOAA-OWP Hydrofabric v2.2 (Dataset is not included in the repo)
-- This requires the `ddr-engine` local package to be installed (which is done automatically through the above `uv sync`)
+Before training, you need to create adjacency matrices for your domain:
+- Requires the `ddr-engine` local package (installed automatically by `uv sync --all-packages`)
+- The gauges.csv can be found [here](https://github.com/DeepGroundwater/references/tree/master/mhpi/dHBV2.0UH)

```sh
uv run python engine/scripts/build_hydrofabric_v2.2_matrices.py <PATH/TO/conus_nextgen.gpkg> data/ --gages references/mhpi/dHBV2.0UH/training_gauges.csv
```

-This will create two files used for routing
-- `hydrofabric_v2.2_conus_adjacency.zarr`
-  - a sparse COO matrix containing the whole river network for Hydrofabric v2.2 across CONUS
-- `hydrofabric_v2.2_gages_conus_adjacency.zarr`
-  - a zarr.Group of sparse coo matrices for river networks upstream of USGS Gauges
+This creates two files used for routing:
+- `hydrofabric_v2.2_conus_adjacency.zarr` — sparse COO matrix of the full CONUS river network
+- `hydrofabric_v2.2_gages_conus_adjacency.zarr` — zarr.Group of sparse COO matrices for networks upstream of USGS gauges
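
To sanity-check the outputs you can open the stores with `zarr` — the internal layout (e.g., one subgroup of COO indices/values per gauge) is an assumption, not documented here:

```python
import zarr

gauges = zarr.open_group("data/hydrofabric_v2.2_gages_conus_adjacency.zarr", mode="r")
print(list(gauges.keys()))  # hypothetical layout: one subgroup per USGS gauge
```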

-### Model Train
+### Pre-trained Examples

-All that's left is to train a routing model
-```sh
-# Train a model using the MHPI S3 defaults
-python scripts/train.py --config-name example_config.yaml
-
-#Test the model
-python scripts/test.py --config-name example_test_config.yaml
-```
+The `examples/` directory contains pre-trained weights and notebooks for both
+MERIT and Lynker Hydrofabric datasets. See [`examples/README.md`](examples/README.md).
-### How to build docs locally
+### Documentation

-The zensical documentation can be built/verified locally through installing the optional `docs` dependencies and serving through localhost:
+Build and serve the docs locally:

```sh
uv pip install -e ".[docs]"
uv run zensical serve
```
