Revert to v0.5.1 baseline with Leopold & Maddock physics #166

Merged
taddyb merged 4 commits into DeepGroundwater:master from taddyb:revert-to-v0.5.1
Mar 14, 2026

taddyb commented Mar 14, 2026

Summary

  • Revert to v0.5.1 baseline: removes phi_kan bias correction, SWOT geometry, temporal KAN, forcing readers, and staged training
  • Leopold & Maddock power law: derives top_width and side_slope per-timestep from p_spatial * depth^q_spatial instead of learning them independently. Adds _apply_data_override() for Lynker/partial-coverage blending
  • p_spatial replaces top_width/side_slope as a learnable KAN output (width coefficient, log-space [1, 200])
  • In-place autograd fix: scatter_add followed by clamp no longer double-writes to the output tensor
  • MSE → MAE as default training loss
  • CUDA memory cleanup: explicit del + empty_cache() after each mini-batch
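
To make the geometry change concrete, here is a minimal sketch of the power-law width and the data override, not the code in this PR. The helper names `denormalize_log_space`, `power_law_top_width`, and `override_where_available` are hypothetical. The sketch assumes the network emits values in [0, 1] that are mapped to [1, 200] in log space, and that missing observations are marked with NaN. The side_slope derivation is left out because its formula is not stated in this description.

```python
import math
import torch


def denormalize_log_space(x, lower, upper):
    """Map x in [0, 1] onto [lower, upper], spaced uniformly in log space (assumed mapping)."""
    log_lower, log_upper = math.log(lower), math.log(upper)
    return torch.exp(log_lower + x * (log_upper - log_lower))


def power_law_top_width(p_spatial, q_spatial, depth):
    """Power-law width: top_width = p_spatial * depth ** q_spatial."""
    return p_spatial * torch.pow(depth, q_spatial)


def override_where_available(derived, observed):
    """Hypothetical blend: use observed values where present (non-NaN), derived values elsewhere."""
    return torch.where(torch.isnan(observed), derived, observed)


# Three reaches: raw network outputs in [0, 1] and the depth at one timestep.
raw_p = torch.tensor([0.0, 0.5, 1.0])
p_spatial = denormalize_log_space(raw_p, 1.0, 200.0)  # width coefficient in [1, 200]
q_spatial = torch.tensor([0.5, 0.5, 0.5])
depth = torch.tensor([4.0, 4.0, 4.0])

derived_width = power_law_top_width(p_spatial, q_spatial, depth)

# Partial coverage: only the second reach has a measured width.
observed_width = torch.tensor([float("nan"), 30.0, float("nan")])
top_width = override_where_available(derived_width, observed_width)
```

Because the width is recomputed from depth, it changes at every timestep, while only `p_spatial` (and `q_spatial`) need to be learned per reach.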

Test plan

  • All 429 tests pass (pytest tests/ --ignore=tests/references)
  • Run scripts/train_and_test.py with merit_training_config.yaml (MERIT path: power-law derived geometry)
  • Run scripts/train_and_test.py with lynker_train_and_test_config.yaml (Lynker path: data override)
  • Verify gradient flow through p_spatial → routing → loss
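
One way the gradient-flow item could be checked is sketched below. Everything here is a toy stand-in: `spatial_nn` is a small hypothetical network, and the "routing" step is a placeholder multiplication rather than the DDR solver. The point is only the pattern: retain the gradient on the intermediate `p_spatial` tensor, run backward from an L1 loss, and confirm that finite gradients reach both `p_spatial` and the network parameters.

```python
import torch

torch.manual_seed(0)

# Hypothetical stand-in for the spatial network: attributes -> value in (0, 1).
spatial_nn = torch.nn.Sequential(torch.nn.Linear(3, 1), torch.nn.Sigmoid())
attributes = torch.rand(5, 3)
raw = spatial_nn(attributes).squeeze(-1)

# Assumed log-space mapping onto [1, 200]; exp(raw * log(200)) is 1 at raw=0 and 200 at raw=1.
p_spatial = torch.exp(raw * torch.log(torch.tensor(200.0)))
p_spatial.retain_grad()  # keep the gradient on this non-leaf tensor for inspection

# Placeholder "routing": a width-scaled signal. This is NOT the DDR routing scheme.
depth = torch.full((5,), 2.0)
q_spatial = torch.full((5,), 0.4)
top_width = p_spatial * depth.pow(q_spatial)
simulated = 0.1 * top_width
observed = torch.ones(5)

loss = torch.nn.functional.l1_loss(simulated, observed)
loss.backward()

# Gradients must exist and be finite on p_spatial and on every network parameter.
p_grad_ok = p_spatial.grad is not None and bool(torch.isfinite(p_spatial.grad).all())
param_grads_ok = all(
    p.grad is not None and bool(torch.isfinite(p.grad).all())
    for p in spatial_nn.parameters()
)
grads_ok = p_grad_ok and param_grads_ok
```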

🤖 Generated with Claude Code

taddyb commented Mar 14, 2026

Results from the MERIT train/test run:

[2026-03-14 06:27:34,987][__main__][INFO] - Evaluating with checkpoint: _ddr-v0.5.2.dev2+g21a3a96b5-merit-training_epoch_5_mb_35.pt
[2026-03-14 06:27:35,047][ddr.io.readers][INFO] - Reading icechunk store from local disk: /home/tbindas/projects/ddr/data/merit_dhbv2_UH_retrospective
[2026-03-14 06:27:35,259][ddr.io.statistics][INFO] - Reading Attribute Statistics from file: merit_attribute_statistics_merit_global_attributes_v2.nc.json
[2026-03-14 06:27:38,155][ddr.io.readers][INFO] - Reading icechunk store from local disk: /mnt/ssd1/data/icechunk/usgs_daily_observations
[2026-03-14 06:27:38,190][ddr.geodatazoo.merit][INFO] - Filtered 352 gages with DA_VALID=False
[2026-03-14 06:27:40,106][ddr.geodatazoo.merit][INFO] - Filtered 494 headwater gages with no upstream connectivity
[2026-03-14 06:27:40,106][ddr.geodatazoo.merit][INFO] - Gages mode: 2365 gauged locations
[2026-03-14 06:27:44,530][ddr.geodatazoo.merit][INFO] - Created gages adjacency matrix of shape: torch.Size([64892, 64892])
[2026-03-14 06:27:44,533][ddr.scripts_utils][INFO] - Loading spatial_nn from checkpoint: _ddr-v0.5.2.dev2+g21a3a96b5-merit-training_epoch_5_mb_35
[2026-03-14 06:34:19,019][ddr.validation.utils][INFO] - ----------------------------------------
[2026-03-14 06:34:19,019][ddr.validation.utils][INFO] - Metric     |         Mean |       Median
[2026-03-14 06:34:19,019][ddr.validation.utils][INFO] - ----------------------------------------
[2026-03-14 06:34:19,019][ddr.validation.utils][INFO] - NSE        |       0.5738 |       0.7332
[2026-03-14 06:34:19,019][ddr.validation.utils][INFO] - RMSE       |      17.3540 |       8.0274
[2026-03-14 06:34:19,019][ddr.validation.utils][INFO] - KGE        |       0.6425 |       0.7407
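
For readers unfamiliar with the metrics in the table, the sketch below uses the textbook definitions of NSE and KGE (the 2009 formulation with correlation, variability ratio, and bias ratio). It is an assumption that `ddr.validation` uses exactly these forms. The three gauges are made-up toy series, chosen only to show how one poorly simulated gauge can pull the mean well below the median.

```python
import math
import statistics


def nse(sim, obs):
    """Nash-Sutcliffe efficiency: 1 - SSE / variance of observations about their mean."""
    mean_obs = statistics.fmean(obs)
    num = sum((s - o) ** 2 for s, o in zip(sim, obs))
    den = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - num / den


def kge(sim, obs):
    """Kling-Gupta efficiency from correlation r, variability ratio alpha, bias ratio beta."""
    mean_sim, mean_obs = statistics.fmean(sim), statistics.fmean(obs)
    std_sim, std_obs = statistics.pstdev(sim), statistics.pstdev(obs)
    cov = sum((s - mean_sim) * (o - mean_obs) for s, o in zip(sim, obs)) / len(obs)
    r = cov / (std_sim * std_obs)
    alpha = std_sim / std_obs
    beta = mean_sim / mean_obs
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


# Toy (simulated, observed) pairs for three hypothetical gauges.
obs = [1.0, 2.0, 3.0, 4.0]
gauges = {
    "perfect": ([1.0, 2.0, 3.0, 4.0], obs),
    "decent": ([1.5, 2.5, 2.5, 3.5], obs),
    "reversed": ([4.0, 3.0, 2.0, 1.0], obs),
}

nse_scores = [nse(sim, o) for sim, o in gauges.values()]
mean_nse = statistics.fmean(nse_scores)
median_nse = statistics.median(nse_scores)
```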

…anup, and autograd fix

- Keep l1_loss (MAE) over mse_loss
- Keep CUDA memory cleanup block after each mini-batch
- Apply in-place autograd fix: use intermediate `initial` tensor for scatter_add before clamp

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
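
The three items in this commit can be illustrated together in a minimal sketch. It assumes a simplified shape for the fix: `accumulate_inflow` is a hypothetical helper, not the function changed in the repository. The accumulation goes into an intermediate `initial` tensor through the out-of-place `scatter_add`, `clamp` then returns a new tensor, the loss is `l1_loss`, and per-batch references are dropped before the CUDA cache is released.

```python
import torch


def accumulate_inflow(lateral, downstream_index, n_reaches, lower_bound):
    """Sum lateral inflow per reach, then apply a lower bound, without in-place writes."""
    # Out-of-place scatter_add builds a fresh intermediate tensor.
    initial = torch.zeros(
        n_reaches, dtype=lateral.dtype, device=lateral.device
    ).scatter_add(0, downstream_index, lateral)
    # clamp returns a new tensor, so nothing saved for backward is overwritten.
    return torch.clamp(initial, min=lower_bound)


device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
lateral = torch.tensor([1.0, 2.0, 3.0], device=device, requires_grad=True)
downstream_index = torch.tensor([0, 0, 1], device=device)
target = torch.tensor([2.0, 2.0, 2.0], device=device)

prediction = accumulate_inflow(lateral, downstream_index, 3, 0.5)
loss = torch.nn.functional.l1_loss(prediction, target)  # MAE rather than mse_loss
loss.backward()

# Copy out what is needed for logging before dropping the batch tensors.
loss_value = loss.item()
grad = lateral.grad.detach().cpu()

# Drop references to per-batch tensors, then release cached CUDA blocks if a GPU is in use.
del prediction, loss
if torch.cuda.is_available():
    torch.cuda.empty_cache()
```

Note that `empty_cache()` only returns memory that is no longer referenced, which is why the `del` comes first.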

taddyb commented Mar 14, 2026

Lynker hydrofabric performance

[2026-03-14 13:11:44,633][ddr.validation.utils][INFO] - ----------------------------------------
[2026-03-14 13:11:44,633][ddr.validation.utils][INFO] - Metric     |         Mean |       Median
[2026-03-14 13:11:44,633][ddr.validation.utils][INFO] - ----------------------------------------
[2026-03-14 13:11:44,633][ddr.validation.utils][INFO] - NSE        |       0.5213 |       0.7037
[2026-03-14 13:11:44,633][ddr.validation.utils][INFO] - RMSE       |      16.3834 |       7.0815
[2026-03-14 13:11:44,633][ddr.validation.utils][INFO] - KGE        |       0.6354 |       0.7487
[2026-03-14 13:11:44,633][ddr.validation.utils][INFO] - ----------------------------------------
[2026-03-14 13:11:44,633][__main__][INFO] - Test run complete. Please run examples/eval/evaluate.ipynb to generate performance plots / metrics
[2026-03-14 13:11:44,662][__main__][INFO] - Cleaning up...
[2026-03-14 13:11:44,663][__main__][INFO] - Total time: 84.92 minutes

@taddyb taddyb merged commit b7eaecb into DeepGroundwater:master Mar 14, 2026
4 checks passed
@taddyb taddyb deleted the revert-to-v0.5.1 branch March 14, 2026 19:11