A physics-aware deep learning framework that uses Proximal Policy Optimisation (PPO) to iteratively refine free-water (FW) and intracellular volume fraction (ICVF) estimates from diffusion MRI.
For a detailed explanation of the internal files and the feedback loop, please see ARCHITECTURE.md.
graph TD
%% Execution Layer
subgraph Execution Layer
train["train_rl_model.py"]
eval["validate.py / evaluate.py"]
config["config/default_config.yaml"]
end
%% Data Layer
subgraph Data Layer
dataset["data/dataset.py"]
curriculum["data/noise_curriculum.py"]
noise["data/noise_injection.py"]
end
%% Model Layer
subgraph Model Layer
iter_model["models/iterative_model.py"]
ann["models/ann_prior.py"]
vit["models/vit_backbone.py"]
pde["models/pde_block.py"]
denoiser["models/denoiser.py"]
decoder["models/decoder.py"]
end
%% RL Layer
subgraph RL Layer
agent["rl/rl_agent.py"]
policy["rl/policy.py"]
reward["rl/reward.py"]
buffer["rl/replay_buffer.py"]
end
%% Optimization Layer
subgraph Optimization Layer
loss["losses/losses.py"]
utils["utils/checkpointing.py"]
end
%% Connections
config --> train
config --> eval
train --> dataset
dataset --> curriculum
curriculum --> noise
noise -.->|Noisy Data| iter_model
train --> iter_model
train --> agent
train --> loss
train --> utils
iter_model --> ann
iter_model --> vit
iter_model --> pde
iter_model --> denoiser
iter_model --> decoder
agent --> policy
agent --> buffer
iter_model -.->|Predictions| reward
reward -.->|Reward Signal| agent
policy -.->|Actions| denoiser
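
The RL layer in the diagram is driven by PPO. For readers unfamiliar with it, the following is a minimal sketch of PPO's standard clipped surrogate loss, using a clipping range of 0.2 (the value this README lists for `clip_eps`). The function name `ppo_clipped_loss` and the plain-list inputs are assumptions made for illustration; the actual update in `rl/rl_agent.py` may be organised differently and would normally operate on tensors.

```python
import math


def ppo_clipped_loss(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate loss (a quantity to minimise) over a batch.

    new_logp / old_logp: log-probabilities of the taken actions under the
    current and the data-collecting policy. advantages: advantage estimates.
    All three are plain Python lists of equal length (illustrative only).
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        # Probability ratio between the new and the old policy.
        ratio = math.exp(lp_new - lp_old)
        # Keep the ratio inside [1 - clip_eps, 1 + clip_eps].
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Pessimistic choice: take the smaller of the two objectives.
        total += min(ratio * adv, clipped * adv)
    # Negate because optimisers minimise, while PPO maximises the surrogate.
    return -total / len(advantages)
```

With a positive advantage, a ratio above `1 + clip_eps` stops contributing extra gain, which is what limits the size of each policy update.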
# Required packages
pip install torch torchvision numpy nibabel pyyaml tqdm
pip install timm # optional, for non-constant PDE diffusion coefficients

# 1. Supervised warm-up (no RL, creates stable baseline)
python train_warmup.py
# 2. RL-augmented training (PPO Agent takes control)
python train_rl_model.py --warmup-ckpt output/warmup_best.pth
# 3. Inference (Generate FW/ICVF NIfTI maps)
python test_rl_model.py --model-path output/rl_model_best.pth
# 4. Comprehensive evaluation metrics
python evaluate.py --model-path output/rl_model_best.pth

#!/bin/bash
#SBATCH --job-name=rl_fw_icvf
#SBATCH --gres=gpu:1
#SBATCH --mem=64G
#SBATCH --time=48:00:00
#SBATCH --output=logs/%j.out
module load cuda/11.8
source activate diffusion_env
cd /home/ranjeet/RL_MODEL
python train_warmup.py
python train_rl_model.py --warmup-ckpt output/warmup_best.pth

All hyperparameters are centrally managed in config/default_config.yaml.
# Key parameters
vit:
  embed_dim: 64
  patch_size: 4
  depth: 4
  num_heads: 8
pde:
  K_inner: 5 # Inner PDE iterations
  tissue_adaptive: true
iterative:
  K_outer: 3 # Outer refinement iterations
rl:
  algorithm: "PPO"
  clip_eps: 0.2
  ppo_epochs: 4
  action_dim: 3 # [denoise, noise_adapt, pde_intensity]

| File | Description |
|---|---|
| predicted_fw.nii.gz | Free-water fraction map [0,1] |
| predicted_icvf.nii.gz | Intracellular volume fraction [0,1] |
| tissue_fraction.nii.gz | 1 − FW (tissue fraction) |
| test_metrics.csv | Per-subject RMSE/MAE metrics |
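
As a rough illustration of how the outputs above relate to each other, the sketch below derives the tissue fraction as 1 − FW and computes RMSE and MAE between a prediction and a reference. The helper names (`tissue_fraction`, `rmse`, `mae`) and the use of flat Python lists are assumptions for this example; they are not taken from `evaluate.py`, which may mask voxels or aggregate differently.

```python
import math


def tissue_fraction(fw):
    """Tissue fraction per voxel, defined as the complement of free water."""
    return [1.0 - v for v in fw]


def rmse(pred, target):
    """Root-mean-square error between two equal-length voxel lists."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred))


def mae(pred, target):
    """Mean absolute error between two equal-length voxel lists."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


# Toy values standing in for flattened FW maps (hypothetical data).
fw_pred = [0.0, 0.25, 1.0]
fw_ref = [0.0, 0.5, 1.0]
print(tissue_fraction(fw_pred))
print(rmse(fw_pred, fw_ref), mae(fw_pred, fw_ref))
```

For real volumes, the voxel values could be read with `nibabel.load("predicted_fw.nii.gz").get_fdata()` and flattened before being passed to these helpers.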
If you use this framework, please cite:
@misc{rl_fw_icvf_2025,
  title={Reinforcement-Based Iterative Free-Water and Tissue Estimation from Diffusion MRI},
  author={Ranjeet Jha},
  year={2025},
  note={Extends VIT-RXN-DIFF with PPO-controlled iterative refinement}
}

This project is for academic research purposes.