
Added training code for PPO, A2C, DQN and random and greedy baselines #7

Merged
noahkostesku merged 2 commits into main from feat/training on Mar 30, 2026

Conversation

@Thomson-Lam (Owner) commented on Mar 30, 2026

Changes

  • Added src/models/benchmarking.py for core metric tracking used during training across all algorithms
  • Edited src/models/evaluate_agents.py to evaluate the RL agents offline after training
  • Added a no-action baseline policy in src/models/benchmarking.py that never suppresses fires, serving as the baseline to compare against (a sketch follows this list)
  • Edited src/models/fire_env.py to support the metrics required by the proposal
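
A minimal sketch of the no-action baseline, assuming a Gymnasium-style environment whose action space reserves an index for "do nothing". The class name, the no_op_action index, and the predict() signature (mirroring Stable-Baselines3 so one evaluation loop can drive learned agents and baselines alike) are illustrative, not the actual code in src/models/benchmarking.py:

```python
class NoActionBaseline:
    """Baseline policy that never suppresses fires."""

    def __init__(self, no_op_action: int = 0):
        # Assumed: action index 0 means "take no suppression action".
        self.no_op_action = no_op_action

    def predict(self, observation, deterministic: bool = True):
        # Returns (action, state) like Stable-Baselines3's predict(),
        # so evaluation code can treat agents and baselines uniformly.
        return self.no_op_action, None
```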

Main metrics (an aggregation sketch follows this list):

  • mean_return: mean episodic return
  • asset_survival_rate: fraction of episodes with assets_lost == 0
  • containment_success_rate: fraction of episodes where the fire is extinguished before truncation
  • mean_burned_area_fraction: final burned-area fraction per episode, (burned + burning + asset_burned cells) / 625
  • std_across_seeds: standard deviation of the seed-level metric means
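
A sketch of how these metrics could be aggregated from per-episode records. The field names (return, assets_lost, extinguished, burned_cells) are assumptions about the stats benchmarking.py collects; 625 presumably corresponds to a 25 x 25 grid:

```python
import numpy as np

GRID_CELLS = 625  # denominator from mean_burned_area_fraction above

def aggregate(episodes: list[dict]) -> dict:
    """Collapse per-episode records into the summary metrics above."""
    return {
        "mean_return": float(np.mean([ep["return"] for ep in episodes])),
        "asset_survival_rate": float(
            np.mean([ep["assets_lost"] == 0 for ep in episodes])
        ),
        "containment_success_rate": float(
            np.mean([ep["extinguished"] for ep in episodes])
        ),
        "mean_burned_area_fraction": float(
            # burned_cells is assumed to already sum the burned, burning,
            # and asset_burned cells at episode end.
            np.mean([ep["burned_cells"] / GRID_CELLS for ep in episodes])
        ),
    }
```

std_across_seeds then falls out as np.std over the five per-seed means of whichever metric is being reported.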

  • Made a single interface for training all RL algorithms, src/models/train_rl_agent.py, driven by an argparse CLI (a sketch follows this list)
  • Wrapped the training and evaluation CLI commands in PowerShell and Bash scripts
  • Fixed five seeds (11, 22, 33, 44, 55) for model initialization and for the evaluation order of the holdout benchmarking environments
  • Updated the README and docs with usage details
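
A sketch of the single training entry point, assuming Stable-Baselines3 backs the three algorithms. The flag names, defaults, and FireEnv import path are illustrative, not necessarily what src/models/train_rl_agent.py actually exposes:

```python
import argparse

from stable_baselines3 import A2C, DQN, PPO

from src.models.fire_env import FireEnv  # assumed import path

ALGOS = {"ppo": PPO, "a2c": A2C, "dqn": DQN}
SEEDS = (11, 22, 33, 44, 55)  # the fixed seeds introduced in this PR

def main() -> None:
    parser = argparse.ArgumentParser(
        description="Train an RL agent on the fire environment."
    )
    parser.add_argument("--algo", choices=sorted(ALGOS), required=True)
    parser.add_argument("--seed", type=int, choices=SEEDS, default=SEEDS[0])
    parser.add_argument("--timesteps", type=int, default=100_000)
    args = parser.parse_args()

    env = FireEnv()
    model = ALGOS[args.algo]("MlpPolicy", env, seed=args.seed)
    model.learn(total_timesteps=args.timesteps)
    model.save(f"{args.algo}_seed{args.seed}")

if __name__ == "__main__":
    main()
```

An invocation like python src/models/train_rl_agent.py --algo ppo --seed 11 is then roughly what the PowerShell and Bash wrappers loop over the five seeds.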

noahkostesku merged commit 13501ba into main on Mar 30, 2026
2 checks passed
