Game On at the Distribution Grid: Congestion Management with Multi-Agent Deep Reinforcement Learning
This repository contains code and experiments to study strategic consumer behavior under locally-coincident congestion management mechanisms in the distribution grid. It provides simulation environments, optimization baselines, and multi-agent RL training pipelines (via BenchMARL) for peak charges and direct load control.
This repository is the open-source code for the paper: Miskiw, K. K., Alahmed, A. S., Hwang, S. Y. S., Botterud, A., & Staudt, P. (2026). Comparative Evaluation of Distribution Grid Congestion Management Mechanisms.
- Overview
- Abstract (paper)
- Setup
- Repository structure
- BenchMARL
- Code Formatting
- Contributors
- License
- Contact
Increasing electrification of consumer energy consumption is placing growing stress on distribution grids, inducing local demand peaks that cause congestion. Time-dependent electricity tariffs can exacerbate this issue, as they shift demand to better match supply but rarely account for distribution grid impacts. To address this issue, distribution system operators need mechanisms at their disposal to alter demand. Yet the consequences and interdependencies these mechanisms create remain elusive. We therefore compare two important families of such mechanisms, peak charges and direct load control, and show how they may induce strategic consumer behavior. We use game theory to derive analytic solutions of consumer behavior and subsequently apply deep reinforcement learning to analyze more complex settings. Peak charges can theoretically eliminate demand peaks by incentivizing consumers to flatten aggregate demand, provided peaks are anticipated and prices are set correctly. Setting appropriate peak charges is complicated by time-dependent electricity tariffs. Direct load control mitigates peaks by coercing consumers to shift demand in order to avoid curtailment. It is less reliant on peak anticipation and price setting, and it is robust to behavioral uncertainty and price volatility, but it may curtail a subset of consumers rather than those for whom demand shifting is least costly. In turn, this may result in economic efficiency losses. We provide guidance to policy researchers and practitioners by systematically outlining the resulting tradeoffs between the two mechanisms.
Follow the steps below to set up and run the project:
1. **Install Python 3.11**

   Download and install Python 3.11.* from the Python website.

2. **Install Poetry**

   Poetry is used for dependency management and packaging. Install it via the Poetry documentation.

3. **Install dependencies**

   From the repository root:

   ```bash
   poetry install
   ```

4. **Activate the environment**

   ```bash
   poetry env activate
   ```

   This outputs a command you need to run, for example:

   ```bash
   source /home/{...}
   ```

5. **Create a `.env` file**

   Create a `.env` file in the repository root with:

   ```
   WANDB_API_KEY="KEY_FOR_WANDB"
   ```

6. **Set up a Gurobi license**

   Some optimization routines require Gurobi. Make sure you have a valid Gurobi license and that it is discoverable by the Gurobi Python package (e.g., a `gurobi.lic` file or a configured license server).

7. **Run an experiment**

   Example:

   ```bash
   cd ./dr_mechanisms
   cd ./_02_Peak_Shaving
   python -m experiment_locally_maddpg
   ```
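Before launching a long run, it can be worth confirming that the Gurobi bindings and license are actually usable. A minimal sanity check along these lines may help (the `check_gurobi` helper is an illustration, not part of the repository, and assumes a reasonably recent `gurobipy` where `Env` is a context manager):

```python
def check_gurobi() -> str:
    """Return a short status string describing Gurobi availability."""
    try:
        import gurobipy as gp
    except ImportError:
        return "gurobipy is not installed"
    try:
        # Creating an Env triggers the license lookup (gurobi.lic file
        # or a configured license server) and raises if none is found.
        with gp.Env():
            return "Gurobi license OK"
    except gp.GurobiError as exc:
        return f"Gurobi license problem: {exc}"


if __name__ == "__main__":
    print(check_gurobi())
```

If this prints anything other than "Gurobi license OK", fix the license setup before running the optimization baselines.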
- `dr_mechanisms/` — Main experiment suites and notebooks.
  - `_01_Peak_Charges/` — Experiments and configs for peak charge tasks.
  - `_02_Peak_Shaving/` — Experiments and configs for peak shaving tasks.
    - `analysis.ipynb` — Example analysis notebook for the peak shaving task.
    - `experiment_locally_maddpg.py` / `experiment_locally_mappo.py` — Local training entry points.
    - `opt_peak_shaving.py` — Optimization baseline for the peak shaving task.
    - `tests.ipynb` — Lightweight test and sanity-check notebook.
    - `utils.py` — Helper utilities used by experiments.
    - `agents/` — Agent definitions for the peak shaving environment.
    - `callbacks/` — Training callbacks and plotting utilities in line with BenchMARL and Wandb.
    - `env/` — Environment implementation and common environment utilities.
    - `yaml_configs/` — BenchMARL experiment, model, and algorithm configs.
  - `constants.py` / `context.py` — Shared constants and global context helpers.
- `pyproject.toml` — Poetry configuration and dependencies.
- `README.md` — Project overview and setup instructions.
This project uses BenchMARL for multi-agent RL training and configuration management. The BenchMARL YAML configs live under each experiment folder in yaml_configs/ (e.g., algorithms, models, tasks, and experiment settings), and the training entry points use those configs directly.
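As a rough illustration of what such a config looks like (the keys below are placeholders; the authoritative field names come from BenchMARL's config schema and the files shipped under `yaml_configs/`), an experiment config is plain YAML along these lines:

```yaml
# Illustrative sketch only — see yaml_configs/ and the BenchMARL
# documentation for the actual schema and field names.
experiment:
  sampling_device: cpu
  train_device: cpu
  gamma: 0.99
  lr: 3.0e-4
```

The training entry points load these files directly, so changing hyperparameters means editing the YAML rather than the Python scripts.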
This repository uses Black for code formatting and pre-commit hooks. Run the commands from the repository root so the .pre-commit-config.yaml is found.
To install:

```bash
pip install pre-commit
pre-commit install
```

To format all files:

```bash
pre-commit run --all-files
```

This project was developed in collaboration between members of the Karlsruhe Institute of Technology (KIT).
| Name | Role & Contribution |
|---|---|
| Kim K. Miskiw | Conceptualization, Methodology, Software, Validation, Formal analysis, Data curation, Writing – Original Draft, Writing – Review & Editing, Visualization, Project administration, Funding acquisition |
| Ahmed Alahmed | Conceptualization, Formal analysis, Writing – Review & Editing, Supervision |
| Shannon Hwang | Formal analysis, Writing – Review & Editing |
| Audun Botterud | Conceptualization, Writing – Review & Editing, Supervision |
| Philipp Staudt | Conceptualization, Methodology, Formal analysis, Writing – Review & Editing, Supervision |
This repository is licensed under the MIT License. This permits commercial use, modification, distribution, and private use with minimal restrictions.
For questions or early access inquiries, contact: kim.miskiw@kit.edu