# counterfactual-rl

Counterfactual Explanations for RL

This repository implements the method proposed in the paper:

"Counterfactual Explanations for Continuous Action Reinforcement Learning." Shuyang Dong, Shangtong Zhang, and Lu Feng. University of Virginia. IJCAI 2025 (under review).

GitHub repository: https://github.com/shuyang-dong/Counterfactual_Explanation_for_RL


## Overview

This project presents a novel method for generating counterfactual explanations in reinforcement learning with continuous action spaces. It aims to answer: "What small changes to the agent's actions would have led to a better outcome?" The method integrates a custom distance metric for action trajectories, a sparse reward shaping mechanism, and an extended TD3 algorithm for learning optimal counterfactual policies.
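To make the distance component concrete, here is a minimal hypothetical sketch of what a metric over action trajectories might look like: a per-step magnitude term combined with a sparsity term that penalizes the fraction of steps changed at all. The function name, weighting, and exact form are illustrative assumptions, not the paper's actual metric, which is defined in the repository.

```python
import numpy as np

def trajectory_distance(actions, cf_actions, sparsity_weight=0.1):
    """Illustrative distance between a factual and a counterfactual
    action trajectory (hypothetical form, not the paper's metric):
    mean per-step Euclidean distance plus a sparsity term counting
    the fraction of steps that were modified at all."""
    actions = np.asarray(actions, dtype=float)
    cf_actions = np.asarray(cf_actions, dtype=float)
    # Euclidean distance between corresponding actions at each step
    step_dist = np.linalg.norm(actions - cf_actions, axis=-1)
    # fraction of steps whose action was changed (sparsity term)
    changed = np.mean(step_dist > 1e-8)
    return step_dist.mean() + sparsity_weight * changed
```

A metric of this shape rewards counterfactuals that alter only a few actions, and only slightly, which matches the goal of "small changes" to the agent's behavior.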

The approach is evaluated on two domains:

- Type 1 Diabetes management using the UVA/PADOVA simulator, where alternative insulin dosages are recommended.
- OpenAI Gym's Lunar Lander, where the agent is guided to generate safer landing behaviors.

This framework supports both unconstrained and constrained counterfactual generation, allowing it to adapt to domain-specific requirements (e.g., safety or clinical constraints).
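One simple way to picture the constrained mode is as a projection of proposed counterfactual actions onto a feasible set. The sketch below is a hypothetical helper, assuming two example constraints: a global safe action range (e.g., allowable insulin dosages) and a per-step deviation budget from the factual trajectory. The repository's actual constraint handling may differ.

```python
import numpy as np

def constrained_counterfactual(actions, proposed, low, high, max_dev):
    """Project a proposed counterfactual action trajectory so that
    each action (1) deviates from the original action by at most
    max_dev per step and (2) lies in the safe range [low, high].
    Hypothetical constraint scheme, for illustration only."""
    actions = np.asarray(actions, dtype=float)
    proposed = np.asarray(proposed, dtype=float)
    # limit per-step deviation from the factual action
    bounded = np.clip(proposed, actions - max_dev, actions + max_dev)
    # then enforce the domain's global safe action range
    return np.clip(bounded, low, high)
```

In the unconstrained mode, such a projection would simply be skipped and the counterfactual policy could propose any action in the environment's action space.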

For more details, implementation, and experimental results, visit the main repository: https://github.com/shuyang-dong/Counterfactual_Explanation_for_RL
