
Infinite-Horizon Stochastic Optimal Control for Differential Drive Robot


This project implements infinite-horizon stochastic optimal control algorithms for trajectory tracking of a differential-drive robot navigating a 2D environment with static circular obstacles.

We compare three control approaches:

  • Certainty Equivalent Control (CEC) — solved via nonlinear optimization using CasADi.
  • Generalized Policy Iteration (GPI), Deterministic — assumes deterministic transitions.
  • Generalized Policy Iteration (GPI) — with full probabilistic modeling and value iteration.

The goal is to track a periodic reference trajectory over a 100-step horizon, minimizing tracking error, control effort, and proximity to obstacles.
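The robot follows standard differential-drive (unicycle) kinematics. As a sketch of the transition model (a Euler discretization with optional Gaussian motion noise; the step size, noise scale, and exact discretization in utils.py may differ):

```python
import numpy as np

def car_next_state(x, u, dt=0.5, noise_std=None, rng=None):
    """One Euler step of the differential-drive (unicycle) model.

    x = [px, py, theta] is the pose; u = [v, omega] is the linear
    and angular velocity. If noise_std is given, Gaussian motion
    noise is added (the stochastic model); otherwise the step is
    deterministic (the CEC / deterministic-GPI model).
    """
    px, py, theta = x
    v, omega = u
    x_next = np.array([
        px + dt * v * np.cos(theta),
        py + dt * v * np.sin(theta),
        theta + dt * omega,
    ])
    if noise_std is not None:
        rng = rng or np.random.default_rng()
        x_next += rng.normal(0.0, noise_std, size=3)
    # Wrap the heading angle back into [-pi, pi).
    x_next[2] = (x_next[2] + np.pi) % (2 * np.pi) - np.pi
    return x_next
```

For example, driving straight from the origin, `car_next_state([0.0, 0.0, 0.0], [1.0, 0.0])` advances the robot by `dt` along the x-axis.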

🗂 Repository Structure


    .
    ├── part1.py                # Certainty Equivalent Control (CEC)
    ├── part2-deterministic.py  # GPI assuming deterministic transitions
    ├── part2.py                # Full GPI with stochastic transitions
    ├── utils.py                # Utility functions (dynamics, reference trajectory, plotting)
    ├── mujoco_car.py           # Optional MuJoCo simulator wrapper
    ├── fig/, meshes/, env.xml  # Visualization and MuJoCo environment files
    ├── ECE276B_PR3.pdf         # Final project report
    └── README.txt              # This file

📌 Requirements


  • Python 3.8+
  • NumPy
  • CasADi
  • Ray
  • tqdm
  • Matplotlib (for visualization)
  • MuJoCo (optional, for physics-based validation)

Install dependencies via:

pip install numpy casadi ray tqdm matplotlib

To use MuJoCo:

  • Download from https://mujoco.org/
  • Extract it to the ~/.mujoco folder and set the required environment variables.

🚀 How to Run


  1. Certainty Equivalent Control (CEC) — solves a receding-horizon nonlinear program at each time step using CasADi's nlpsol:

    python part1.py
    
  2. Deterministic GPI — runs Generalized Policy Iteration assuming a deterministic transition model:

    python part2-deterministic.py
    
  3. Full Stochastic GPI — implements full GPI with stochastic transitions, obstacle-aware stage costs, and policy iteration:

    python part2.py
    

    Use the --use-mujoco flag to run the simulation with MuJoCo (if installed):

    python part2.py --use-mujoco
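
The GPI variants operate on a discretized state and control grid. As a minimal sketch of the underlying backup, here is tabular value iteration over generic transition matrices (the shapes and the discounted-cost formulation are illustrative; the project's actual discretization, stage costs, and GPI schedule live in part2.py):

```python
import numpy as np

def value_iteration(P, cost, gamma=0.95, tol=1e-6, max_iter=1000):
    """Tabular value iteration for a discounted-cost MDP.

    P:    (A, S, S) array; P[a, s, s'] = prob. of moving s -> s' under action a.
    cost: (S, A) array of stage costs.
    Returns the optimal value function (S,) and a greedy policy (S,).
    """
    A, S, _ = P.shape
    V = np.zeros(S)
    for _ in range(max_iter):
        # Q[s, a] = cost(s, a) + gamma * E[V(s') | s, a]
        Q = cost + gamma * np.einsum('ast,t->sa', P, V)
        V_new = Q.min(axis=1)
        if np.max(np.abs(V_new - V)) < tol:
            V = V_new
            break
        V = V_new
    return V, Q.argmin(axis=1)
```

In the project, the expensive part is building P for every discretized state-action pair, which is why the transition matrices are computed in parallel with Ray.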
    

🧠 Key Features


  • Discretization of state and control spaces with configurable resolution.
  • Periodic reference trajectory via Lissajous curve.
  • Obstacle avoidance via exponential or hard penalty functions.
  • Terminal cost enforcement.
  • Parallel computation of transition matrices using Ray.
  • Support for both simulation-only and physics-based environments.
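
Two of these pieces can be sketched directly: a periodic Lissajous reference curve and a soft exponential obstacle penalty. The amplitudes, frequencies, and penalty weights below are illustrative placeholders, not the project's actual parameters (those are in utils.py):

```python
import numpy as np

def lissajous_ref(t, A=2.0, B=2.0, a=1.0, b=2.0, period=100):
    """Periodic Lissajous reference position at discrete time step t."""
    k = 2 * np.pi * t / period
    return np.array([A * np.sin(a * k), B * np.sin(b * k)])

def obstacle_penalty(p, centers, radii, weight=10.0, sharpness=2.0):
    """Soft penalty that grows exponentially as p approaches any obstacle.

    centers: iterable of circular-obstacle centers; radii: their radii.
    The signed distance to each obstacle boundary feeds an exponential,
    so the penalty is smooth everywhere (unlike a hard constraint).
    """
    p = np.asarray(p)
    penalty = 0.0
    for c, r in zip(centers, radii):
        d = np.linalg.norm(p - np.asarray(c)) - r  # signed distance to boundary
        penalty += weight * np.exp(-sharpness * d)
    return penalty
```

The exponential form keeps the stage cost differentiable for the CEC solver, whereas a hard penalty simply assigns a large constant cost inside an obstacle.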

📈 Outputs


  • Simulation logs (position, orientation, control actions)
  • Visualizations of robot trajectory and reference path
  • Performance metrics: tracking error, total cost, loop time

All visualizations are saved automatically to the fig/ directory when the simulation completes.

📄 Report


For detailed methodology, equations, and evaluation results, see: ECE276B_PR3.pdf

🧑‍💻 Authors


  • Pedram Aghazadeh
    UC San Diego, ECE 276B — Planning & Learning in Robotics

📜 License


This project is for academic and research use only. No commercial use is permitted without permission.
