Train an AI agent to land a spacecraft on the moon using Deep Q-Learning (DQN).
- About This Project
- What is Deep Q-Learning?
- Installation
- Training the Agent
- Watching the Agent Play
- Expected Training Time
This project teaches a neural network to land a spacecraft in the LunarLander-v3 environment from Gymnasium (the maintained successor to OpenAI Gym). The agent learns entirely from trial and error!
```
State (8 numbers: position, velocity, angle, etc.)
        |
        v
Neural Network (128 -> 64 neurons)
        |
        v
Q-values for each action: [Nothing: 5.2, Left: 3.1, Main: 8.5, Right: 2.0]
        |
        v
Choose: FIRE MAIN ENGINE (highest Q-value)
        |
        v
Get Reward (+100 for landing, -100 for crash)
        |
        v
Update network using Bellman equation
```
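The network in the pipeline above can be sketched in PyTorch. This is a minimal illustration of the 8-input, 128 -> 64 hidden, 4-output shape from the diagram, not the exact code in `network.py`:

```python
import torch
import torch.nn as nn

class QNetwork(nn.Module):
    """Maps an 8-dim lander state to one Q-value per action (4 actions)."""
    def __init__(self, state_dim=8, n_actions=4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 128),  # hidden layer 1: 128 neurons
            nn.ReLU(),
            nn.Linear(128, 64),         # hidden layer 2: 64 neurons
            nn.ReLU(),
            nn.Linear(64, n_actions),   # one Q-value per action
        )

    def forward(self, state):
        return self.net(state)

q_net = QNetwork()
q_values = q_net(torch.zeros(1, 8))  # dummy state, batch of 1
action = q_values.argmax(dim=1)      # greedy choice: highest Q-value
```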
| Concept | Simple Explanation |
|---|---|
| State | 8 numbers describing the lander (position, velocity, angle) |
| Action | What to do (nothing, fire left/main/right engine) |
| Reward | Points for good landings, penalties for crashes |
| Q-Value | Expected future score from an action |
| Epsilon | Chance of random action (exploration) |
| Gamma | How much to value future rewards (0.99) |
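A minimal sketch of how epsilon, gamma, and Q-values fit together: epsilon-greedy action selection, plus the Bellman target `r + gamma * max_a' Q(s', a')` that the network is trained toward. Function names here are illustrative; the real logic lives in `agent.py`:

```python
import random

GAMMA = 0.99  # discount factor from the table above

def choose_action(q_values, epsilon, n_actions=4):
    """Epsilon-greedy: random action with probability epsilon, else best Q."""
    if random.random() < epsilon:
        return random.randrange(n_actions)  # explore
    return max(range(n_actions), key=lambda a: q_values[a])  # exploit

def bellman_target(reward, next_q_values, done):
    """Value the network should predict: r + gamma * max_a' Q(s', a')."""
    if done:
        return reward  # no future reward after a terminal state
    return reward + GAMMA * max(next_q_values)
```

For example, with the Q-values from the diagram, a +100 landing reward yields a target of `100 + 0.99 * 8.5 = 108.415`.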
---
```bash
cd Desktop/ml/DQL2048  # or wherever you placed the project
```

macOS/Linux:

```bash
python3 -m venv venv
source venv/bin/activate
```

```bash
pip install -r requirements.txt
```

This installs:

- PyTorch: Neural network framework
- Gymnasium[box2d]: LunarLander environment
- NumPy: Numerical computing
- Matplotlib: Training visualization

Note: Box2D can be tricky on Windows. If installation fails:

```bash
pip install swig
pip install gymnasium[box2d]
```

Verify the installation:

```bash
python -c "import torch; import gymnasium; print('Ready to train!')"
```

```bash
# Make sure venv is activated!
python train.py
```

```bash
# Basic training (1000 episodes)
python train.py

# Longer training
python train.py --episodes 2000

# Resume from checkpoint
python train.py --load checkpoints/dqn_lunarlander_latest.pt

# Watch training in real-time (slower but fun!)
python train.py --render

# Custom hyperparameters
python train.py --learning-rate 0.0005 --batch-size 32
```

```bash
python visualize.py --model checkpoints/dqn_lunarlander_final.pt
```

```bash
# Watch 5 episodes
python visualize.py -m checkpoints/dqn_lunarlander_final.pt -e 5

# Slower playback (better for learning)
python visualize.py -m checkpoints/dqn_lunarlander_final.pt -d 0.1

# No model? Watch random agent
python visualize.py --no-render
```

The visualization shows:
- Real-time rendering of the lander
- State values (position, velocity, angle)
- Q-values for each action
- Chosen action highlighted
- Episode statistics (reward, steps)
```bash
python plot_training.py --metrics logs/training_metrics.json
```

---
| Episodes | Time (approx.) | Expected Performance |
|---|---|---|
| 300 | 5-10 min | Starts improving, mostly crashes |
| 500 | 10-20 min | Sometimes lands, inconsistent |
| 800 | 20-35 min | Good landings, may solve! |
| 1000 | 30-45 min | Usually solved (avg reward >= 200) |
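The "solved" bar (average reward >= 200) is conventionally measured as a rolling average over the last 100 episodes. A sketch of that check, assuming a plain list of per-episode rewards (the actual criterion in `train.py` may differ):

```python
def is_solved(episode_rewards, window=100, threshold=200.0):
    """True once the mean reward over the last `window` episodes meets the bar."""
    if len(episode_rewards) < window:
        return False  # not enough episodes to judge yet
    recent = episode_rewards[-window:]
    return sum(recent) / window >= threshold
```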
```
DQL2048/
├── network.py        # Neural network (heavily commented!)
├── agent.py          # DQN agent with replay buffer
├── train.py          # Training script
├── visualize.py      # Watch agent play
├── plot_training.py  # Training visualization
├── config.py         # Hyperparameters
├── requirements.txt  # Dependencies
├── README.md         # This file
├── checkpoints/      # Saved models
├── logs/             # Training logs
└── plots/            # Training plots
```
