
# TURRET: Transferable Unified Robot Representation with Graph Neural Networks

> **TURRET**: A Graph Neural Network framework for adaptive multi-source cross-domain transfer learning in robotic control.

## 📖 Overview

TURRET is a framework that uses Graph Neural Networks (GNNs) to transfer knowledge across different robot morphologies and task domains. This implementation aims to reproduce the core contributions of the original paper:

- **Unified Semantic Space**: Projects states from different tasks into a common embedding space
- **Adaptive Transfer**: Dynamically weights source policies based on state-level semantic similarity  
- **GNN-based Policy**: Structured policy networks that explicitly model robot morphology
- **Gradual Independence**: Progressive transition from transfer learning to independent learning

## 🎯 Key Features

- πŸ•ΈοΈ **Morphology-aware GNNs**: Explicitly model robot structure as graphs
- πŸ”„ **Multi-source Transfer**: Combine knowledge from multiple source policies
- 🎯 **State-level Adaptation**: Dynamic transfer weights based on current state
- πŸ“ˆ **Progressive Learning**: Smooth transition from transfer to independent learning
- πŸ§ͺ **Comprehensive Evaluation**: Size transfer, morphology transfer, and ablation studies
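
To make the "robot structure as a graph" idea more concrete, here is a minimal sketch, not the repository's implementation. It assumes a toy five-part body (the part names and `EDGES` are hypothetical) and swaps in a plain mean-aggregation step where the project uses learned attention-based propagation layers.

```python
from typing import Dict, List, Tuple

# Hypothetical toy morphology: a torso with two legs (thigh + shin each).
# Each edge is an undirected joint between two body parts.
EDGES: List[Tuple[str, str]] = [
    ("torso", "l_thigh"), ("l_thigh", "l_shin"),
    ("torso", "r_thigh"), ("r_thigh", "r_shin"),
]


def build_adjacency(edges: List[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Return a neighbor list for an undirected graph."""
    adj: Dict[str, List[str]] = {}
    for a, b in edges:
        adj.setdefault(a, []).append(b)
        adj.setdefault(b, []).append(a)
    return adj


def propagate(features: Dict[str, float], adj: Dict[str, List[str]]) -> Dict[str, float]:
    """One round of mean-aggregation message passing (no learned weights).

    Each node keeps half of its own feature and takes the other half
    from the average of its neighbors' features.
    """
    updated = {}
    for node, value in features.items():
        neighbors = adj[node]
        neighbor_mean = sum(features[n] for n in neighbors) / len(neighbors)
        updated[node] = 0.5 * value + 0.5 * neighbor_mean
    return updated


adj = build_adjacency(EDGES)
features = {"torso": 1.0, "l_thigh": 0.0, "l_shin": 0.0, "r_thigh": 0.0, "r_shin": 0.0}
after_one_step = propagate(features, adj)
print(after_one_step)
```

After one step the torso's signal has reached the thighs but not yet the shins, which is why stacked propagation layers are needed for information to travel across the whole body.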

## 🚀 Quick Start

### Installation & Setup

```bash
# Clone repository
git clone https://github.com/your-username/turret-replication.git
cd turret-replication

# Create environment (recommended)
conda create -n turret python=3.8
conda activate turret

# Install dependencies
pip install -r requirements.txt

# Download pre-trained models
python scripts/download_pretrained.py

# Or pre-train source policies yourself
python experiments/pretrain_source.py
```

### Run Experiments

```bash
# Quick demo (2 seeds, 100 episodes)
python scripts/run_all_experiments.py --num_seeds 2 --total_episodes 100

# Full paper replication (5 seeds, 500 episodes)
python scripts/run_all_experiments.py --num_seeds 5 --total_episodes 500

# Run specific experiments only
python scripts/run_all_experiments.py --experiments size morphology

# With GPU acceleration
python scripts/run_all_experiments.py --device cuda
```

### Basic Usage

```python
from configs.base_config import TURRETConfig
from experiments.paper_experiments import PaperExperimentReplicator

# Configure experiment
config = TURRETConfig(
    device="cuda",
    total_episodes=500,
    num_seeds=5
)

# Run all paper experiments
replicator = PaperExperimentReplicator(config)
results = replicator.run_all_experiments()
```

## 🔬 Experiment Guide

### Running Transfer Experiments

```bash
# Run transfer experiments directly
python experiments/transfer_experiment.py

# Run specific experiment types
python scripts/run_all_experiments.py --experiments size             # Only size transfer
python scripts/run_all_experiments.py --experiments size morphology  # Size and morphology
```

### Result Analysis

#### View Experiment Results

```bash
# Results are saved in:
ls data/paper_results/

# Main files:
# - paper_experiments_YYYYMMDD_HHMMSS.json       # Raw results
# - training_statistics_YYYYMMDD_HHMMSS.json     # Training statistics
# - analysis/comprehensive_analysis_report.json  # Analysis report
```
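
Because the result files carry a `YYYYMMDD_HHMMSS` timestamp in their names, sorting the names alphabetically also sorts them chronologically. The sketch below relies on that to pick the most recent raw results file. It uses only the standard library; the helper names `latest_results_file` and `load_results` are hypothetical, and it makes no assumption about the JSON schema beyond printing the top-level keys.

```python
import json
from pathlib import Path
from typing import Any, Optional


def latest_results_file(results_dir: str = "data/paper_results") -> Optional[Path]:
    """Return the newest paper_experiments_*.json file, or None if there is none."""
    candidates = sorted(Path(results_dir).glob("paper_experiments_*.json"))
    return candidates[-1] if candidates else None


def load_results(path: Path) -> Any:
    """Parse a results file as JSON."""
    with open(path, "r", encoding="utf-8") as fh:
        return json.load(fh)


path = latest_results_file()
if path is None:
    print("No result files found yet; run the experiments first.")
else:
    results = load_results(path)
    # Only inspect the top level, since the exact schema is not documented here.
    summary = sorted(results) if isinstance(results, dict) else type(results).__name__
    print(path.name, "->", summary)
```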

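#### Generate Analysis Reports

```python
from analysis.result_analyzer import ResultAnalyzer

# `results`: e.g. the value returned by replicator.run_all_experiments() (see Basic Usage)
analyzer = ResultAnalyzer("data/paper_results")
analysis = analyzer.generate_comprehensive_analysis(results)
```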

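#### Visualize Results

```python
from experiments.visualization.advanced_visualizer import AdvancedVisualizer

# `results`: e.g. the value returned by replicator.run_all_experiments() (see Basic Usage)
visualizer = AdvancedVisualizer()
visualizer.plot_transfer_dynamics(results)
```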

### Performance Optimization

#### GPU Acceleration

```bash
# Run with GPU
python scripts/run_all_experiments.py --device cuda

# Run with multiple GPUs
python scripts/run_all_experiments.py --device cuda --num_processes 4
```

#### Distributed Training

```python
from optimization.distributed_trainer import DistributedTURRETTrainer

# `config` is a TURRETConfig instance (see Basic Usage)
dist_trainer = DistributedTURRETTrainer(config)
```

#### Performance Analysis

```python
from optimization.performance_optimizer import PerformanceOptimizer

# `config`, `model`, `node_observations`, and `morphology_graph` must already exist
optimizer = PerformanceOptimizer(config)
stats = optimizer.optimize_gnn_forward(model, node_observations, morphology_graph)
```

## 🏗️ Architecture

### Core Components

```
TURRET/
├── configs/
│   ├── base_config.py              # Main configuration dataclass (TURRETConfig)
│   └── environment_config.py       # Environment-specific settings
├── models/
│   ├── policies/
│   │   ├── gnn_structured_policy.py      # Full GNN policy (production version)
│   │   └── structured_policy.py          # Simplified policy (testing version)
│   ├── networks/
│   │   ├── attention_propagation.py      # Multi-head GNN layers
│   │   ├── set_transformer.py            # State embedding via attention
│   │   ├── input_network.py              # Node observation processing
│   │   ├── output_network.py             # Action distribution prediction
│   │   └── base_networks.py              # Base neural network components
│   ├── morphology.py               # Robot graph structure definitions
│   └── components/
│       └── distributions.py        # Probability distributions for actions
├── training/
│   ├── trainers/
│   │   ├── transfer_trainer.py           # Complete TURRET training system
│   │   ├── ppo_trainer.py                # Base PPO implementation
│   │   └── base_trainer.py               # Abstract trainer interface
│   ├── buffers.py                  # Experience replay buffers
│   └── optimizers.py               # Gradient management and schedulers
├── environments/
│   ├── tasks/
│   │   ├── centipede.py                  # Centipede-n multi-legged robots
│   │   └── standard_robots.py            # MuJoCo standard robots
│   ├── base_env.py                 # Abstract environment interface
│   └── mujoco_wrapper.py           # MuJoCo environment wrapper
├── transfer/
│   ├── semantic_space.py           # Unified state embedding space
│   ├── weight_calculator.py        # Adaptive transfer weight computation
│   ├── lateral_connections.py      # Knowledge fusion mechanisms
│   ├── independence.py             # Gradual independence scheduler
│   └── base_transfer.py            # Base class for transfer components
├── experiments/
│   ├── runners/
│   │   ├── size_transfer.py              # Size transfer experiments
│   │   ├── morphology_transfer.py        # Morphology transfer experiments
│   │   └── base_runner.py                # Experiment runner base class
│   ├── evaluation/
│   │   ├── evaluator.py                  # Experiment evaluation
│   │   ├── metrics.py                    # Performance metrics
│   │   └── baseline_models.py            # Baseline method implementations
│   ├── visualization/
│   │   ├── advanced_visualizer.py        # Interactive visualizations
│   │   ├── tsne_visualizer.py            # Dimensionality reduction
│   │   └── trajectory_plot.py            # Training trajectory plotting
│   ├── paper_experiments.py        # Unified experiment entry point
│   ├── transfer_experiment.py      # Transfer learning experiments
│   └── pretrain_source.py          # Source policy pre-training
├── analysis/
│   └── result_analyzer.py          # Comprehensive result analysis
├── optimization/
│   ├── performance_optimizer.py    # Performance optimization tools
│   └── distributed_trainer.py      # Distributed training support
├── scripts/
│   ├── run_all_experiments.py      # Main experiment runner
│   └── download_pretrained.py      # Pre-trained model downloader
└── utils/
    ├── file_utils.py               # Checkpoint and file management
    └── logging_utils.py            # Logging and training statistics
```

### Key Algorithms

1. **Graph-based Policy Representation**

   ```python
   # Robot morphology as graph
   morphology_graph = MorphologyGraph("Humanoid")
   policy = GNNStructuredPolicyNetwork(config)
   ```

2. **Adaptive Transfer Weights**

   ```python
   # Compute transfer weights based on state similarity
   weights = weight_calculator.compute_transfer_weights(
       target_state, source_states
   )
   ```

3. **Gradual Independence**

   ```python
   # Progressive independence factor
   p = independence_scheduler.get_current_p()
   fused_output = p * target + (1 - p) * transferred
   ```
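
The adaptive-weight and gradual-independence steps above can be sketched end to end in plain Python. This is an illustration under stated assumptions, not the repository's code. It assumes weights come from a softmax over negative Euclidean distances between state embeddings, and that `p` grows linearly from 0 to 1 over training. The function names `transfer_weights`, `independence_factor`, and `fuse` are hypothetical; only the fusion formula `p * target + (1 - p) * transferred` is taken from the snippet above.

```python
import math
from typing import List, Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two embeddings of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def transfer_weights(target_emb: Sequence[float],
                     source_embs: Sequence[Sequence[float]],
                     temperature: float = 1.0) -> List[float]:
    """Softmax over negative distances: closer sources get larger weights."""
    scores = [-euclidean(target_emb, s) / temperature for s in source_embs]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def independence_factor(episode: int, total_episodes: int) -> float:
    """Assumed linear schedule: p rises from 0 to 1 and is clamped to [0, 1]."""
    return min(1.0, max(0.0, episode / total_episodes))


def fuse(target_out: float, source_outs: Sequence[float],
         weights: Sequence[float], p: float) -> float:
    """Blend the target output with the weighted sum of source outputs."""
    transferred = sum(w * o for w, o in zip(weights, source_outs))
    return p * target_out + (1 - p) * transferred


# The first source embedding coincides with the target, so it dominates.
weights = transfer_weights([0.0, 0.0], [[0.0, 0.0], [3.0, 4.0]])
p = independence_factor(episode=250, total_episodes=500)
fused = fuse(target_out=1.0, source_outs=[2.0, 4.0], weights=weights, p=p)
print(weights, p, fused)
```

With `p = 0` the output is purely transferred knowledge, and with `p = 1` the target policy acts on its own, which matches the transfer-to-independent progression described in the overview.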

## 📊 Experiments

### Supported Transfer Scenarios

| Experiment Type | Source Tasks | Target Tasks | Description |
|---|---|---|---|
| Size Transfer | HalfCheetah, Ant | Humanoid, Walker2d | Small→Large robot transfer |
| Morphology Transfer | Quadruped→Biped | Various combinations | Cross-morphology transfer |
| Ablation Studies | - | - | Component importance analysis |
| Baseline Comparison | PPO, CAT, NerveNet | Standard tasks | Method performance comparison |

### Evaluation Metrics

- **Performance**: Mean reward, learning speed, sample efficiency
- **Transfer Effectiveness**: Weight distributions, semantic distances
- **Statistical Significance**: Confidence intervals, effect sizes
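
As a rough illustration of the statistical metrics listed above, the sketch below computes a confidence interval for the mean reward across seeds and Cohen's d as an effect size. It is not taken from `metrics.py`. The function names are hypothetical, the reward lists are placeholder numbers rather than measured results, and the interval uses a normal approximation (z = 1.96), which is optimistic for only a handful of seeds; a t-distribution quantile would be the more careful choice.

```python
import math
from statistics import mean, stdev
from typing import Sequence, Tuple


def confidence_interval(values: Sequence[float], z: float = 1.96) -> Tuple[float, float]:
    """Normal-approximation confidence interval for the mean (95% when z = 1.96)."""
    centre = mean(values)
    half_width = z * stdev(values) / math.sqrt(len(values))
    return centre - half_width, centre + half_width


def cohens_d(a: Sequence[float], b: Sequence[float]) -> float:
    """Effect size: difference of means divided by the pooled standard deviation."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / (na + nb - 2)
    return (mean(a) - mean(b)) / math.sqrt(pooled_var)


# Placeholder per-seed rewards, for illustration only (not experimental results).
method_rewards = [10.0, 12.0, 14.0]
baseline_rewards = [7.0, 9.0, 11.0]

low, high = confidence_interval(method_rewards)
effect = cohens_d(method_rewards, baseline_rewards)
print(f"mean reward CI: ({low:.2f}, {high:.2f}), Cohen's d: {effect:.2f}")
```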

## 📈 Results

### Performance Comparison

| Method | Size Transfer | Morphology Transfer | Sample Efficiency |

### Key Findings

1. **Effective Cross-Domain Transfer**: TURRET successfully transfers knowledge across different robot morphologies
2. **Adaptive Weighting**: State-level similarity metrics outperform fixed weighting schemes
3. **Scalability**: GNN-based policies scale effectively to complex robot structures
4. **Progressive Learning**: Gradual independence prevents negative transfer and improves final performance

## 🛠️ Development

### Adding New Experiments

1. Create a new runner in `experiments/runners/`
2. Register the experiment in `paper_experiments.py`
3. Update the configuration classes and run scripts
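
The steps above follow a common "subclass and register" pattern, sketched here in a self-contained form. Every name in it (`BaseRunner`, `RUNNERS`, `register`, `FrictionTransferRunner`) is hypothetical and chosen for illustration; the real interface in `experiments/runners/base_runner.py` and the registration mechanism in `paper_experiments.py` may look different, so check those files before copying this.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict, List, Type


class BaseRunner(ABC):
    """Hypothetical runner interface: one result per random seed."""

    def __init__(self, num_seeds: int = 1) -> None:
        self.num_seeds = num_seeds

    @abstractmethod
    def run_single(self, seed: int) -> float:
        """Run one seed and return its score."""

    def run(self) -> List[float]:
        return [self.run_single(seed) for seed in range(self.num_seeds)]


# Hypothetical registry mapping an experiment name to its runner class.
RUNNERS: Dict[str, Type[BaseRunner]] = {}


def register(name: str) -> Callable[[Type[BaseRunner]], Type[BaseRunner]]:
    """Class decorator that adds a runner to the registry under `name`."""
    def decorator(cls: Type[BaseRunner]) -> Type[BaseRunner]:
        RUNNERS[name] = cls
        return cls
    return decorator


@register("friction")
class FrictionTransferRunner(BaseRunner):
    """Made-up example experiment; a real runner would train and evaluate a policy."""

    def run_single(self, seed: int) -> float:
        return float(seed)  # placeholder score


# An entry point can then look up runners by the names passed on the command line.
selected = ["friction"]
results = {name: RUNNERS[name](num_seeds=3).run() for name in selected}
print(results)
```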

### Extending Components

1. Add new components in the appropriate modules
2. Ensure compatibility with `TURRETConfig` for configuration
3. Update import paths and dependencies

### Testing & Validation

The project includes comprehensive testing tools:

```bash
# Full system verification
python verify_phase8.py

# Component interface testing
python tests/test_component_interfaces.py

# Experiment replication testing
python tests/test_experiment_replication.py

# Performance benchmarking
python tests/performance_benchmark.py

# Final validation
python verification/final_validation.py

# Paper experiment test
python tests/test_run_paper_experiments.py
```

### Code Structure

The project follows a modular architecture:

- **Config-driven**: All experiments are configured via the `TURRETConfig` dataclass
- **Modular components**: Easy to extend or replace components
- **Comprehensive testing**: Each phase has verification scripts
- **Type hints**: Full type annotation for a better development experience
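
To illustrate the config-driven design, here is a minimal sketch of how command-line flags can be mapped onto a dataclass. `DemoConfig` is a hypothetical stand-in for `TURRETConfig`: the field names mirror the flags and keyword arguments shown in this README, but the default values are placeholders and the real class may define more fields or different defaults.

```python
import argparse
from dataclasses import dataclass, field
from typing import List, Optional, Sequence


@dataclass
class DemoConfig:
    """Hypothetical stand-in for TURRETConfig; defaults are placeholders."""
    device: str = "cpu"
    total_episodes: int = 500
    num_seeds: int = 5
    experiments: List[str] = field(default_factory=lambda: ["size", "morphology"])


def parse_config(argv: Optional[Sequence[str]] = None) -> DemoConfig:
    """Build a config from CLI-style arguments, falling back to dataclass defaults."""
    defaults = DemoConfig()
    parser = argparse.ArgumentParser(description="Config-driven experiment launcher (sketch)")
    parser.add_argument("--device", default=defaults.device)
    parser.add_argument("--total_episodes", type=int, default=defaults.total_episodes)
    parser.add_argument("--num_seeds", type=int, default=defaults.num_seeds)
    parser.add_argument("--experiments", nargs="+", default=defaults.experiments)
    args = parser.parse_args(argv)
    # The argument names match the dataclass fields, so they can be passed straight through.
    return DemoConfig(**vars(args))


config = parse_config(["--num_seeds", "2", "--total_episodes", "100"])
print(config)
```

Keeping the argument names identical to the dataclass fields means a new option only has to be added in two places: the dataclass and the parser.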

## 📄 License

This project is licensed under the MIT License. See the LICENSE file for details.

## 🙏 Acknowledgments

- Original TURRET paper authors for the innovative research
- MuJoCo team for the physics simulation environment
- PyTorch team for the deep learning framework
- HuggingFace for pre-trained model hosting

**Note**: This is a replication project for research purposes. Performance may vary with hardware and the specific experimental setup.
