A reinforcement learning environment for optimizing turbofan engine maintenance. Students learn to balance engine longevity with maintenance costs using a standard gym interface.
- 06/03: IMPORTANT: fixed the leaderboard path. Previously we passed the wrong url, sorry for this! Now it will work (see below)
- 01/03: Added max usage limit at 100k step() calls (largely sufficient). If you need more, please reach out
- 25/02: added score, stabilized features, increased speed
- 18/02: updated vectorized version and some function abstraction (v0.3)
- 13/02: fixed some token access and np format in single_trajectory example
Here are the two baselines you should try to beat if possible. The ranking is done on the Total Reward over the last 100 computed episodes, but you are encouraged to discuss the other metrics in your report.
| Name | Total Reward | Average Reward | Best Reward | Episodes | # Maintenances | Avg. Length | Failure Rate |
|---|---|---|---|---|---|---|---|
| Baseline 1 | 395144.11 | 3951.44 | 4410.57 | 100 | 696 | 320.2 | 5.0% |
| Baseline 2 | 311106.51 | 3111.07 | 5106.13 | 100 | 487 | 298.1 | 47.0% |
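Since the ranking uses the Total Reward over the last 100 episodes, it can be useful to recompute the table's metrics locally from your own episode logs. A minimal sketch, where the record field names (`reward`, `length`, `maintenances`, `failed`) are assumptions about how you log episodes, not part of the environment's API:

```python
# Sketch: recompute leaderboard-style metrics from local episode logs.
# Two fake episode records are used here for illustration only.
episodes = [
    {"reward": 4100.0, "length": 315, "maintenances": 7, "failed": False},
    {"reward": 3800.0, "length": 322, "maintenances": 6, "failed": True},
]

last = episodes[-100:]  # the ranking is computed over the last 100 episodes
total_reward = sum(e["reward"] for e in last)
avg_reward = total_reward / len(last)
best_reward = max(e["reward"] for e in last)
avg_length = sum(e["length"] for e in last) / len(last)
failure_rate = sum(e["failed"] for e in last) / len(last)

print(f"Total={total_reward:.2f}  Avg={avg_reward:.2f}  Fail={failure_rate:.0%}")
```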
- Returned observations: an array of shape [10, 9] per step, containing 9 sensor measurements for each of 10 consecutive flights
- Goal: Learn optimal maintenance timing based on degradation patterns across multiple flights
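Because each observation covers 10 flights, a policy can look at degradation *trends* rather than single readings. A minimal sketch of per-sensor feature extraction from one [10, 9] observation block (the choice of statistics is an assumption, not part of the environment):

```python
import numpy as np

def summarize_observation(obs_block):
    """Reduce a [10, 9] observation (10 flights x 9 sensors) to
    per-sensor statistics a policy can threshold on."""
    obs = np.asarray(obs_block, dtype=float)  # shape (10, 9)
    return {
        "mean": obs.mean(axis=0),   # average reading per sensor
        "trend": obs[-1] - obs[0],  # drift across the 10 flights
        "latest": obs[-1],          # most recent flight
    }

# Demo on synthetic data with a steady upward drift.
demo = np.linspace(0.0, 1.0, 90).reshape(10, 9)
stats = summarize_observation(demo)
print(stats["trend"].shape)  # (9,)
```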
Action Space (Discrete):
- 0: Do nothing (continue normal operation)
- 1: Repair (reduce engine degradation, incurs maintenance cost)
- 2: Sell (terminate episode, receive sale reward based on remaining engine life)
Objective: Maximize total reward by:
- Prolonging engine lifetime through strategic repairs
- Avoiding catastrophic failures
- Selling at optimal time for maximum profit
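As a starting point beyond random actions, the trade-off above can be captured by a simple threshold policy over the three discrete actions. This is only an illustrative sketch: the degradation proxy (mean sensor reading) and the threshold values are assumptions, not properties of the environment.

```python
import numpy as np

DO_NOTHING, REPAIR, SELL = 0, 1, 2  # matches the discrete action space

def threshold_policy(obs_block, repair_at=0.7, sell_at=0.9):
    """Repair when a degradation proxy crosses one threshold,
    sell when it crosses a higher one, otherwise do nothing."""
    obs = np.asarray(obs_block, dtype=float)
    # Assumption: higher mean sensor readings indicate more degradation.
    degradation = obs.mean()
    if degradation >= sell_at:
        return SELL
    if degradation >= repair_at:
        return REPAIR
    return DO_NOTHING

print(threshold_policy(np.full((10, 9), 0.5)))   # 0
print(threshold_policy(np.full((10, 9), 0.8)))   # 1
print(threshold_policy(np.full((10, 9), 0.95)))  # 2
```

Tuning (or learning) where those thresholds sit, per sensor rather than globally, is one way to start beating the baselines.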
Create a .env file in your project root for automatic configuration:
```
# Server configuration
SERVER_URL=http://rlchallenge.orailix.com
USER_TOKEN=your_student_token
```
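If you want to read these values yourself (e.g. in a script that does not go through the client), a dependency-free `.env` parser is a few lines; this is only a sketch, and the client may well load the file for you already:

```python
import os
import tempfile

def load_env_file(path=".env"):
    """Minimal .env parser: skips blanks and comments, splits on the
    first '='. A sketch only; python-dotenv is the usual alternative."""
    values = {}
    with open(path) as fh:
        for raw in fh:
            line = raw.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            values[key.strip()] = value.strip()
    return values

# Demo on a temporary file with the same two keys as above.
with tempfile.NamedTemporaryFile("w", suffix=".env", delete=False) as tmp:
    tmp.write("# Server configuration\n")
    tmp.write("SERVER_URL=http://rlchallenge.orailix.com\n")
    tmp.write("USER_TOKEN=your_student_token\n")

config = load_env_file(tmp.name)
print(config["SERVER_URL"])
os.unlink(tmp.name)
```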
```python
from student_client import create_student_gym_env

user_token = 'your_token'

# Automatically loads from .env file
env = create_student_gym_env(user_token=user_token)

# Reset environment
obs, info = env.reset()

# Run episode
for step in range(100):
    # Choose action based on your policy
    action = env.action_space.sample()

    # Get next 10 flight observations
    obs_list, reward, terminated, truncated, info = env.step(
        action=action
    )

    if terminated or truncated:
        break

# Clean up
env.close()

# Get your challenge score
from student_client import get_leaderboard_score
df = get_leaderboard_score(user_token=user_token)
print(df)
```

/!\ Versioning: when initializing your environment, you will be shown the version you are running and, if necessary, prompted to update your environment to the latest version.
See example/simple_example.py for a complete working example with:
- Environment setup
- Action policy implementation
- Data collection and analysis
- Visualization of results
For detailed trajectory analysis, see example/single_trajectory.ipynb which includes:
- Step-by-step environment interaction
- Observation analysis
- Action timing visualization
- Reward optimization strategies
```bash
# Clone repository
git clone <repository-url>
cd challenge_CSC_52081_EP

# Install dependencies
pip install -r requirements.txt
pip install -e .
```

Each observation contains 9 sensor measurements:
- HPC_Tout: High Pressure Compressor Temperature Outlet
- HP_Nmech: High Pressure Shaft Mechanical Speed
- HPC_Tin: High Pressure Compressor Temperature Inlet
- LPT_Tin: Low Pressure Turbine Temperature Inlet
- Fuel_flow: Fuel Flow Rate
- HPC_Pout_st: High Pressure Compressor Pressure Outlet
- LP_Nmech: Low Pressure Shaft Mechanical Speed
- phase_type: Flight Phase Type
- DTAMB: Temperature Difference
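With the sensor names listed above, it can help to attach labels to the raw [10, 9] arrays so analysis code is readable. A sketch, where the column ordering is an assumption (check the example scripts for the authoritative order):

```python
import numpy as np

# Assumed column order; verify against the example code before relying on it.
SENSOR_NAMES = [
    "HPC_Tout", "HP_Nmech", "HPC_Tin", "LPT_Tin", "Fuel_flow",
    "HPC_Pout_st", "LP_Nmech", "phase_type", "DTAMB",
]

def observation_to_records(obs_block):
    """Turn a [10, 9] observation into one dict of named readings
    per flight."""
    obs = np.asarray(obs_block, dtype=float)
    return [dict(zip(SENSOR_NAMES, flight)) for flight in obs]

records = observation_to_records(np.zeros((10, 9)))
print(len(records))  # 10, one record per flight
```

The same list of names also works directly as `pandas.DataFrame(obs, columns=SENSOR_NAMES)` if you prefer dataframes.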
- Start with simple_example.py to understand the basic interaction pattern
- Use single_trajectory.ipynb for in-depth analysis of engine degradation
- Implement your own policies in the same structure
- Focus on action timing relative to degradation patterns
For technical issues, contact thil (at) lix (dot) polytechnique (dot) fr
Version: 1.0.0
Focus: Turbofan engine maintenance optimization