A Machine Learning Application that forecasts hour-ahead bike rental demand across an entire city, enabling dynamic pricing optimization and revenue maximization.
This Real-Time Bike Rental Demand Forecasting System solves a critical business problem for bike-sharing platforms operating across entire cities. Using a CatBoost regression model with Optuna-tuned hyperparameters, I reduced the forecasting error (MAE) by 51% compared to the baseline. The system provides hour-ahead demand predictions that guide dynamic pricing, helping you optimize revenue and improve fleet utilization.
Challenge: Bike-sharing platforms need to optimize dynamic pricing based on predicted demand to:
- Maximize Revenue: Increase prices during high-demand surges
- Stimulate Demand: Lower prices during slow periods to attract riders
- Optimize Fleet: Better plan supply-demand balance and bike distribution
Key Findings & Data Observations:
- Cyclical Demand Patterns: The target shows strong daily and seasonal cycles, specifically two sharp peaks at 7 AM and 5 PM.
- Weather & Temperature Sensitivity: There is a clear positive correlation between temperature and rentals, while demand drops sharply during light snow/rain. Clear weather yields the highest rentals.
- Feature Predictability: temp, season, weather, and workingday emerged as the strongest predictors. In contrast, windspeed showed discrete behavior and only a weak correlation with rentals.
Solution: A real-time ML system that forecasts city-wide bike rental demand for the next hour, enabling:
- Dynamic Pricing Optimization: Real-time price adjustments based on demand forecasts
- Incentive Campaigns: Targeted promotions during predicted low-demand periods
- Operational Planning: Improved fleet utilization and supply-demand balance
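To make the pricing idea concrete, here is a minimal sketch of how an hour-ahead forecast could be turned into a price multiplier. The function name, the thresholds, and the cap are all illustrative assumptions, not the rules used by this project.

```python
def price_multiplier(forecast: float, typical: float,
                     surge_ratio: float = 1.2, slow_ratio: float = 0.8,
                     max_change: float = 0.25) -> float:
    """Map a demand forecast to a price multiplier (hypothetical rule).

    forecast: predicted rentals for the next hour
    typical:  reference demand for that hour (e.g. a historical average)
    All thresholds are assumptions chosen for illustration only.
    """
    if typical <= 0:
        return 1.0  # no reference level, leave the price unchanged
    ratio = forecast / typical
    if ratio >= surge_ratio:
        # Predicted surge: raise the price, capped at +max_change
        return 1.0 + min(max_change, ratio - 1.0)
    if ratio <= slow_ratio:
        # Predicted slow period: discount the price, capped at -max_change
        return 1.0 - min(max_change, 1.0 - ratio)
    return 1.0  # demand close to typical: keep the base price
```

Capping the change keeps prices within a predictable band even when a forecast is far from typical demand.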
Live Demo: LIVE APP

The architecture of the system is shown below:
The system implements a machine learning pipeline with five distinct stages (preprocessing, feature engineering, training, inference, and postprocessing):
- Column Renaming: Configurable column mapping for data standardization
- Column Dropping: Removal of unnecessary columns based on configuration
- Data Reset: Index reset for clean data structure
- Configuration-Driven: All preprocessing steps controlled via config parameters
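The preprocessing steps listed above can be sketched with pandas as follows. The config keys (`rename_columns`, `drop_columns`) and the column names `cnt`, `count`, and `instant` are hypothetical; the real configuration files may use different names.

```python
import pandas as pd


def preprocess(df: pd.DataFrame, config: dict) -> pd.DataFrame:
    """Rename columns, drop columns, and reset the index, all driven by config."""
    # Column renaming: mapping of old name -> new name (assumed config key)
    out = df.rename(columns=config.get("rename_columns", {}))
    # Column dropping: ignore names that are not present in the data
    out = out.drop(columns=config.get("drop_columns", []), errors="ignore")
    # Data reset: replace the index with a clean 0..n-1 range
    return out.reset_index(drop=True)


# Hypothetical configuration and input data
config = {"rename_columns": {"cnt": "count"}, "drop_columns": ["instant"]}
raw = pd.DataFrame(
    {"instant": [1, 2, 3], "temp": [9.8, 9.0, 9.0], "cnt": [16, 40, 32]},
    index=[10, 11, 12],
)
clean = preprocess(raw, config)
```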
- Lag Features: Creation of time-lagged features for time series forecasting
- Configurable Lags: Multiple lag periods (e.g., 1, 2, 3, 5, 10 periods) for different features
- Backward Fill: Handling of missing values in lag features using bfill method
- Feature Naming: Automatic naming convention: {feature}_lag_{period}
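As an illustration of the lag features described above, the sketch below builds `{feature}_lag_{period}` columns with `shift` and fills the leading gaps with `bfill`. The function name and the shape of the `lags` argument are assumptions; only the naming convention and the backward fill come from the description.

```python
import pandas as pd


def add_lag_features(df: pd.DataFrame, lags: dict[str, list[int]]) -> pd.DataFrame:
    """Add {feature}_lag_{period} columns and backward-fill their leading NaNs."""
    out = df.copy()
    lag_cols = []
    for feature, periods in lags.items():
        for period in periods:
            name = f"{feature}_lag_{period}"
            # shift(period) moves values down, so row t holds the value from t - period
            out[name] = out[feature].shift(period)
            lag_cols.append(name)
    # The first `period` rows have no history; bfill copies the first known value back
    out[lag_cols] = out[lag_cols].bfill()
    return out


hourly = pd.DataFrame({"temp": [10.0, 11.0, 12.0, 13.0]})
with_lags = add_lag_features(hourly, {"temp": [1, 2]})
```

Note that the backward fill only affects the earliest rows of the series, where no earlier observation exists.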
- Target Creation: Shifted target variable for forecasting (configurable shift period)
- Time-Series Split: Train-test split without shuffling to preserve temporal order
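A minimal sketch of the two steps above, assuming a pandas DataFrame that is already sorted by time. The column name `count`, the `target` column name, and the choice to drop rows without a future value are assumptions made for the example.

```python
import pandas as pd


def make_supervised(df: pd.DataFrame, target_col: str, shift_period: int = 1) -> pd.DataFrame:
    """Create a forecasting target: the value of target_col shift_period steps ahead."""
    out = df.copy()
    # shift(-k) pulls future values up, so row t is paired with the value at t + k
    out["target"] = out[target_col].shift(-shift_period)
    # The last k rows have no future value yet; dropping them is an assumption here
    return out.dropna(subset=["target"]).reset_index(drop=True)


def time_series_split(df: pd.DataFrame, test_fraction: float = 0.2):
    """Split chronologically: earliest rows for training, latest rows for testing."""
    cut = int(len(df) * (1 - test_fraction))
    return df.iloc[:cut], df.iloc[cut:]


history = pd.DataFrame({"count": range(10)})
supervised = make_supervised(history, "count", shift_period=1)
train, test = time_series_split(supervised, test_fraction=0.2)
```

Because the rows are never shuffled, every test row is later in time than every training row, which mirrors how the model is used in production.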
- CatBoost Model: Gradient boosting regressor with configurable parameters
- Optuna Hyperparameter Tuning: Automated optimization of learning_rate, depth, and l2_leaf_reg
- Early Stopping: Prevents overfitting with configurable early stopping rounds
- Time-Based Validation: Manual validation split within training data
- Model Loading: Dynamic model loading from specified path
- Single Prediction: Returns only the last prediction value for real-time forecasting
- Feature Preparation: Input DataFrame processing for model inference
- Timestamp Integration: Current timestamp handling for prediction tracking
- Model Persistence: Automated model saving after training
- Prediction Formatting: Single-row DataFrame with timestamp and prediction columns
- Time Increment: Configurable time increment for prediction timestamps
- Data Structure: Standardized output format for downstream processing
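The output format described above might be produced along these lines. The column names `timestamp` and `prediction` follow the description; the function name and the one-hour default increment are assumptions.

```python
from datetime import datetime

import pandas as pd


def format_prediction(prediction: float, current_time: datetime,
                      increment: pd.Timedelta = pd.Timedelta(hours=1)) -> pd.DataFrame:
    """Build a single-row DataFrame holding the forecast and the time it refers to."""
    # The forecast applies to the period after the current one, so shift the timestamp
    target_time = pd.Timestamp(current_time) + increment
    return pd.DataFrame({"timestamp": [target_time], "prediction": [prediction]})


row = format_prediction(123.4, datetime(2024, 1, 1, 8, 0))
```

A fixed two-column layout makes it straightforward for the dashboard and the stored prediction files to share one schema.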
- app-ml/train.py: Model training entrypoint, both locally and in production
- app-ml/inference.py: Entrypoint to run the inference pipeline locally
- app-ml/inference-api.py: API for inference in production / on the web app
- app-ui/app.py: Interactive dashboard for demand forecasting monitoring
# Clone the repository
git clone https://github.com/DelphinKdl/Demand-Forecasting.git
cd Demand-Forecasting
# Deploy all services with production configuration
docker-compose up --build
# Verify service health
docker-compose ps
docker-compose logs -f
# Access the application
# UI Dashboard: http://localhost:8050
# Inference API: http://localhost:5001/health
Expected Output:
app-ml-train Up
app-ml-inference-api Up
app-ui Up
# Clone the repository
git clone https://github.com/DelphinKdl/Demand-Forecasting.git
cd Demand-Forecasting
# Create and activate conda environment
conda env create -f environment.yml
conda activate bike-sharing
# Train the model first (if not already trained)
python app-ml/entrypoint/train.py
# Run inference in a loop
python app-ml/entrypoint/inference.py
# Start the inference API to link to the application UI
python app-ml/entrypoint/inference_api.py
# Start the UI dashboard in another terminal
cd app-ui
python app.py
Access the application:
- Live application: LIVE APP
- UI Dashboard: http://localhost:8050
- Inference API: http://localhost:5001
Demand-Forecasting/
├── 📁 app-ml/ # Demand Forecasting Engine
│   ├── 📁 entrypoint/ # Production ML Services
│   │   ├── prod_train.py # Demand model training pipeline
│   │   ├── prod_inference.py # Batch demand prediction service
│   │   └── inference_api.py # Real-time demand prediction API
│   ├── 📁 notebooks/ # Data Science & Analysis
│   │   ├── EDA.ipynb # Demand pattern analysis
│   │   └── Modeling.ipynb # Demand forecasting model development
│   ├── 📁 src/ # Core Forecasting Pipeline
│   │   ├── 📁 pipelines/ # Modular demand forecasting components
│   │   │   ├── preprocessing.py # Rental data preprocessing
│   │   │   ├── feature_engineering.py # Weather & temporal feature creation
│   │   │   ├── training.py # Demand model training pipeline
│   │   │   ├── inference.py # Real-time demand prediction
│   │   │   └── postprocessing.py # Pricing optimization logic
│   │   └── utils.py # Forecasting utilities & helpers
│   ├── Dockerfile # ML service containerization
│   └── requirements.txt # ML dependencies
├── 📁 app-ui/ # Dynamic Pricing Dashboard
│   ├── app.py # Main pricing dashboard application
│   ├── assets/ # Dashboard styling & assets
│   ├── Dockerfile # UI service containerization
│   └── requirements.txt # UI dependencies
├── 📁 common/ # Shared Business Logic
│   ├── data_manager.py # Rental data management & persistence
│   └── utils.py # Common utilities & pricing helpers
├── 📁 config/ # Configuration Management
│   ├── local.yaml # Development configuration
│   ├── staging.yaml # Staging environment config
│   └── production.yaml # Production environment config
├── 📁 data/ # Bike Rental Data Lake
│   ├── 📁 raw_data/ # Raw rental & weather data
│   │   ├── csv/ # Historical rental data (CSV)
│   │   └── parquet/ # Optimized rental data (Parquet)
│   └── 📁 prod_data/ # Processed data & predictions
│       ├── csv/ # Demand predictions (CSV)
│       └── parquet/ # Demand predictions (Parquet)
├── 📁 models/ # Demand Forecasting Models
│   ├── 📁 experiments/ # Model experimentation & A/B testing
│   └── 📁 prod/ # Production demand forecasting models
├── 📁 images/ # Documentation & Visualizations
├── docker-compose.yml # Multi-service orchestration
├── environment.yml # Conda environment specification
└── README.md # Project documentation
This project is licensed under a custom Personal Use License.
You are free to:
- Use the code for personal or educational purposes
- Publish your own fork or modified version on GitHub with attribution
You are not allowed to:
- Use this code or its derivatives for commercial purposes
- Resell or redistribute the code as your own product
- Remove or change the license or attribution
For any use beyond personal or educational purposes, please contact the author for written permission.

