This repository contains the implementation and experimental code for the research paper:
Deep Learning Based Forecasting of COVID-19 Hospitalisation in England: A Comparative Analysis
Michael Ajao-olarinoye, Vasile Palade, Seyed Mousavi, Fei He, and Petra A. Wark
2023 International Conference on Machine Learning and Applications (ICMLA), pp. 1344-1349
Jacksonville, FL, USA | December 15-17, 2023
In the midst of the COVID-19 pandemic, it was essential to accurately forecast the demand for hospitalisation resources to achieve an effective allocation of healthcare resources. This paper explores the potential of various Deep Learning (DL) models, namely basic Recurrent Neural Networks (RNNs), Long Short-Term Memory networks (LSTMs), Gated Recurrent Units (GRU), Bidirectional RNNs, and Sequence-to-Sequence architectures with the inclusion of attention mechanisms, to forecast the demand for hospitalisation resources (mechanical ventilators) in England during the COVID-19 pandemic. The implementation of simulated annealing (SA) as a hyperparameter tuning method produced certain model structures and good results in terms of prediction accuracy. Our findings show that the LSTM-based models (LSTM_SA) achieved the lowest mean absolute error (MAE), outperforming the other architectures used in this study. The results of this study show the potential of DL models to forecast the demand for resources and could help inform the distribution of hospitalisation resources in England during the COVID-19 pandemic.
- Recurrent Neural Networks (RNN): Vanilla RNN, LSTM, GRU
- Bidirectional Variants: BiLSTM, BiGRU
- Attention Mechanisms: Dot Product, General, Additive, and Concat Attention
- Sequence-to-Sequence Models: Encoder-Decoder architectures with attention
Our experiments demonstrate that LSTM models optimized with Simulated Annealing (LSTM_SA) achieved the lowest Mean Absolute Error (MAE), outperforming other architectures for forecasting ventilator bed occupancy and providing valuable insights for healthcare resource management.
- Multiple Deep Learning Architectures: Implementation of RNN, LSTM, GRU, and their bidirectional variants
- Attention Mechanisms: Four types of attention (Dot Product, General, Additive, Concat)
- Multi-horizon Forecasting: Support for 1-day to 14-day ahead predictions
- Comprehensive Evaluation: MAE, RMSE, MAPE metrics across multiple forecast horizons
- NHS England Data Pipeline: Complete data preprocessing for UK COVID-19 healthcare data
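The evaluation metrics listed above are standard; as a reference, here is a minimal standard-library sketch of how MAE, RMSE, and MAPE are computed (the function names are illustrative, not the repository's API):

```python
import math

def mae(actual, predicted):
    # Mean Absolute Error: average magnitude of the errors
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    # Root Mean Squared Error: penalises large errors more heavily
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mape(actual, predicted):
    # Mean Absolute Percentage Error: scale-free, undefined if any actual is 0
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

y_true = [100.0, 120.0, 90.0]   # e.g. observed ventilator bed occupancy
y_pred = [110.0, 115.0, 95.0]   # e.g. model forecasts
print(round(mae(y_true, y_pred), 2))   # 6.67
print(round(rmse(y_true, y_pred), 2))  # 7.07
```

In multi-horizon evaluation, these metrics are computed separately for each forecast horizon (1 to 14 days ahead).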
The data preprocessing pipeline can be run from the notebooks:
1. **Data Collection**: Run `Data_preprocess.ipynb` to:
   - Fetch COVID-19 data from the NHS England API
   - Merge hospitalization, case, and vaccination data
   - Create the final preprocessed dataset
2. **Exploratory Analysis**: Use `00_Exploratory_Analysis.ipynb` for:
   - Data visualization and statistics
   - Time series analysis (trend, seasonality, stationarity)
   - Feature correlation analysis
The main experiments are organized in numbered notebooks:
```python
# Example: Training an LSTM model for ventilator bed occupancy forecasting
from src.dl.multivariate_models import SingleStepRNNConfig, SingleStepRNNModel
from src.dl.dataloaders import TimeSeriesDataModule

# Configure the model
config = SingleStepRNNConfig(
    rnn_type="LSTM",
    input_size=10,       # Number of input features
    hidden_size=64,      # Hidden layer size
    num_layers=2,        # Number of RNN layers
    bidirectional=True,  # Use bidirectional LSTM
    learning_rate=1e-3,
)

# Create the model; batches come from a TimeSeriesDataModule during training
model = SingleStepRNNModel(config)
```
1. **Baseline Comparison** (`01_Baseline_Comparison.ipynb`):
   - Compare RNN, LSTM, and GRU models
   - Evaluate bidirectional variants
   - Multi-horizon forecasting evaluation
2. **Novel Models** (`02_Novel_Model_Implementation.ipynb`):
   - Attention-enhanced models
   - Seq2Seq architectures
3. **Hyperparameter Tuning** (`notebooks/experiment1.ipynb`):
   - Simulated Annealing (SA) as the primary hyperparameter optimization method
   - Temperature-based acceptance probability with a cooling schedule
   - Hyperparameter search space:
     - `rnn_type`: RNN, GRU, LSTM
     - `hidden_size`: 32-128
     - `num_layers`: 5-30
     - `bidirectional`: True/False
   - Models tuned with SA are denoted "LSTM_SA", "GRU_SA", etc.
The datasets used in this study were collected from multiple official sources:
| Source | Data Type | Description |
|---|---|---|
| NHS England | Hospital Activity | COVID-19 hospital admissions, bed occupancy, ventilator usage |
| UK Coronavirus Dashboard | Case Data | Daily confirmed cases, deaths, testing data |
| ONS (Office for National Statistics) | Demographics | Population data by region and local authority |
| UK Government Vaccination Data | Vaccination | Vaccination rates and coverage by region |
| Google COVID-19 Community Mobility Reports | Mobility | Regional mobility trends during the pandemic |
The study uses COVID-19 healthcare data from NHS England, including:
| Feature | Description |
|---|---|
| `covidOccupiedMVBeds` | **Target variable**: COVID-19 patients on mechanical ventilators |
| `hospitalCases` | Total COVID-19 hospital admissions |
| `newAdmissions` | Daily new hospital admissions |
| `new_confirmed` | Daily confirmed COVID-19 cases |
| `cumAdmissions` | Cumulative hospital admissions |
| `Vax_index` | Vaccination coverage index |
- Training: April 2020 - December 2021
- Validation: January 2022 - March 2022
- Testing: April 2022 - July 2022
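This chronological split can be reproduced with the standard library alone; a minimal sketch, where the `split_by_date` helper is hypothetical rather than part of the repository:

```python
from datetime import date

def split_by_date(rows, val_start, test_start):
    """Chronologically partition (date, value) rows into train/val/test."""
    train = [r for r in rows if r[0] < val_start]
    val = [r for r in rows if val_start <= r[0] < test_start]
    test = [r for r in rows if r[0] >= test_start]
    return train, val, test

# One sample row per split period, matching the boundaries above
rows = [(date(2021, 12, 30), 410), (date(2022, 2, 1), 350), (date(2022, 5, 1), 120)]
train, val, test = split_by_date(rows, date(2022, 1, 1), date(2022, 4, 1))
print(len(train), len(val), len(test))  # 1 1 1
```

Splitting by date rather than by random sampling avoids leakage of future information into training, which matters for time series models.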
- Vanilla RNN: Basic recurrent neural network
- LSTM: Long Short-Term Memory networks
- GRU: Gated Recurrent Units
- BiLSTM/BiGRU: Bidirectional variants
- Attention-LSTM: LSTM with attention mechanisms
- Seq2Seq: Encoder-decoder architectures
- **Dot Product Attention**: $\text{score}(q, k) = q \cdot k$
- **Scaled Dot Product**: $\text{score}(q, k) = \frac{q \cdot k}{\sqrt{d_k}}$
- **General Attention**: $\text{score}(q, k) = q^T W k$
- **Additive Attention**: $\text{score}(q, k) = v^T \tanh(W_q q + W_k k)$
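For illustration, these score functions can be sketched in plain Python on small vectors; the helper names and the tiny weight matrices below are hypothetical stand-ins for the learned parameters, not the repository's implementation:

```python
import math

def dot_score(q, k):
    # Dot product attention: score = q . k
    return sum(qi * ki for qi, ki in zip(q, k))

def scaled_dot_score(q, k):
    # Scaled dot product: divide by sqrt(d_k) to keep scores well-conditioned
    return dot_score(q, k) / math.sqrt(len(k))

def general_score(q, W, k):
    # General attention: score = q^T W k, with W a learned matrix
    Wk = [sum(W[i][j] * k[j] for j in range(len(k))) for i in range(len(W))]
    return dot_score(q, Wk)

def additive_score(q, k, Wq, Wk, v):
    # Additive (Bahdanau) attention: score = v^T tanh(Wq q + Wk k)
    def matvec(M, x):
        return [sum(M[i][j] * x[j] for j in range(len(x))) for i in range(len(M))]
    h = [math.tanh(a + b) for a, b in zip(matvec(Wq, q), matvec(Wk, k))]
    return dot_score(v, h)

q, k = [1.0, 0.0], [0.5, 0.5]
print(dot_score(q, k))         # 0.5
print(scaled_dot_score(q, k))  # 0.5 / sqrt(2)
```

In a Seq2Seq model, these scores are computed between the decoder state (query) and every encoder state (keys), then normalised with a softmax to form the attention weights.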
As described in the published paper, we use Simulated Annealing (SA) for hyperparameter tuning. The SA algorithm is a probabilistic optimization technique inspired by the annealing process in metallurgy.
Algorithm Overview:
1. **Initialize**: Start with initial hyperparameters and temperature $T_0$
2. **Generate Neighbor**: Randomly perturb the current hyperparameters
3. **Evaluate**: Train the model and compute MAE on the validation set
4. **Accept/Reject**: Accept the new solution with probability:
   $$P(\text{accept}) = \begin{cases} 1 & \text{if } \Delta E < 0 \\ e^{-|\Delta E|/T} & \text{otherwise} \end{cases}$$
5. **Cool**: Reduce the temperature: $T_{n+1} = \alpha \cdot T_n$ (cooling rate $\alpha = 0.95$)
6. **Repeat** until convergence or the iteration limit is reached
Implementation Details:
- Initial temperature: $T_0 = 10$
- Cooling rate: $\alpha = 0.95$
- Maximum iterations: 100
- Early stopping: 5 iterations without improvement
- Objective function: Mean Absolute Error (MAE)
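The SA loop with these settings can be sketched using only the standard library; this toy version minimises a simple quadratic in place of the validation MAE, and the neighbour move is a simplified stand-in for the actual hyperparameter perturbation:

```python
import math
import random

def simulated_annealing(objective, x0, t0=10.0, alpha=0.95, max_iter=100, patience=5, seed=0):
    """Minimise `objective` starting from x0 with a geometric cooling schedule."""
    rng = random.Random(seed)
    current = best = x0
    current_cost = best_cost = objective(x0)
    t, stall = t0, 0
    for _ in range(max_iter):
        # Generate a neighbour by a small random perturbation
        candidate = current + rng.uniform(-1.0, 1.0)
        candidate_cost = objective(candidate)
        delta = candidate_cost - current_cost
        # Always accept improvements; accept worse moves with prob e^{-|dE|/T}
        if delta < 0 or rng.random() < math.exp(-abs(delta) / t):
            current, current_cost = candidate, candidate_cost
        if current_cost < best_cost:
            best, best_cost, stall = current, current_cost, 0
        else:
            stall += 1
            if stall >= patience:  # early stopping: no improvement for `patience` steps
                break
        t *= alpha  # geometric cooling: T_{n+1} = alpha * T_n
    return best, best_cost

# Toy objective standing in for the validation MAE; its minimum is at x = 3
best, cost = simulated_annealing(lambda x: (x - 3.0) ** 2, x0=0.0)
print(best, cost)
```

The high initial temperature makes early worse moves likely to be accepted, which lets the search escape poor local minima before the cooling schedule gradually turns it into a greedy search.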
The SA-optimized models (e.g., LSTM_SA) achieved the best forecasting performance, as reported in the paper.
If you use this code in your research, please cite our paper:
@INPROCEEDINGS{10459821,
author={Ajao-olarinoye, Michael and Palade, Vasile and Mousavi, Seyed and He, Fei and Wark, Petra A},
booktitle={2023 International Conference on Machine Learning and Applications (ICMLA)},
title={Deep Learning Based Forecasting of COVID-19 Hospitalisation in England: A Comparative Analysis},
year={2023},
pages={1344-1349},
keywords={COVID-19;Deep learning;Ventilators;Recurrent neural networks;Pandemics;Predictive models;Resource management;Deep learning;COVID-19;Hospitalisation forecasting;RNN;LSTM;GRU;Attention mechanism},
doi={10.1109/ICMLA58977.2023.00203}
}

The time series forecasting techniques and some of the code used in this project were largely informed by the following textbook:
@book{manu_modern_2022,
title={Modern Time Series Forecasting with Python: Explore Industry-Ready Time Series Forecasting Using Modern Machine Learning and Deep Learning},
author={Joseph, Manu},
year={2022},
edition={1st},
publisher={Packt Publishing},
isbn={978-1-80323-204-1}
}

This comprehensive guide covers essential topics including ARIMA baselines, feature engineering for time series, LSTM and transformer models, global forecasting paradigms, and multi-step forecasting strategies that were instrumental in developing this research.