This repository contains an implementation of a Deep Recurrent Neural Network (RNN) using PyTorch for predicting students' grades in their 5th semester based on their previous academic grades from the 1st to 4th semesters.
- Introduction to RNN and PyTorch
- What is Recurrent Neural Network (RNN)?
- What is PyTorch?
- Preparing the dataset
- Heatmap
- Data Preprocessing
- RNN Structure
- Result
A recurrent neural network (RNN) is a type of artificial neural network that is particularly well suited to sequential or time-series data. Unlike traditional feedforward networks, RNNs contain loops that allow information to persist and be passed from one step to the next, so they can model sequential patterns and dependencies such as those found in natural language, speech, handwriting, or sensor data. The key to this capability is the hidden state (or memory) inside the RNN's recurrent loops, which captures information about what has been computed previously. This hidden state acts as an internal memory that is updated at each time step as new inputs arrive, informed by its prior state. Through this recurrent structure and dynamic memory, RNNs can naturally process inputs of varying length.

While powerful, traditional RNNs can be difficult to train because of exploding and vanishing gradient problems. More sophisticated architectures such as LSTMs and GRUs have been developed to mitigate these issues.
Source: geeksforgeeks.org, amazon.com.
PyTorch is an open-source machine learning (ML) framework based on the Python programming language and the Torch library. Torch is an open-source ML library for building deep neural networks, written in the Lua scripting language. PyTorch is one of the preferred platforms for deep learning research, and the framework is built to shorten the path from research prototyping to deployment.
Source: techtarget.com, pytorch.org.
This repository uses a dataset of semester results of engineering students from a university. The data was obtained from Kaggle under the title "Semester Result of Technical Students". The dataset contains information about students' performance, including their roll numbers (converted to random numbers for privacy), college code, subjects taken, and the semester CGPA (Cumulative Grade Point Average) achieved. It represents a real-world scenario where predicting a student's future academic performance is challenging: according to the dataset provider, predicting the next semester's CGPA is a complex task, but analyzing a student's past performance can provide valuable insights for making such predictions.
The first step in preparing the dataset is to look for correlations between variables using a heatmap, in order to identify the variables that can serve as RNN inputs. Heatmaps are effective for visualizing large datasets because they represent data values with colors, letting data scientists and software engineers quickly spot patterns, trends, and variations. The implementation of the heatmap is shown in the following figure:
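A minimal sketch of building such a correlation heatmap with pandas and matplotlib. The column names `sem1`–`sem5` and the sample values are hypothetical placeholders, not the actual Kaggle columns; seaborn's `heatmap` is a common alternative to `imshow` here:

```python
import pandas as pd
import matplotlib
matplotlib.use("Agg")          # headless backend, no display needed
import matplotlib.pyplot as plt

# Hypothetical column names; the Kaggle file may label semesters differently.
df = pd.DataFrame({
    "sem1": [7.1, 8.0, 6.5, 9.0, 7.8],
    "sem2": [7.3, 8.2, 6.4, 8.8, 7.9],
    "sem3": [7.0, 8.1, 6.8, 9.1, 7.7],
    "sem4": [7.4, 8.3, 6.6, 8.9, 8.0],
    "sem5": [7.2, 8.4, 6.7, 9.0, 7.9],
})

corr = df.corr()               # pairwise Pearson correlations between semesters

fig, ax = plt.subplots()
im = ax.imshow(corr, cmap="coolwarm", vmin=-1, vmax=1)
ax.set_xticks(range(len(corr)), corr.columns)
ax.set_yticks(range(len(corr)), corr.columns)
fig.colorbar(im)
ax.set_title("Correlation between semester CGPAs")
fig.savefig("heatmap.png")
```

Cells near +1 (warm colors) mark variable pairs that move together, which is what singles out the semester grades as candidate input features.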
In this repository, I use an RNN to predict semester 5 grades. As the heatmap shows, only the 1st- through 4th-semester academic grades are correlated with the semester 5 grades, so those grades serve as the input features for the RNN.
Data preprocessing for this dataset consists of removing the variables (columns) that are not needed and then cleaning the data of missing (NaN) values. Preprocessing yields 169 rows of clean data, as shown in the following figure:
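A sketch of these two preprocessing steps with pandas. The column names and the tiny inline frame are hypothetical stand-ins for the real dataset:

```python
import pandas as pd

# Hypothetical columns standing in for the real Kaggle dataset.
raw = pd.DataFrame({
    "roll_no": [101, 102, 103, 104],
    "college": ["A", "A", "B", "B"],
    "sem1": [7.1, 8.0, None, 9.0],
    "sem2": [7.3, 8.2, 6.4, 8.8],
    "sem3": [7.0, None, 6.8, 9.1],
    "sem4": [7.4, 8.3, 6.6, 8.9],
    "sem5": [7.2, 8.4, 6.7, None],
})

clean = (raw
         .drop(columns=["roll_no", "college"])  # drop columns not used as features
         .dropna()                              # remove rows with missing grades
         .reset_index(drop=True))
```

On the real data, the same `drop` / `dropna` chain is what leaves the 169 clean rows mentioned above.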
To build an RNN structure for predicting a CGPA score from the four previous CGPA scores, I would start by loading the data from an Excel file, preprocessing it into PyTorch tensors, and splitting it into train and test sets. Next, I would define the RNN model architecture with the following parameters:

- input size: 4 (number of features)
- hidden size: 64 (size of the hidden state in the RNN)
- output size: 1 (single output value)
- number of layers: 3 (stacked recurrent layers)
- dropout rate: 0.2 (for regularization)

The architecture consists of a recurrent layer that maps the input features to a hidden state, followed by a fully connected layer that outputs a single value. The model would be initialized with random weights, and the hidden state would be initialized with zeros.
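A sketch of that architecture as a PyTorch module, using the hyperparameters listed above. The class name `GradeRNN` and the single-time-step input shape are my own assumptions, not fixed by the repository:

```python
import torch
import torch.nn as nn

class GradeRNN(nn.Module):
    """RNN regressor sketch: recurrent layers followed by a linear head."""

    def __init__(self, input_size=4, hidden_size=64, output_size=1,
                 num_layers=3, dropout=0.2):
        super().__init__()
        self.rnn = nn.RNN(input_size, hidden_size, num_layers,
                          batch_first=True, dropout=dropout)
        self.fc = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        # x: (batch, seq_len, input_size); hidden state defaults to zeros
        out, _ = self.rnn(x)
        return self.fc(out[:, -1, :])   # predict from the last hidden state

model = GradeRNN()
dummy = torch.randn(8, 1, 4)   # batch of 8, one time step, 4 semester grades
pred = model(dummy)            # shape (8, 1): one predicted CGPA per student
```

Weights are randomly initialized by PyTorch's defaults, and `nn.RNN` initializes the hidden state to zeros when none is passed, matching the description above.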
Next, I would define the loss function and optimizer for training the RNN model: the mean squared error (MSE) loss and the Adam optimizer with a learning rate of 0.001 and weight decay for regularization. I would then train the model for 1000 epochs, updating the parameters using backpropagation and printing the training loss every 100 epochs. I would also evaluate the model on the test set and print the test loss.
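A self-contained sketch of that training loop. The synthetic data (each target is simply the mean of the four input grades) and the weight-decay value `1e-5` are placeholders for the real tensors and settings:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Synthetic stand-in data: 4 past CGPAs -> 5th-semester CGPA.
X_train = torch.rand(120, 1, 4) * 4 + 6        # (batch, seq_len, features)
y_train = X_train.mean(dim=(1, 2)).unsqueeze(1)
X_test, y_test = X_train[:20], y_train[:20]

rnn = nn.RNN(4, 64, num_layers=3, batch_first=True, dropout=0.2)
fc = nn.Linear(64, 1)

criterion = nn.MSELoss()
optimizer = torch.optim.Adam(list(rnn.parameters()) + list(fc.parameters()),
                             lr=0.001, weight_decay=1e-5)

for epoch in range(1, 1001):
    optimizer.zero_grad()
    out, _ = rnn(X_train)
    loss = criterion(fc(out[:, -1, :]), y_train)
    loss.backward()                 # backpropagation
    optimizer.step()
    if epoch % 100 == 0:
        print(f"epoch {epoch}: train loss {loss.item():.4f}")

with torch.no_grad():               # evaluation: no gradients needed
    out, _ = rnn(X_test)
    test_loss = criterion(fc(out[:, -1, :]), y_test).item()
print(f"test loss: {test_loss:.4f}")
```

In practice the test set would of course be disjoint from the training set; here it is sliced from the same synthetic tensor only to keep the sketch short.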
Finally, I would save the predicted scores and the testing data to Excel files for further analysis. The RNN model can be customized by adjusting the hyperparameters and input features for other sequential data regression tasks. For example, I could increase the number of recurrent layers or the hidden size to capture more complex patterns in the data, or I could add additional input features such as the student's attendance, grades, and behavior. I could also adjust the learning rate, dropout rate, or weight decay to improve the model's performance.
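The export step can be sketched as follows. The file names, column names, and the three example values are hypothetical; note that `to_excel` needs an engine such as `openpyxl`, so a CSV fallback is included:

```python
import pandas as pd

# Hypothetical predicted and actual 5th-semester CGPAs.
predicted = [7.9, 8.4, 6.9]
actual = [8.0, 8.3, 7.1]

results = pd.DataFrame({"actual_sem5": actual, "predicted_sem5": predicted})
results["error"] = results["predicted_sem5"] - results["actual_sem5"]

# to_excel requires an optional engine (e.g. openpyxl); fall back to CSV.
try:
    results.to_excel("predictions.xlsx", index=False)
except ImportError:
    results.to_csv("predictions.csv", index=False)
```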
The test loss of 0.2261 measures how well the trained RNN model predicts CGPA scores from the four previous CGPA scores. Here, the test loss is the difference between the predicted and actual CGPA scores in the test set, calculated using the mean squared error (MSE) loss function.
The MSE loss function measures the average squared difference between the predicted and actual CGPA scores and is a common metric for evaluating regression models. A test loss of 0.2261 therefore means the average squared difference between predicted and actual CGPA scores in the test set is 0.2261, corresponding to a root-mean-squared error of roughly 0.48 CGPA points.
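As a small illustration of how MSE is computed (with made-up numbers, not values from this dataset):

```python
# MSE: average of the squared prediction errors (hypothetical CGPAs).
pred = [7.9, 8.4, 6.9]
actual = [8.0, 8.3, 7.1]
mse = sum((p - a) ** 2 for p, a in zip(pred, actual)) / len(pred)
rmse = mse ** 0.5  # back in CGPA units
```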
Overall, the test loss of 0.2261, as calculated using the MSE loss function, suggests that the RNN model is a promising tool for predicting CGPA scores based on four previous CGPA scores. However, further refinement and customization of the model may be necessary to improve its accuracy and generalizability. It is also important to evaluate the model with other metrics, such as the mean absolute error or the R-squared value, to get a more complete picture of its performance. Additionally, the test loss should be interpreted in the context of the specific problem and dataset: because MSE is expressed in squared CGPA units, even a seemingly modest test loss can translate into noticeable errors in individual predictions. The result of the test loss can be seen in the following figure:



