This GitHub repo contains the code and data needed to classify the sentiment of a movie review as positive or negative. It builds a Radial Basis Function Support Vector Machine (RBF SVM), starting from 1003 candidate features:
- 500 unigrams.
- 500 bigrams.
- 3 sentiment scores.
Of these 1003 candidates, the 1000 most relevant features, as ranked by the Chi-squared test, are selected to build the model.
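The feature pipeline described above can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the code in model.py: the function name `build_pipeline`, its parameters, and the toy reviews are all hypothetical, and the 3 sentiment scores are left out because this README does not say how they are computed.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.pipeline import FeatureUnion, Pipeline
from sklearn.svm import SVC


def build_pipeline(n_unigrams=500, n_bigrams=500, k_best=1000):
    """Hypothetical helper: n-gram counts -> Chi-squared selection -> RBF SVM."""
    features = FeatureUnion([
        # Keep only the most frequent unigrams and bigrams.
        ("unigrams", CountVectorizer(ngram_range=(1, 1), max_features=n_unigrams)),
        ("bigrams", CountVectorizer(ngram_range=(2, 2), max_features=n_bigrams)),
    ])
    return Pipeline([
        ("features", features),
        # Rank features with the Chi-squared test and keep the top k_best.
        ("select", SelectKBest(chi2, k=k_best)),
        ("svm", SVC(kernel="rbf")),
    ])


# Toy data for illustration only (1 = positive, 0 = negative).
reviews = [
    "a wonderful film with great acting",
    "great story and wonderful cast",
    "a terrible film with awful acting",
    "awful story and terrible cast",
]
labels = [1, 1, 0, 0]

# Small limits so the sketch works on a four-review corpus.
model = build_pipeline(n_unigrams=20, n_bigrams=20, k_best=10)
model.fit(reviews, labels)
predictions = model.predict(reviews)
print(predictions)
```

On a real dataset the defaults (500, 500, 1000) mirror the numbers quoted above. The toy call uses smaller values because `k_best` should not exceed the number of features actually extracted.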
To run this machine learning model, you will need the following installed:
- Git
- Python (with the `sklearn`, `pandas`, `numpy` and `nltk` modules installed)
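One quick way to confirm the Python modules are available is a short check like the one below. It is only a convenience sketch and is not part of this repo; the name `REQUIRED_MODULES` is made up for the example. Note that `sklearn` is installed through the `scikit-learn` package.

```python
import importlib.util

# Module names as they are imported, taken from the list above.
REQUIRED_MODULES = ["sklearn", "pandas", "numpy", "nltk"]

# find_spec returns None when a top-level module cannot be located.
missing = [name for name in REQUIRED_MODULES
           if importlib.util.find_spec(name) is None]

if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules are installed.")
```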
By default, the model uses the dataset in the Data/IMDb/test/ folder. To run the model on different data, replace the data in this folder with the data you want the model to classify.
The model parameters can be changed by editing line 143 of model.py.
To run the model:
- Clone the repository to your local machine.
- Navigate to the repository in the terminal.
- Type `python model.py` and hit enter.