This project builds a sentiment analysis model for Twitter data in Python. The model classifies each tweet as positive or negative. The pipeline covers data extraction, preprocessing, vectorization, model training, evaluation, and saving the trained model for later use. This README explains the code and the technologies used, and gives instructions for setting up and running the project.
- Technologies and Skills Used
- Data Extraction
- Data Preprocessing
- Model Training and Evaluation
- Model Saving and Loading
- Sentiment Checker Function
- Resources

## Technologies and Skills Used

- Python: Programming language used for the entire project.
- Kaggle API: For downloading the dataset.
- Pandas: For data manipulation and analysis.
- NumPy: For numerical operations.
- NLTK: For natural language processing, including tokenization and stemming.
- Scikit-learn: For machine learning tasks, including data splitting, vectorization, model training, and evaluation.
- Logistic Regression: The machine learning algorithm used for sentiment classification.
- Pickle: For saving and loading the trained model.

## Data Extraction

- Setting up Kaggle Credentials: To access the Kaggle API, set up the Kaggle credentials and configure the path of `kaggle.json`.
- Extracting the CSV Dataset: Extract the CSV file from the downloaded zip file.
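
The two extraction steps can be sketched with the standard library alone. This is a minimal sketch, not the project's exact code: the helper names `install_kaggle_credentials` and `extract_csv` are hypothetical, and the download itself (done with the Kaggle client) is not shown. It assumes the Kaggle client's default credentials location, `~/.kaggle/kaggle.json`.

```python
import os
import shutil
import zipfile
from pathlib import Path


def install_kaggle_credentials(token_path, config_dir=None):
    """Copy kaggle.json into the Kaggle config directory (hypothetical helper)."""
    # ~/.kaggle is the default location the Kaggle client looks in.
    config_dir = Path(config_dir) if config_dir else Path.home() / ".kaggle"
    config_dir.mkdir(parents=True, exist_ok=True)
    target = config_dir / "kaggle.json"
    shutil.copyfile(token_path, target)
    # Owner-only permissions; the Kaggle client warns when others can read the token.
    os.chmod(target, 0o600)
    return target


def extract_csv(zip_path, dest_dir="."):
    """Extract only the .csv members of the archive and return their paths."""
    dest_dir = Path(dest_dir)
    with zipfile.ZipFile(zip_path) as archive:
        csv_members = [
            name for name in archive.namelist() if name.lower().endswith(".csv")
        ]
        archive.extractall(dest_dir, members=csv_members)
    return [dest_dir / name for name in csv_members]
```

Calling `install_kaggle_credentials("kaggle.json")` once and then `extract_csv(...)` on the downloaded archive would leave the CSV ready for loading; both paths are placeholders for wherever the files actually live.
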

## Data Preprocessing

- Loading Data: Load the data into a Pandas DataFrame.
- Label Adjustment: Replace the label 4 with 1 to standardize the target labels.
- Text Cleaning and Stemming: Clean, tokenize, remove stopwords, and stem the words to standardize the text.
- Data Splitting: Split the data into training and testing sets.
- Vectorization: Convert text data into numerical vectors using TfidfVectorizer.
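
The preprocessing steps above can be sketched on a toy DataFrame. Several things here are assumptions made for illustration: the column names `target` and `text`, the six example tweets, the split parameters, and the tiny inline stopword set, which stands in for NLTK's English list so the sketch runs without downloading corpora. Tokenization is plain whitespace splitting.

```python
import re

import pandas as pd
from nltk.stem.porter import PorterStemmer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split

# Stand-in stopword list (assumption). With NLTK data installed, one could use
# set(nltk.corpus.stopwords.words("english")) after nltk.download("stopwords").
STOP_WORDS = {"i", "am", "a", "an", "the", "is", "it", "this", "so", "my", "what"}
STEMMER = PorterStemmer()


def clean_and_stem(text, stop_words=STOP_WORDS):
    """Keep letters only, lowercase, drop stopwords, and stem each token."""
    letters_only = re.sub(r"[^a-zA-Z]", " ", text).lower()
    tokens = [w for w in letters_only.split() if w not in stop_words]
    return " ".join(STEMMER.stem(w) for w in tokens)


# Toy data; the real dataset is loaded from the extracted CSV with pd.read_csv.
df = pd.DataFrame({
    "target": [0, 4, 0, 4, 0, 4],
    "text": [
        "I hate this rainy Monday",
        "I am LOVING this sunny day!!!",
        "Worst service ever, so disappointed",
        "What a wonderful surprise, thank you",
        "My phone broke again, terrible luck",
        "Great game tonight, feeling happy",
    ],
})

# Label adjustment: 4 (positive) becomes 1, so the labels are 0 and 1.
df["target"] = df["target"].replace(4, 1)
df["stemmed"] = df["text"].apply(clean_and_stem)

# Stratify so both classes appear in the training and the testing set.
X_train, X_test, y_train, y_test = train_test_split(
    df["stemmed"], df["target"],
    test_size=0.33, stratify=df["target"], random_state=2,
)

# Fit the vectorizer on training text only, then reuse it for the test text.
vectorizer = TfidfVectorizer()
X_train_vec = vectorizer.fit_transform(X_train)
X_test_vec = vectorizer.transform(X_test)
```

Fitting `TfidfVectorizer` on the training split alone keeps vocabulary and document-frequency statistics from the test tweets out of the model's inputs.
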

## Model Training and Evaluation

- Model Training: Train the logistic regression model.
- Model Evaluation: Evaluate the model's accuracy on training and testing data.
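
Training and evaluation could look roughly like the following. It is a sketch under stated assumptions: the eight example texts and their labels are invented, and `max_iter=1000` and the split parameters are illustrative choices rather than the project's confirmed settings.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Toy, already-cleaned texts (assumption); 1 = positive, 0 = negative.
texts = [
    "love sunni day", "hate raini monday",
    "wonder surpris thank", "worst servic disappoint",
    "great game happi", "phone broke terribl luck",
    "best coffe morn", "miss train sad",
]
labels = [1, 0, 1, 0, 1, 0, 1, 0]

X_train_text, X_test_text, y_train, y_test = train_test_split(
    texts, labels, test_size=0.25, stratify=labels, random_state=2
)

vectorizer = TfidfVectorizer()
X_train = vectorizer.fit_transform(X_train_text)
X_test = vectorizer.transform(X_test_text)

# A generous iteration cap helps the solver converge on large sparse inputs.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Accuracy on both splits; a large gap between them suggests overfitting.
train_accuracy = accuracy_score(y_train, model.predict(X_train))
test_accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"train accuracy: {train_accuracy:.3f}")
print(f"test accuracy:  {test_accuracy:.3f}")
```

On a toy set this small the scores carry no meaning; the point is the shape of the workflow, where both accuracies are computed with the same fitted vectorizer and model.
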

## Model Saving and Loading

- Saving the Model: Save the trained model using Pickle.
- Loading the Model: Load the saved model for future use.
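
A round trip through Pickle might be sketched as follows. The file name `trained_model.pkl`, the temporary directory, and the toy training data are all assumptions for illustration.

```python
import pickle
import tempfile
from pathlib import Path

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Minimal stand-in for the trained model (toy data, assumption).
texts = ["love sunni day", "hate raini monday", "great game happi", "worst servic"]
labels = [1, 0, 1, 0]
vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(texts)
model = LogisticRegression(max_iter=1000).fit(X, labels)

# Hypothetical file name, written to a temp directory to keep the sketch tidy.
model_path = Path(tempfile.mkdtemp()) / "trained_model.pkl"

# Save: pickle files are binary, so open in "wb" mode.
with open(model_path, "wb") as f:
    pickle.dump(model, f)

# Load: open in "rb" mode. Only unpickle files from a trusted source.
with open(model_path, "rb") as f:
    loaded_model = pickle.load(f)

# The restored model should reproduce the original predictions.
same_predictions = (loaded_model.predict(X) == model.predict(X)).all()
print("predictions match:", same_predictions)
```

The loaded model only accepts vectors produced by the vectorizer it was trained with, so in practice the fitted `TfidfVectorizer` would need to be saved the same way or rebuilt from the training data.
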

## Sentiment Checker Function

A function to check the sentiment of a given tweet using the trained model.
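
One possible shape for such a function is sketched below. The name `check_sentiment`, its signature, the returned strings, and the toy training data are assumptions; the project's actual function may differ. The optional `preprocess` argument is where the same cleaning and stemming used during training would be plugged in.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression


def check_sentiment(tweet, model, vectorizer, preprocess=None):
    """Return "Positive" or "Negative" for one tweet (hypothetical helper)."""
    # Apply the training-time cleaning, if a preprocessing callable is given.
    text = preprocess(tweet) if preprocess else tweet
    # transform (not fit_transform): reuse the vocabulary learned in training.
    features = vectorizer.transform([text])
    prediction = model.predict(features)[0]
    return "Positive" if prediction == 1 else "Negative"


# Toy model so the sketch runs on its own (assumption); 1 = positive.
texts = [
    "love this sunny day", "hate this rainy monday",
    "great game tonight", "worst service ever",
    "wonderful surprise thank you", "terrible luck again",
]
labels = [1, 0, 1, 0, 1, 0]
vectorizer = TfidfVectorizer()
model = LogisticRegression(max_iter=1000).fit(vectorizer.fit_transform(texts), labels)

result = check_sentiment("what a great day", model, vectorizer)
print(result)
```

Passing the model and vectorizer as arguments, rather than reading globals, makes the function easy to reuse with a model loaded from disk.
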

## Resources

- YouTube Video: The tutorial followed to create this project.