This project is a sentiment analysis pipeline that classifies text as positive, negative, or neutral. It combines Natural Language Processing (NLP) preprocessing with machine learning models to predict the sentiment of the text it is given.

    📦 Sentiment-Analysis-Pipeline
    ├── app.py # Main application file
    ├── data_cleaning.py # Data preprocessing script
    ├── data_ingestion.py # Data loading and processing script
    ├── model.py # Model training and evaluation script
    ├── sentiment_analysis.iml # Project configuration file
    ├── requirements.txt # Python dependencies
    ├── sentiment_analysis.postman_collection.json # API collection for testing
    ├── vectorizer.pkl # Vectorizer file for text transformation
    └── README.md # Project documentation

- Preprocesses textual data (removes noise, tokenization, lemmatization)
- Supports multiple machine learning models (Logistic Regression, Naive Bayes, etc.)
- Uses TF-IDF vectorization for feature extraction
- Provides a REST API for sentiment analysis
- Outputs sentiment as Positive, Negative, or Neutral

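
The preprocessing and TF-IDF steps listed above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the project's own code: the project relies on NLTK and scikit-learn, the function names `clean_and_tokenize` and `tfidf` are hypothetical, and lemmatization is left out because it needs NLTK's WordNet data. The weighting uses the smoothed formula ln((1 + n) / (1 + df)) + 1 followed by L2 normalization, which is what scikit-learn documents as the default for `TfidfVectorizer`.

```python
import math
import re
from collections import Counter

# Hypothetical noise patterns; the real data_cleaning.py may use different rules.
TAG_RE = re.compile(r"<[^>]+>")               # HTML-like tags
URL_RE = re.compile(r"https?://\S+|www\.\S+")  # bare URLs
TOKEN_RE = re.compile(r"[a-z]+(?:'[a-z]+)?")   # words, with optional apostrophe part


def clean_and_tokenize(text):
    """Lowercase the text, strip tags and URLs, and return word tokens."""
    text = text.lower()
    text = TAG_RE.sub(" ", text)
    text = URL_RE.sub(" ", text)
    return TOKEN_RE.findall(text)


def tfidf(documents):
    """Return one {term: weight} dict per document, L2-normalized."""
    tokenized = [clean_and_tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        # Term count times smoothed IDF: ln((1 + n) / (1 + df)) + 1
        weights = {
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        }
        norm = math.sqrt(sum(w * w for w in weights.values()))
        vectors.append({t: w / norm for t, w in weights.items()} if norm else {})
    return vectors
```

For example, `clean_and_tokenize("I <b>LOVE</b> this product! http://example.com")` returns `['i', 'love', 'this', 'product']`. Terms that occur in fewer documents receive a larger IDF factor, so they weigh more in the resulting vectors than terms shared by every document.
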
- Clone the Repository

      git clone https://github.com/manu0312/Sentiment-Analysis-Pipeline.git
      cd Sentiment-Analysis-Pipeline

- Create a Virtual Environment (Optional but Recommended)

      python -m venv venv
      source venv/bin/activate # On macOS/Linux
      venv\Scripts\activate # On Windows

- Install Dependencies

      pip install -r requirements.txt

- Run the Pipeline (clean the data, train the model, then start the API)

      python data_cleaning.py
      python model.py
      python app.py

The API will be available at: http://127.0.0.1:5000

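
To give a feel for what a training step such as `python model.py` involves, here is a from-scratch sketch of multinomial Naive Bayes, one of the model families this project lists. It is not the project's implementation, which uses scikit-learn. The function names `train_naive_bayes` and `predict`, the dictionary layout of the model, and the four training samples are all hypothetical, and only two labels are shown for brevity.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(samples):
    """Fit a multinomial Naive Bayes model from (tokens, label) pairs."""
    label_counts = Counter()            # number of documents per label
    word_counts = defaultdict(Counter)  # per-label token counts
    vocabulary = set()
    for tokens, label in samples:
        label_counts[label] += 1
        word_counts[label].update(tokens)
        vocabulary.update(tokens)
    return {
        "label_counts": label_counts,
        "word_counts": word_counts,
        "vocabulary": vocabulary,
        "total": sum(label_counts.values()),
    }


def predict(model, tokens):
    """Return the label with the highest log-posterior for the tokens."""
    vocab_size = len(model["vocabulary"])
    best_label, best_score = None, -math.inf
    for label, count in model["label_counts"].items():
        score = math.log(count / model["total"])  # log prior
        counts = model["word_counts"][label]
        denominator = sum(counts.values()) + vocab_size
        for token in tokens:
            if token in model["vocabulary"]:  # skip words never seen in training
                # Laplace (add-one) smoothing avoids log(0) for unseen pairs.
                score += math.log((counts[token] + 1) / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data; a real run would train on the cleaned dataset.
TRAINING_SAMPLES = [
    (["i", "love", "this", "product"], "positive"),
    (["great", "quality", "love", "it"], "positive"),
    (["terrible", "waste", "of", "money"], "negative"),
    (["i", "hate", "this", "product"], "negative"),
]

model = train_naive_bayes(TRAINING_SAMPLES)
```

With this toy data, `predict(model, ["love", "it"])` returns `"positive"` and `predict(model, ["terrible", "product"])` returns `"negative"`. Supporting a neutral class only requires training samples that carry a third label.
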
Send a POST request to the API:

    curl -X POST http://127.0.0.1:5000/predict -H "Content-Type: application/json" -d '{"text": "I love this product!"}'

Response:

    {
      "sentiment": "positive"
    }

- Python
- Flask (for API development)
- Scikit-learn (for machine learning models)
- NLTK (for text preprocessing)
- Pandas & NumPy (for data manipulation)
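
The curl request shown earlier can also be sent from Python without extra dependencies. The sketch below assumes the API is running locally at the address given above and that the response carries a `sentiment` field, as in the example response. The helper names `build_request` and `predict_sentiment` are hypothetical and not part of the project.

```python
import json
import urllib.request

# Endpoint taken from the curl example; adjust if the app runs elsewhere.
API_URL = "http://127.0.0.1:5000/predict"


def build_request(text, url=API_URL):
    """Build the JSON POST request that the /predict endpoint expects."""
    payload = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict_sentiment(text, url=API_URL, timeout=10):
    """Send the text to the API and return the value of the 'sentiment' field."""
    request = build_request(text, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body["sentiment"]
```

With `python app.py` running, calling `predict_sentiment("I love this product!")` sends the same request as the curl command. If the server is not reachable, `urlopen` raises `urllib.error.URLError`.
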
Happy Coding!