This project is a Machine Learning-based Spam Message Classifier that predicts whether a given message is spam or not spam (ham).
The goal is to build a model that automatically detects spam messages using Natural Language Processing (NLP).
- Python
- Pandas
- Scikit-learn
- NLTK
The dataset used is the SMS Spam Collection, which contains messages labeled as spam or ham.
- Data preprocessing
- Text vectorization using CountVectorizer
- Model training using Multinomial Naive Bayes
- Model evaluation using accuracy score
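The steps above can be sketched roughly as follows. This is a minimal illustration, not the contents of model.py: the inline toy messages are hypothetical stand-ins for the SMS Spam Collection, preprocessing is limited to the lowercasing and tokenization that CountVectorizer performs by default (any NLTK-based cleaning is not shown), and the split ratio and random seed are assumptions.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical toy data; the real project loads the SMS Spam Collection instead.
messages = [
    "Win a free prize now",
    "Claim your cash reward today",
    "Free entry to win cash",
    "You have won a lottery prize",
    "Urgent claim your free voucher",
    "Congratulations you won free tickets",
    "Are you coming to the meeting",
    "See you at home tonight",
    "Can we meet for lunch tomorrow",
    "Please call me when you are free",
    "I will be late for dinner",
    "Thanks for the notes from class",
]
labels = ["spam"] * 6 + ["ham"] * 6

# Hold out part of the data for evaluation (ratio and seed are assumptions).
X_train, X_test, y_train, y_test = train_test_split(
    messages, labels, test_size=0.25, random_state=42, stratify=labels
)

# CountVectorizer turns text into word counts; MultinomialNB models those counts.
model = make_pipeline(CountVectorizer(lowercase=True), MultinomialNB())
model.fit(X_train, y_train)

# Accuracy is the fraction of held-out messages classified correctly.
predictions = model.predict(X_test)
accuracy = accuracy_score(y_test, predictions)
print(f"Accuracy: {accuracy:.2f}")
```

Wrapping the vectorizer and classifier in one pipeline keeps the vocabulary fitted on the training split only, so the held-out messages do not leak into training. The accuracy printed for this toy data says nothing about the result on the real dataset.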
The model achieved an accuracy of approximately 98%.
- Install dependencies: pip install pandas scikit-learn nltk
- Run the model: python model.py
Input: "You have won ₹5000!" Output: Spam
Input: "Let's meet tomorrow" Output: Not Spam
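One way a trained model could map an input message to the "Spam" / "Not Spam" output shown above is sketched here. The `classify` helper and the six training messages are hypothetical and exist only for illustration; the actual model.py may handle input and output differently.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical toy training data, standing in for the SMS Spam Collection.
train_messages = [
    "You have won a free prize",
    "Claim your cash reward now",
    "Free entry win cash today",
    "Let's meet for lunch tomorrow",
    "Are you coming to the meeting",
    "See you at home tonight",
]
train_labels = ["spam", "spam", "spam", "ham", "ham", "ham"]

model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(train_messages, train_labels)


def classify(message: str) -> str:
    """Return a human-readable label for a single message (hypothetical helper)."""
    label = model.predict([message])[0]
    return "Spam" if label == "spam" else "Not Spam"


for text in ["You have won ₹5000!", "Let's meet tomorrow"]:
    print(f'Input: "{text}" Output: {classify(text)}')
```

Words the vectorizer never saw during training (such as "5000" here) are ignored at prediction time, so the decision rests on the remaining known words.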
The project successfully classifies spam messages with high accuracy and demonstrates the use of NLP in real-world applications.

