This project applies machine learning to the well-known Titanic dataset. The goal is to predict which passengers survived the disaster from features such as age, passenger class, and gender.
Using Python and common data science libraries, this notebook walks through the full data science workflow:
- Data cleaning and preprocessing
- Exploratory data analysis (EDA)
- Feature engineering
- Model training and evaluation
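
As an illustration of the EDA step, here is a minimal sketch of two helpers: one reports missing values per column and the other computes the survival rate per group. The helper names `missing_summary` and `survival_rate_by` are hypothetical, and the `Survived` column name is assumed from the Kaggle `train.csv`; the notebook itself may do this differently.

```python
import pandas as pd


def missing_summary(df):
    """Return count and share of missing values per column, worst first."""
    counts = df.isna().sum()
    summary = pd.DataFrame({"missing": counts, "share": counts / len(df)})
    # Keep only the columns that actually have gaps.
    return summary[summary["missing"] > 0].sort_values("missing", ascending=False)


def survival_rate_by(df, column):
    """Return the mean of the assumed 'Survived' column within each group."""
    return df.groupby(column)["Survived"].mean().sort_values(ascending=False)
```

With the training data loaded via `df = pd.read_csv("train.csv")`, calling `missing_summary(df)` shows which columns need imputation, and `survival_rate_by(df, "Sex")` gives a first look at how survival differs between groups.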
- pandas for data manipulation
- matplotlib & seaborn for visualizations
- scikit-learn for building ML models
- Jupyter Notebook for code and documentation

- Titanic_Project.ipynb: Main notebook with full analysis and modeling
- train.csv: Training dataset (from Kaggle Titanic Challenge)
- test.csv: (Optional) Test dataset for final predictions
- README.md: Project documentation (this file)

- Handling missing values
- Label encoding and feature selection
- Logistic Regression, Decision Tree, Random Forest
- Model accuracy and confusion matrix
- Feature importance analysis
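
The steps above could be sketched roughly as follows. This is not the notebook's code: the feature list, the median/mode fill strategy, the 80/20 split, and the helper names `preprocess` and `train_and_evaluate` are all assumptions made for illustration, using the column names of the Kaggle `train.csv`.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from sklearn.tree import DecisionTreeClassifier

# Assumed feature columns from the Kaggle train.csv; the notebook may use others.
FEATURES = ["Pclass", "Sex", "Age", "SibSp", "Parch", "Fare", "Embarked"]


def preprocess(df):
    """Fill missing values and label-encode the categorical columns."""
    X = df[FEATURES].copy()
    # Assumed strategy: median for numeric gaps, most frequent value for Embarked.
    X["Age"] = X["Age"].fillna(X["Age"].median())
    X["Fare"] = X["Fare"].fillna(X["Fare"].median())
    X["Embarked"] = X["Embarked"].fillna(X["Embarked"].mode()[0])
    for col in ["Sex", "Embarked"]:
        X[col] = LabelEncoder().fit_transform(X[col].astype(str))
    return X


def train_and_evaluate(df, seed=42):
    """Train three classifiers and report accuracy, confusion matrix, importances."""
    X = preprocess(df)
    y = df["Survived"]
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=seed, stratify=y
    )
    models = {
        "Logistic Regression": LogisticRegression(max_iter=1000),
        "Decision Tree": DecisionTreeClassifier(random_state=seed),
        "Random Forest": RandomForestClassifier(n_estimators=100, random_state=seed),
    }
    results = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        pred = model.predict(X_test)
        results[name] = {
            "accuracy": accuracy_score(y_test, pred),
            "confusion_matrix": confusion_matrix(y_test, pred, labels=[0, 1]),
        }
    # Feature importance taken from the fitted random forest, highest first.
    importances = pd.Series(
        models["Random Forest"].feature_importances_, index=X.columns
    ).sort_values(ascending=False)
    return results, importances
```

A call such as `results, importances = train_and_evaluate(pd.read_csv("train.csv"))` would then return per-model metrics and a ranked importance series. For brevity the fill values are computed on the whole frame before splitting; a stricter setup would derive them from the training split only.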
- Clone the repo:
    git clone https://github.com/Hardikk-7/Titanic-Project.git
    cd Titanic-Project