rabeehakamran/Preprocessing-Numpy-only
Titanic Survival Analysis (NumPy + Matplotlib)

📌 Project Overview

This project analyzes the Titanic dataset using NumPy & Matplotlib.
We perform data preprocessing, feature engineering, statistical analysis, and visualization to uncover survival patterns.


⚙️ Data Processing Steps

1️⃣ Data Loading & Cleaning

  • Merged Name columns
  • Imputed missing values: Age and Fare with the column mean, Embarked with the mode (most frequent value)
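
The imputation step can be sketched as follows. This is a minimal illustration, not the repository's actual code: the helper names `fill_with_mean` and `fill_with_mode` are hypothetical, missing numbers are assumed to be stored as NaN, missing Embarked entries are assumed to be empty strings, and the arrays are toy data.

```python
import numpy as np

def fill_with_mean(column):
    """Return a copy of a float column with NaNs replaced by the mean of the observed values."""
    column = np.asarray(column, dtype=float)
    filled = column.copy()
    filled[np.isnan(filled)] = np.nanmean(column)  # nanmean ignores the NaN entries
    return filled

def fill_with_mode(column, missing=""):
    """Return a copy of a string column with `missing` entries replaced by the most frequent value."""
    column = np.asarray(column, dtype=str)
    # Count only the observed (non-missing) labels.
    values, counts = np.unique(column[column != missing], return_counts=True)
    filled = column.copy()
    filled[filled == missing] = values[np.argmax(counts)]
    return filled

# Toy data, not taken from the real dataset.
age = np.array([22.0, np.nan, 26.0, 35.0])
embarked = np.array(["S", "C", "", "S"])

age_filled = fill_with_mean(age)
embarked_filled = fill_with_mode(embarked)
```

Both helpers return copies, so the raw columns stay available for comparison.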

2️⃣ Encoding

  • Sex encoded (female=0, male=1)
  • Embarked encoded (S=0, C=1, Q=2)
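
One way to apply the mappings above with plain NumPy is sketched below. The `encode` helper is hypothetical and the input lists are toy data; only the code values (female=0, male=1, S=0, C=1, Q=2) come from this README.

```python
import numpy as np

SEX_CODES = {"female": 0, "male": 1}
EMBARKED_CODES = {"S": 0, "C": 1, "Q": 2}

def encode(column, codes):
    """Map each string label in `column` to its integer code."""
    column = np.asarray(column, dtype=str)
    encoded = np.full(column.shape, -1, dtype=int)  # -1 marks "not yet mapped"
    for label, code in codes.items():
        encoded[column == label] = code
    if (encoded == -1).any():
        raise ValueError("column contains labels missing from the mapping")
    return encoded

# Toy data, not taken from the real dataset.
sex_encoded = encode(["male", "female", "female"], SEX_CODES)
embarked_encoded = encode(["S", "Q", "C"], EMBARKED_CODES)
```

Raising on unmapped labels is a design choice of this sketch: it surfaces any missing values that slipped past the cleaning step instead of silently encoding them.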

3️⃣ Feature Engineering

  • Dropped Name, Ticket, Cabin
  • Added FamilySize & IsAlone features
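
The README does not spell out how the two features are defined. The sketch below assumes the common convention for this dataset: FamilySize = SibSp + Parch + 1 (the passenger plus siblings/spouses and parents/children aboard), and IsAlone = 1 when FamilySize is 1. The arrays are toy data.

```python
import numpy as np

# Toy SibSp / Parch columns, not taken from the real dataset.
sibsp = np.array([1, 0, 3, 0])
parch = np.array([0, 0, 2, 1])

# Assumed definition: the passenger counts as one family member.
family_size = sibsp + parch + 1

# Assumed definition: 1 if the passenger travels with no relatives, else 0.
is_alone = (family_size == 1).astype(int)
```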

4️⃣ Normalization

  • Applied Z-score scaling on Age & Fare
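
Z-score scaling subtracts the column mean and divides by the column standard deviation, so the scaled column has mean 0 and standard deviation 1. A minimal sketch, assuming the population standard deviation (NumPy's default, `ddof=0`) and a hypothetical helper name `zscore`:

```python
import numpy as np

def zscore(column):
    """Scale a numeric column to zero mean and unit standard deviation."""
    column = np.asarray(column, dtype=float)
    std = column.std()  # population standard deviation (ddof=0)
    if std == 0:
        raise ValueError("cannot z-score a constant column")
    return (column - column.mean()) / std

# Toy fares, not taken from the real dataset.
fare = np.array([7.25, 71.28, 7.93, 53.10])
fare_scaled = zscore(fare)
```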

5️⃣ Statistical Analysis

  • Computed mean, median, std for key features
  • Calculated survival rates by gender & class
  • Computed the correlation matrix of the numerical features
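
These statistics map directly onto NumPy calls. The sketch below uses toy data and a hypothetical helper, `survival_rate_by`; the column order (Age, Fare, Survived) is an assumption made for illustration.

```python
import numpy as np

def survival_rate_by(group, survived):
    """Return {group value: mean of `survived` within that group}."""
    group = np.asarray(group)
    survived = np.asarray(survived, dtype=float)
    return {value.item(): survived[group == value].mean() for value in np.unique(group)}

# Toy data, not taken from the real dataset (Sex encoded as female=0, male=1).
sex = np.array([1, 0, 0, 1, 0, 1])
survived = np.array([0, 1, 1, 0, 0, 1])
rates = survival_rate_by(sex, survived)

# Toy feature matrix; columns are assumed to be Age, Fare, Survived.
features = np.array([
    [22.0, 7.25, 0.0],
    [38.0, 71.28, 1.0],
    [26.0, 7.93, 1.0],
    [35.0, 53.10, 1.0],
])

# Per-column summary statistics.
summary = {
    "mean": features.mean(axis=0),
    "median": np.median(features, axis=0),
    "std": features.std(axis=0),
}

# rowvar=False treats each column as a variable and each row as an observation.
corr = np.corrcoef(features, rowvar=False)
```

The same helper gives survival rates by class when it is passed the Pclass column instead of Sex.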

6️⃣ Visualizations

  • Survival Rate by Gender (bar chart)
  • Fare Distribution (histogram)
  • Correlation Heatmap
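
The three plots can be drawn with Matplotlib roughly as follows. This is a sketch on toy values, not the repository's plotting code; the survival rates, fares, feature names, and figure layout are all illustrative assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch also runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Toy inputs, not taken from the real dataset.
gender_rates = {"female": 0.75, "male": 0.25}
fare = np.array([7.25, 71.28, 7.93, 53.10, 8.05, 8.46, 51.86, 21.08])
names = ["Age", "Fare", "Survived"]
corr = np.array([
    [1.0, 0.1, -0.1],
    [0.1, 1.0, 0.3],
    [-0.1, 0.3, 1.0],
])

fig, axes = plt.subplots(1, 3, figsize=(15, 4))

# Survival rate by gender (bar chart).
bars = axes[0].bar(list(gender_rates.keys()), list(gender_rates.values()))
axes[0].set_ylabel("Survival rate")
axes[0].set_title("Survival Rate by Gender")

# Fare distribution (histogram).
axes[1].hist(fare, bins=5)
axes[1].set_xlabel("Fare")
axes[1].set_title("Fare Distribution")

# Correlation heatmap, with the color scale fixed to the valid range [-1, 1].
image = axes[2].imshow(corr, cmap="coolwarm", vmin=-1, vmax=1)
axes[2].set_xticks(range(len(names)))
axes[2].set_xticklabels(names)
axes[2].set_yticks(range(len(names)))
axes[2].set_yticklabels(names)
axes[2].set_title("Correlation Heatmap")
fig.colorbar(image, ax=axes[2])

fig.tight_layout()
```

To view the result, save it with `fig.savefig(...)`, or drop the `matplotlib.use("Agg")` line and call `plt.show()` in an interactive session.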

7️⃣ Train/Test Split

  • Shuffled the rows randomly
  • Split into 80% training and 20% testing
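
A shuffle-then-split can be sketched with a random permutation of the row indices. The function name `train_test_split`, the fixed seed, and the rounding rule for the test-set size are assumptions of this sketch; the README only states a random shuffle and an 80/20 split.

```python
import numpy as np

def train_test_split(data, test_fraction=0.2, seed=0):
    """Shuffle rows and return (train, test) with about `test_fraction` of rows in test."""
    data = np.asarray(data)
    rng = np.random.default_rng(seed)        # seeded for reproducibility
    indices = rng.permutation(len(data))     # random order of row indices
    n_test = int(round(len(data) * test_fraction))
    return data[indices[n_test:]], data[indices[:n_test]]

# Toy matrix with 10 rows, not taken from the real dataset.
rows = np.arange(20).reshape(10, 2)
train, test = train_test_split(rows)
```

Shuffling indices rather than the array itself leaves the original data untouched and makes it easy to split features and labels with the same permutation.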
