Netflix Data Analysis & Visualization – ML Project

A machine learning and data analysis project that cleans the Netflix dataset, uncovers content trends, and visualizes patterns in streaming media.

Project Overview

This project uses Python to analyze the Netflix dataset. It covers data cleaning, exploratory data analysis (EDA), and visualizations that show how content is distributed by type, country, genre, and release year.

Key Objectives:

Clean and preprocess the dataset
Analyze the distribution of Netflix content (Movies vs TV Shows)
Identify content trends by country, release year, and ratings
Visualize top genres, directors, and frequently appearing cast members
Answer specific business-oriented queries using code and visual analysis

Dataset

Source: Kaggle – Netflix Dataset
File: netflix_titles.csv
⚠ The dataset is not included in this repository due to redistribution restrictions.
Download it manually from Kaggle and place it in the same folder as the notebook before running it.
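
The loading step can be sketched as follows. This is a minimal example, not the notebook's actual code: the helper name load_netflix is hypothetical, and the default path assumes netflix_titles.csv sits next to the notebook as described above.

```python
from pathlib import Path

import pandas as pd


def load_netflix(path="netflix_titles.csv"):
    """Read the Netflix CSV into a DataFrame, failing early if the file is missing.

    `load_netflix` is a hypothetical helper used for illustration only.
    """
    csv_path = Path(path)
    if not csv_path.is_file():
        raise FileNotFoundError(
            f"{csv_path} not found; download it from Kaggle and place it next to the notebook."
        )
    return pd.read_csv(csv_path)


# Usage inside the notebook (commented out so nothing runs without the file):
# df = load_netflix()
# df.info()
```

Failing early with a clear message avoids a less obvious error later in the notebook when the file has not been downloaded yet.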

Techniques & Features Used

Data Cleaning (handling nulls, duplicates, format issues)
Exploratory Data Analysis (EDA)
Grouping, filtering, and cross-analysis by multiple columns
Visualization using Seaborn and Matplotlib
Question-driven analysis (e.g., most active countries, popular genres)
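
The cleaning and grouping techniques listed above might look roughly like this in pandas. It is a sketch under stated assumptions: the column names (type, country, date_added, listed_in) are assumed to match the Kaggle file, the sample rows are made up for illustration, and the helper names clean and count_values are hypothetical rather than taken from the notebook.

```python
import pandas as pd

# Tiny made-up sample standing in for netflix_titles.csv (illustrative rows only).
sample = pd.DataFrame(
    {
        "show_id": ["s1", "s2", "s2", "s3", "s4"],
        "type": ["Movie", "TV Show", "TV Show", "Movie", "TV Show"],
        "country": ["United States, India", "India", "India", None, "United States"],
        "date_added": [
            "September 25, 2021",
            " August 1, 2019",
            " August 1, 2019",
            "January 5, 2018",
            None,
        ],
        "listed_in": [
            "Dramas, Documentaries",
            "TV Dramas",
            "TV Dramas",
            "Documentaries",
            "Kids' TV",
        ],
    }
)


def clean(df):
    """Drop duplicate rows, fill missing countries, and parse date_added."""
    out = df.drop_duplicates().copy()
    out["country"] = out["country"].fillna("Unknown")
    # Strip stray whitespace before parsing; missing or unparseable dates become NaT.
    out["date_added"] = pd.to_datetime(out["date_added"].str.strip(), errors="coerce")
    out["year_added"] = out["date_added"].dt.year
    return out


def count_values(df, column):
    """Count individual entries in a comma-separated column such as country or listed_in."""
    values = df[column].str.split(",").explode().str.strip()
    return values.value_counts()


cleaned = clean(sample)
type_counts = cleaned["type"].value_counts()          # Movies vs TV Shows
country_counts = count_values(cleaned, "country")     # one count per country
genre_counts = count_values(cleaned, "listed_in")     # one count per genre
```

Splitting and exploding the comma-separated columns matters because a title listed under several countries or genres would otherwise be counted as one combined string.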

Tools & Technologies

Languages: Python
Libraries: Pandas, NumPy, Matplotlib, Seaborn
Environment: Google Colab
Version Control: Git & GitHub

Sample Insights

TV Shows dominate recent additions compared to Movies.
The United States and India are the top content providers.
Peak content additions occurred between 2017 and 2019.
"Documentaries" and "Dramas" are the most frequent genres.
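
Insights such as the yearly additions by type could be checked with a chart along these lines. This is a hedged sketch rather than the notebook's own plot: the rows are made up, the year_added column is assumed to have been derived from date_added beforehand, and plot_additions_by_year is a hypothetical helper name.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; drop this line inside a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up rows for illustration; they are not taken from the real dataset.
sample = pd.DataFrame(
    {
        "type": ["Movie", "TV Show", "TV Show", "Movie", "TV Show", "Movie"],
        "year_added": [2017, 2017, 2018, 2018, 2019, 2019],
    }
)


def plot_additions_by_year(df):
    """Draw a grouped bar chart of titles added per year, split by content type."""
    fig, ax = plt.subplots(figsize=(8, 4))
    sns.countplot(data=df, x="year_added", hue="type", ax=ax)
    ax.set_xlabel("Year added")
    ax.set_ylabel("Number of titles")
    ax.set_title("Titles added per year by type")
    return fig, ax


fig, ax = plot_additions_by_year(sample)
```

In a notebook the figure renders inline; elsewhere, calling fig.savefig with a file name writes it to disk.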

How to Run the Project

Clone the repository or download the notebook.
Download the dataset from Kaggle.
Place netflix_titles.csv in the same directory as the notebook.
Open netflix_Ml_Project.ipynb in Jupyter Notebook or Google Colab.
Run all cells to reproduce the results.

👩‍💻 Author

Raviha Khan
📍 Karachi, Pakistan
🔗 LinkedIn
🐙 GitHub
📧 ravihakhan53@gmail.com

"Learning by doing — turning Netflix data into meaningful insights."
