GopalGB/Netflix_Data_Analysis

Netflix Data Analysis Project

This repository contains the code and analysis for the Netflix data analysis project. The goal is to analyze the dataset provided by Netflix and generate insights that could help the company decide which types of shows and movies to produce and how to grow the business in different countries.

Dataset Overview

The dataset consists of a list of all the TV shows/movies available on Netflix. Each entry includes details such as show ID, type (movie or TV show), title, director, cast, country, date added, release year, rating, duration, genre, and description.
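
As a rough illustration of the structure described above, the following sketch loads a two-row sample with pandas. The column names (`show_id`, `date_added`, `listed_in`, and so on), the sample rows, the date format, and the helper `load_catalog` are all assumptions made for this example; the actual file in this project may use different names and formats.

```python
import io

import pandas as pd

# Hypothetical sample rows. Column names and values are assumed for
# illustration and are not taken from the project's dataset.
SAMPLE_CSV = """show_id,type,title,director,cast,country,date_added,release_year,rating,duration,listed_in,description
s1,Movie,Example Film,Jane Doe,"Actor A, Actor B",United States,"September 25, 2021",2020,PG-13,90 min,"Dramas, Comedies",A placeholder description.
s2,TV Show,Example Series,,"Actor C","India, United States","January 1, 2020",2019,TV-MA,2 Seasons,"TV Dramas",Another placeholder description.
"""


def load_catalog(source):
    """Read the catalog and parse the date-added column (hypothetical helper)."""
    df = pd.read_csv(source)
    # Assumes dates are written like "September 25, 2021"; unparseable values become NaT.
    df["date_added"] = pd.to_datetime(
        df["date_added"].str.strip(), format="%B %d, %Y", errors="coerce"
    )
    return df


catalog = load_catalog(io.StringIO(SAMPLE_CSV))
print(catalog.dtypes)
```

To use this with a real file, pass its path to `load_catalog` instead of the in-memory sample.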

Analysis Goals

As we explore the data, our aim is to answer specific questions and generate actionable insights for Netflix. Some of the key questions we'll address include:

  • What type of content is available in different countries?
  • How has the number of movies released per year changed over the last 20-30 years?
  • How do TV shows compare with movies?
  • What is the best time to launch a TV show?
  • Which actors and directors appear in different types of shows and movies?
  • Does Netflix focus more on TV shows than movies in recent years?
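
Several of the questions above reduce to grouping and counting. The sketch below shows one possible way to approach them with pandas on a handful of made-up rows. The column names (`type`, `country`, `release_year`, `date_added`) and the assumption that a title's countries are stored as a comma-separated string are illustrative only.

```python
import pandas as pd

# Made-up rows for illustration; column names are assumptions.
catalog = pd.DataFrame({
    "type": ["Movie", "Movie", "TV Show", "TV Show", "Movie"],
    "country": ["United States", "India, United States", "India", None, "United Kingdom"],
    "release_year": [2018, 2019, 2019, 2020, 2020],
    "date_added": pd.to_datetime(
        ["2019-01-15", "2019-07-01", "2020-07-10", "2020-12-05", "2021-07-20"]
    ),
})

# Content type by country: split comma-separated country lists so that
# each title contributes one row per country.
by_country = (
    catalog.dropna(subset=["country"])
    .assign(country=lambda d: d["country"].str.split(","))
    .explode("country")
    .assign(country=lambda d: d["country"].str.strip())
)
type_by_country = pd.crosstab(by_country["country"], by_country["type"])

# Number of movies released per year.
is_movie = catalog["type"] == "Movie"
movies_per_year = catalog.loc[is_movie, "release_year"].value_counts().sort_index()

# Share of TV shows among the titles added in each year.
is_tv = catalog["type"] == "TV Show"
tv_share_by_year = is_tv.groupby(catalog["date_added"].dt.year).mean()

# Months in which TV shows were added to the catalog.
tv_added_by_month = catalog.loc[is_tv, "date_added"].dt.month.value_counts().sort_index()

print(type_by_country)
print(movies_per_year)
print(tv_share_by_year)
print(tv_added_by_month)
```

Counting additions by month is only a starting point for the launch-timing question; it shows when titles were added, not how they performed.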

Analysis Approach

  1. Defining Problem Statement and Analyzing Basic Metrics: Understanding the objectives and initial exploration of the dataset.

  2. Data Exploration: Examining the shape of data, data types, missing values, and statistical summaries.

  3. Non-Graphical Analysis: Utilizing value counts and unique attributes to understand the data distribution.

  4. Visual Analysis:

    • Univariate analysis: distplots and histograms for continuous variables, and countplots for categorical variables.
    • Boxplots to compare a continuous variable across categories.
    • Heatmaps and pairplots for correlation analysis.

  5. Missing Value & Outlier Check: Detecting missing values and outliers in the dataset.

  6. Insights Based on Analysis: Providing observations on the range of attributes, distribution of variables, and relationships between them.

  7. Business Insights: Highlighting patterns observed in the data and what can be inferred from them.

  8. Recommendations: Actionable items for the business, presented in simple terms without technical jargon.
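
The exploration, non-graphical, and missing-value/outlier steps above can be sketched with pandas as follows. This is a minimal example on made-up rows, not the project's actual scripts. The column names, the "90 min" duration format, and the 1.5 × IQR rule for flagging outliers are assumptions chosen for illustration.

```python
import pandas as pd

# Made-up rows for illustration; column names and formats are assumptions.
catalog = pd.DataFrame({
    "type": ["Movie", "Movie", "Movie", "Movie", "Movie", "TV Show"],
    "director": ["A", None, "B", "A", None, None],
    "rating": ["PG-13", "R", "PG-13", "TV-MA", "R", "TV-MA"],
    "duration": ["90 min", "95 min", "100 min", "105 min", "300 min", "2 Seasons"],
})

# Data exploration: shape, data types, missing values.
print(catalog.shape)
print(catalog.dtypes)
missing_per_column = catalog.isna().sum()

# Non-graphical analysis: value counts and unique values per column.
rating_counts = catalog["rating"].value_counts()
unique_per_column = catalog.nunique()

# Outlier check on movie runtime, assuming durations look like "90 min".
movies = catalog[catalog["type"] == "Movie"].copy()
movies["minutes"] = movies["duration"].str.extract(r"(\d+)", expand=False).astype(int)

# Flag values outside 1.5 * IQR of the quartiles (one common convention).
q1, q3 = movies["minutes"].quantile([0.25, 0.75])
iqr = q3 - q1
outliers = movies[
    (movies["minutes"] < q1 - 1.5 * iqr) | (movies["minutes"] > q3 + 1.5 * iqr)
]

print(missing_per_column)
print(rating_counts)
print(unique_per_column)
print(outliers)
```

TV shows are excluded from the runtime check because their duration is assumed to be given in seasons rather than minutes.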

Usage

  1. Clone this repository.
  2. Download the dataset files from the provided link.
  3. Place the dataset files in the appropriate directory.
  4. Run the analysis scripts in your preferred environment.

About

Exploratory data analysis of Netflix content catalog — trends, genres, ratings, and regional distribution
