319 changes: 319 additions & 0 deletions Week-1-Pandas/.ipynb_checkpoints/Exercise-checkpoint.ipynb
@@ -0,0 +1,319 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Data wrangling with Pandas exercise\n",
"* For this exercise we will be using the `listings.csv` data file."
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"import numpy as np"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Load in the data file using `pd.read_csv()`"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {},
"outputs": [],
"source": [
"# Load data here\n",
"df = pd.read_csv('data/listings.csv')\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Exercise 2 - Filtering\n",
"\n",
"Return the following subsets of the dataframe.\n",
"\n",
"1. How many listings are there with a price less than 100? \n",
"\n",
"\n",
"2. Find how many listings there are in just Brooklyn.\n",
"\n",
"\n",
"3. Find how many listings there are in Brooklyn with a price less than 100.\n",
"\n",
"\n",
"4. Using `.isin()` select anyone that has the host name of Michael, David, John, and Daniel.\n",
"\n",
"\n",
"5. Create a new column called `adjusted_price` that has $100 added to every listing in Williamsburg. The prices for all other listings should be the same as the were before. \n",
"\n",
"\n",
"6. What % of the rooms are private, and what % of the rooms are shared. \n",
" * Hint, use `.value_counts()`\n"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"# 1. How many listings are there with a price less than 100? \n",
"\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"# 2. Make a new DataFrame of listings in Brooklyn named `df_bk` \n",
"# and find how many listings in just Brooklyn.\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"# 3. Find how many listings there are in Brooklyn with a price less than 100.\n",
"\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"# 4. Using `.isin()` select anyone that has the host name of Michael, David, John, and Daniel.\n",
"\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [],
"source": [
"# 5. Create a new column called `adjusted_price` that has $100 added to every listing in Williamsburg. \n",
"# The prices for all other listings should be the same as the were before. \n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [],
"source": [
"# 6. What % of the rooms are private, and what % of the rooms are shared. \n",
"\n",
"\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Exercise 3 - Grouping\n",
"\n",
"1. Using `groupby`, count how many listings are in each neighbourhood_group.\n",
"\n",
"\n",
"2. Using `groupby`, find the mean price for each of the neighbourhood_groups. \n",
"\n",
"\n",
"3. Using `groupby` and `.agg()`, find the min and max price for each of the neighbourhood_groups. \n",
"\n",
"\n",
"4. Using `groupby`, find the median price for each room type in each neighbourhood_group.\n",
"\n",
"\n",
"5. Using `groupby` and `.agg()`, find the count, min, max, mean, median, and std of the prices for each room type in each neighbourhood_group."
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [],
"source": [
"# 1. Using `groupby`, count how many listings are in each neighbourhood_group.\n",
"\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [],
"source": [
"# 2. Using `groupby`, find the mean price for each of the neighbourhood_groups. \n",
"\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [],
"source": [
"# 3. Using `groupby` and `.agg()`, find the min and max price for each of the neighbourhood_groups. \n",
"\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [],
"source": [
"# 4. Using `groupby`, find the mean price for each room type in each neighbourhood_group.\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [],
"source": [
"# 5. Using `groupby` and `.agg()`, find the count, min, max, mean, median, and std of the prices \n",
"# for each room type in each neighbourhood_group.\n",
"\n",
"\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Join and file saving.\n",
"1. Load the `prices.csv` and the `n_listings.csv`\n",
"\n",
"\n",
"2. Do join that keeps all the records for each table.\n",
" * Neighbourhood groups should include ['Bronx', 'Brooklyn', 'Manhattan', 'Queens', 'Staten Island',\n",
" 'LongIsland']\n",
" \n",
" \n",
"3. Save your joined csv as `joined.csv`\n",
"\n",
"\n",
"4. Load your saved table and see if it looks the same or different that the DataFrame you used to create it. "
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [],
"source": [
"# 1. Load the `prices.csv` and the `n_listings.csv`\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {},
"outputs": [],
"source": [
"# 2. Do join that keeps all the records for each table.\n",
"\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Use the grammys.csv data for the next section of questions.\n",
"\n",
"1. Who was won Album of the Year in 2016?\n",
"\n",
"\n",
"2. Who won Best Rap Album in 2009?\n",
"\n",
"\n",
"3. How many awards was Kendrick Lamar nomiated for, and how many did he win...?"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [],
"source": [
"# 1. Who was won Album of the Year in 2016?\n",
"\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {},
"outputs": [],
"source": [
"# 2. Who won Best Rap Album in 2009?\n",
"\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {},
"outputs": [],
"source": [
"# 3. How many awards was Kendrick Lamar nomiated for, and how many did he win...?\n",
"\n",
"\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.3"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
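
The techniques these exercises practise can be sketched on a small made-up table. This is a minimal illustration, not a solution to the notebook: the toy rows and the column names `host_name`, `neighbourhood`, `room_type`, `price`, `mean_price`, and `n_listings` are assumptions, because the diff does not show the columns of `listings.csv`, `prices.csv`, or `n_listings.csv`. Only `neighbourhood_group` appears in the notebook text.

```python
import io

import numpy as np
import pandas as pd

# Toy data. Every column name except neighbourhood_group is an assumption.
df = pd.DataFrame({
    "host_name": ["Michael", "Ana", "John", "Li", "David", "Sam"],
    "neighbourhood_group": ["Brooklyn", "Brooklyn", "Manhattan",
                            "Queens", "Brooklyn", "Manhattan"],
    "neighbourhood": ["Williamsburg", "Bushwick", "Harlem",
                      "Astoria", "Williamsburg", "Chelsea"],
    "room_type": ["Private room", "Shared room", "Entire home/apt",
                  "Private room", "Entire home/apt", "Private room"],
    "price": [80, 40, 200, 95, 150, 120],
})

# Filtering: combine boolean masks with &, wrapping each condition in parentheses.
cheap_bk = df[(df["neighbourhood_group"] == "Brooklyn") & (df["price"] < 100)]
n_cheap_bk = len(cheap_bk)

# .isin() keeps rows whose value matches any entry in the list.
named_hosts = df[df["host_name"].isin(["Michael", "David", "John", "Daniel"])]

# Conditional column: np.where picks one value where the mask is True
# and the other value everywhere else.
df["adjusted_price"] = np.where(
    df["neighbourhood"] == "Williamsburg", df["price"] + 100, df["price"]
)

# normalize=True returns shares instead of raw counts.
room_share = df["room_type"].value_counts(normalize=True) * 100

# Grouping: one key with two aggregations, then two keys with several.
price_range = df.groupby("neighbourhood_group")["price"].agg(["min", "max"])
stats = df.groupby(["neighbourhood_group", "room_type"])["price"].agg(
    ["count", "min", "max", "mean", "median", "std"]
)

# Outer join: keeps keys that appear in either table, filling gaps with NaN.
prices = pd.DataFrame({"neighbourhood_group": ["Brooklyn", "Manhattan"],
                       "mean_price": [90.0, 160.0]})
counts = pd.DataFrame({"neighbourhood_group": ["Brooklyn", "Queens"],
                       "n_listings": [3, 1]})
joined = prices.merge(counts, on="neighbourhood_group", how="outer")

# Round trip through CSV, using in-memory buffers instead of files on disk.
buffer = io.StringIO()
joined.to_csv(buffer, index=False)
buffer.seek(0)
reloaded = pd.read_csv(buffer)

# With the default index=True, the index is written as an unnamed first
# column, which read_csv then loads as "Unnamed: 0".
buffer_with_index = io.StringIO()
joined.to_csv(buffer_with_index)
buffer_with_index.seek(0)
reloaded_with_index = pd.read_csv(buffer_with_index)
```

Writing to `io.StringIO` keeps the sketch free of side effects. In the notebook itself, the same calls would take a path such as `joined.csv`.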