From 7a547e225f4899e74ce4a7c45798b5b9065ae5c4 Mon Sep 17 00:00:00 2001 From: GildaRIA <114354041+GildaRIA@users.noreply.github.com> Date: Sun, 26 Mar 2023 19:18:53 -0600 Subject: [PATCH 1/4] Delete main.ipynb --- .../your-code/main.ipynb | 464 ------------------ 1 file changed, 464 deletions(-) delete mode 100644 lab-functional-programming/your-code/main.ipynb diff --git a/lab-functional-programming/your-code/main.ipynb b/lab-functional-programming/your-code/main.ipynb deleted file mode 100644 index 8017d6e..0000000 --- a/lab-functional-programming/your-code/main.ipynb +++ /dev/null @@ -1,464 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Before your start:\n", - "- Read the README.md file\n", - "- Comment as much as you can and use the resources in the README.md file\n", - "- Happy learning!" - ] - }, - { - "cell_type": "code", - "execution_count": 1, - "metadata": {}, - "outputs": [], - "source": [ - "import numpy as np\n", - "import pandas as pd" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Challenge 1 - Iterators, Generators and `yield`. \n", - "\n", - "In iterator in Python is an object that represents a stream of data. However, iterators contain a countable number of values. We traverse through the iterator and return one value at a time. All iterators support a `next` function that allows us to traverse through the iterator. We can create an iterator using the `iter` function that comes with the base package of Python. Below is an example of an iterator." 
- ] - }, - { - "cell_type": "code", - "execution_count": 2, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "1\n" - ] - } - ], - "source": [ - "# We first define our iterator:\n", - "\n", - "iterator = iter([1,2,3])\n", - "\n", - "# We can now iterate through the object using the next function\n", - "\n", - "print(next(iterator))" - ] - }, - { - "cell_type": "code", - "execution_count": 3, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "2\n" - ] - } - ], - "source": [ - "# We continue to iterate through the iterator.\n", - "\n", - "print(next(iterator))" - ] - }, - { - "cell_type": "code", - "execution_count": 4, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "3\n" - ] - } - ], - "source": [ - "print(next(iterator))" - ] - }, - { - "cell_type": "code", - "execution_count": 5, - "metadata": {}, - "outputs": [ - { - "ename": "StopIteration", - "evalue": "", - "output_type": "error", - "traceback": [ - "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m", - "\u001b[1;31mStopIteration\u001b[0m Traceback (most recent call last)", - "\u001b[1;32m\u001b[0m in \u001b[0;36m\u001b[1;34m()\u001b[0m\n\u001b[0;32m 1\u001b[0m \u001b[1;31m# After we have iterated through all elements, we will get a StopIteration Error\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 2\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m----> 3\u001b[1;33m \u001b[0mprint\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mnext\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0miterator\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m", - "\u001b[1;31mStopIteration\u001b[0m: " - ] - } - ], - "source": [ - "# After we have iterated through all elements, we will get a StopIteration Error\n", - "\n", - "print(next(iterator))" - ] - }, - { - "cell_type": "code", 
- "execution_count": 6, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "1\n", - "2\n", - "3\n" - ] - } - ], - "source": [ - "# We can also iterate through an iterator using a for loop like this:\n", - "# Note: we cannot go back directly in an iterator once we have traversed through the elements. \n", - "# This is why we are redefining the iterator below\n", - "\n", - "iterator = iter([1,2,3])\n", - "\n", - "for i in iterator:\n", - " print(i)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "In the cell below, write a function that takes an iterator and returns the first element in the iterator and returns the first element in the iterator that is divisible by 2. Assume that all iterators contain only numeric data. If we have not found a single element that is divisible by 2, return zero." - ] - }, - { - "cell_type": "code", - "execution_count": 8, - "metadata": {}, - "outputs": [], - "source": [ - "def divisible2(iterator):\n", - " # This function takes an iterable and returns the first element that is divisible by 2 and zero otherwise\n", - " # Input: Iterable\n", - " # Output: Integer\n", - " \n", - " # Sample Input: iter([1,2,3])\n", - " # Sample Output: 2\n", - " \n", - " # Your code here:\n", - " " - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Generators\n", - "\n", - "It is quite difficult to create your own iterator since you would have to implement a `next` function. Generators are functions that enable us to create iterators. The difference between a function and a generator is that instead of using `return`, we use `yield`. 
For example, below we have a function that returns an iterator containing the numbers 0 through n:" - ] - }, - { - "cell_type": "code", - "execution_count": 10, - "metadata": {}, - "outputs": [], - "source": [ - "def firstn(n):\n", - " number = 0\n", - " while number < n:\n", - " yield number\n", - " number = number + 1" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "If we pass 5 to the function, we will see that we have a iterator containing the numbers 0 through 4." - ] - }, - { - "cell_type": "code", - "execution_count": 12, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "0\n", - "1\n", - "2\n", - "3\n", - "4\n" - ] - } - ], - "source": [ - "iterator = firstn(5)\n", - "\n", - "for i in iterator:\n", - " print(i)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "In the cell below, create a generator that takes a number and returns an iterator containing all even numbers between 0 and the number you passed to the generator." - ] - }, - { - "cell_type": "code", - "execution_count": 13, - "metadata": {}, - "outputs": [], - "source": [ - "def even_iterator(n):\n", - " # This function produces an iterator containing all even numbers between 0 and n\n", - " # Input: integer\n", - " # Output: iterator\n", - " \n", - " # Sample Input: 5\n", - " # Sample Output: iter([0, 2, 4])\n", - " \n", - " # Your code here:\n", - " " - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Challenge 2 - Applying Functions to DataFrames\n", - "\n", - "In this challenge, we will look at how to transform cells or entire columns at once.\n", - "\n", - "First, let's load a dataset. We will download the famous Iris classification dataset in the cell below." 
- ] - }, - { - "cell_type": "code", - "execution_count": 14, - "metadata": {}, - "outputs": [], - "source": [ - "columns = ['sepal_length', 'sepal_width', 'petal_length','petal_width','iris_type']\n", - "iris = pd.read_csv(\"https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data\", names=columns)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Let's look at the dataset using the `head` function." - ] - }, - { - "cell_type": "code", - "execution_count": 15, - "metadata": {}, - "outputs": [], - "source": [ - "# Your code here:\n", - "\n" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Let's start off by using built-in functions. Try to apply the numpy mean function and describe what happens in the comments of the code." - ] - }, - { - "cell_type": "code", - "execution_count": 16, - "metadata": {}, - "outputs": [], - "source": [ - "# Your code here:\n", - "\n" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Next, we'll apply the standard deviation function in numpy (`np.std`). Describe what happened in the comments." - ] - }, - { - "cell_type": "code", - "execution_count": 17, - "metadata": {}, - "outputs": [], - "source": [ - "# Your code here:\n", - "\n" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "The measurements are in centimeters. Let's convert them all to inches. First, we will create a dataframe that contains only the numeric columns. Assign this new dataframe to `iris_numeric`." - ] - }, - { - "cell_type": "code", - "execution_count": 19, - "metadata": {}, - "outputs": [], - "source": [ - "# Your code here:\n", - "\n" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Next, we will write a function that converts centimeters to inches in the cell below. Recall that 1cm = 0.393701in." 
- ] - }, - { - "cell_type": "code", - "execution_count": 21, - "metadata": {}, - "outputs": [], - "source": [ - "def cm_to_in(x):\n", - " # This function takes in a numeric value in centimeters and converts it to inches\n", - " # Input: numeric value\n", - " # Output: float\n", - " \n", - " # Sample Input: 1.0\n", - " # Sample Output: 0.393701\n", - " \n", - " # Your code here:\n", - " " - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Now convert all columns in `iris_numeric` to inches in the cell below. We like to think of functional transformations as immutable. Therefore, save the transformed data in a dataframe called `iris_inch`." - ] - }, - { - "cell_type": "code", - "execution_count": 22, - "metadata": {}, - "outputs": [], - "source": [ - "# Your code here:\n", - "\n" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "We have just found that the original measurements were off by a constant. Define the global constant `error` and set it to 2. Write a function that uses the global constant and adds it to each cell in the dataframe. Apply this function to `iris_numeric` and save the result in `iris_constant`." 
- ] - }, - { - "cell_type": "code", - "execution_count": 23, - "metadata": {}, - "outputs": [], - "source": [ - "# Define constant below:\n", - "\n", - "\n", - "def add_constant(x):\n", - " # This function adds a global constant to our input.\n", - " # Input: numeric value\n", - " # Output: numeric value\n", - " \n", - " # Your code here:\n", - " " - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Bonus Challenge - Applying Functions to Columns\n", - "\n", - "Read more about applying functions to either rows or columns [here](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.apply.html) and write a function that computes the maximum value for each row of `iris_numeric`" - ] - }, - { - "cell_type": "code", - "execution_count": 24, - "metadata": {}, - "outputs": [], - "source": [ - "# Your code here:\n", - "\n" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Compute the combined lengths for each row and the combined widths for each row using a function. Assign these values to new columns `total_length` and `total_width`." 
- ] - }, - { - "cell_type": "code", - "execution_count": 25, - "metadata": {}, - "outputs": [], - "source": [ - "# Your code here:\n", - "\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.6.6" - } - }, - "nbformat": 4, - "nbformat_minor": 2 -} From 3779b4acf50eda5f08443b9c4686ce62d01220f6 Mon Sep 17 00:00:00 2001 From: GildaRIA <114354041+GildaRIA@users.noreply.github.com> Date: Sun, 26 Mar 2023 19:19:13 -0600 Subject: [PATCH 2/4] Add files via upload --- .../your-code/main.ipynb | 464 ++++++++++++++++++ 1 file changed, 464 insertions(+) create mode 100644 lab-functional-programming/your-code/main.ipynb diff --git a/lab-functional-programming/your-code/main.ipynb b/lab-functional-programming/your-code/main.ipynb new file mode 100644 index 0000000..8017d6e --- /dev/null +++ b/lab-functional-programming/your-code/main.ipynb @@ -0,0 +1,464 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Before your start:\n", + "- Read the README.md file\n", + "- Comment as much as you can and use the resources in the README.md file\n", + "- Happy learning!" + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "metadata": {}, + "outputs": [], + "source": [ + "import numpy as np\n", + "import pandas as pd" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Challenge 1 - Iterators, Generators and `yield`. \n", + "\n", + "In iterator in Python is an object that represents a stream of data. However, iterators contain a countable number of values. 
We traverse through the iterator and return one value at a time. All iterators support a `next` function that allows us to traverse through the iterator. We can create an iterator using the `iter` function that comes with the base package of Python. Below is an example of an iterator." + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "1\n" + ] + } + ], + "source": [ + "# We first define our iterator:\n", + "\n", + "iterator = iter([1,2,3])\n", + "\n", + "# We can now iterate through the object using the next function\n", + "\n", + "print(next(iterator))" + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "2\n" + ] + } + ], + "source": [ + "# We continue to iterate through the iterator.\n", + "\n", + "print(next(iterator))" + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "3\n" + ] + } + ], + "source": [ + "print(next(iterator))" + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "metadata": {}, + "outputs": [ + { + "ename": "StopIteration", + "evalue": "", + "output_type": "error", + "traceback": [ + "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m", + "\u001b[1;31mStopIteration\u001b[0m Traceback (most recent call last)", + "\u001b[1;32m\u001b[0m in \u001b[0;36m\u001b[1;34m()\u001b[0m\n\u001b[0;32m 1\u001b[0m \u001b[1;31m# After we have iterated through all elements, we will get a StopIteration Error\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 2\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m----> 3\u001b[1;33m 
\u001b[0mprint\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mnext\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0miterator\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m", + "\u001b[1;31mStopIteration\u001b[0m: " + ] + } + ], + "source": [ + "# After we have iterated through all elements, we will get a StopIteration Error\n", + "\n", + "print(next(iterator))" + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "1\n", + "2\n", + "3\n" + ] + } + ], + "source": [ + "# We can also iterate through an iterator using a for loop like this:\n", + "# Note: we cannot go back directly in an iterator once we have traversed through the elements. \n", + "# This is why we are redefining the iterator below\n", + "\n", + "iterator = iter([1,2,3])\n", + "\n", + "for i in iterator:\n", + " print(i)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "In the cell below, write a function that takes an iterator and returns the first element in the iterator and returns the first element in the iterator that is divisible by 2. Assume that all iterators contain only numeric data. If we have not found a single element that is divisible by 2, return zero." + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "metadata": {}, + "outputs": [], + "source": [ + "def divisible2(iterator):\n", + " # This function takes an iterable and returns the first element that is divisible by 2 and zero otherwise\n", + " # Input: Iterable\n", + " # Output: Integer\n", + " \n", + " # Sample Input: iter([1,2,3])\n", + " # Sample Output: 2\n", + " \n", + " # Your code here:\n", + " " + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Generators\n", + "\n", + "It is quite difficult to create your own iterator since you would have to implement a `next` function. Generators are functions that enable us to create iterators. 
The difference between a function and a generator is that instead of using `return`, we use `yield`. For example, below we have a function that returns an iterator containing the numbers 0 through n:" + ] + }, + { + "cell_type": "code", + "execution_count": 10, + "metadata": {}, + "outputs": [], + "source": [ + "def firstn(n):\n", + " number = 0\n", + " while number < n:\n", + " yield number\n", + " number = number + 1" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "If we pass 5 to the function, we will see that we have a iterator containing the numbers 0 through 4." + ] + }, + { + "cell_type": "code", + "execution_count": 12, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "0\n", + "1\n", + "2\n", + "3\n", + "4\n" + ] + } + ], + "source": [ + "iterator = firstn(5)\n", + "\n", + "for i in iterator:\n", + " print(i)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "In the cell below, create a generator that takes a number and returns an iterator containing all even numbers between 0 and the number you passed to the generator." + ] + }, + { + "cell_type": "code", + "execution_count": 13, + "metadata": {}, + "outputs": [], + "source": [ + "def even_iterator(n):\n", + " # This function produces an iterator containing all even numbers between 0 and n\n", + " # Input: integer\n", + " # Output: iterator\n", + " \n", + " # Sample Input: 5\n", + " # Sample Output: iter([0, 2, 4])\n", + " \n", + " # Your code here:\n", + " " + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Challenge 2 - Applying Functions to DataFrames\n", + "\n", + "In this challenge, we will look at how to transform cells or entire columns at once.\n", + "\n", + "First, let's load a dataset. We will download the famous Iris classification dataset in the cell below." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 14, + "metadata": {}, + "outputs": [], + "source": [ + "columns = ['sepal_length', 'sepal_width', 'petal_length','petal_width','iris_type']\n", + "iris = pd.read_csv(\"https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data\", names=columns)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Let's look at the dataset using the `head` function." + ] + }, + { + "cell_type": "code", + "execution_count": 15, + "metadata": {}, + "outputs": [], + "source": [ + "# Your code here:\n", + "\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Let's start off by using built-in functions. Try to apply the numpy mean function and describe what happens in the comments of the code." + ] + }, + { + "cell_type": "code", + "execution_count": 16, + "metadata": {}, + "outputs": [], + "source": [ + "# Your code here:\n", + "\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Next, we'll apply the standard deviation function in numpy (`np.std`). Describe what happened in the comments." + ] + }, + { + "cell_type": "code", + "execution_count": 17, + "metadata": {}, + "outputs": [], + "source": [ + "# Your code here:\n", + "\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The measurements are in centimeters. Let's convert them all to inches. First, we will create a dataframe that contains only the numeric columns. Assign this new dataframe to `iris_numeric`." + ] + }, + { + "cell_type": "code", + "execution_count": 19, + "metadata": {}, + "outputs": [], + "source": [ + "# Your code here:\n", + "\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Next, we will write a function that converts centimeters to inches in the cell below. Recall that 1cm = 0.393701in." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 21, + "metadata": {}, + "outputs": [], + "source": [ + "def cm_to_in(x):\n", + " # This function takes in a numeric value in centimeters and converts it to inches\n", + " # Input: numeric value\n", + " # Output: float\n", + " \n", + " # Sample Input: 1.0\n", + " # Sample Output: 0.393701\n", + " \n", + " # Your code here:\n", + " " + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Now convert all columns in `iris_numeric` to inches in the cell below. We like to think of functional transformations as immutable. Therefore, save the transformed data in a dataframe called `iris_inch`." + ] + }, + { + "cell_type": "code", + "execution_count": 22, + "metadata": {}, + "outputs": [], + "source": [ + "# Your code here:\n", + "\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We have just found that the original measurements were off by a constant. Define the global constant `error` and set it to 2. Write a function that uses the global constant and adds it to each cell in the dataframe. Apply this function to `iris_numeric` and save the result in `iris_constant`." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 23, + "metadata": {}, + "outputs": [], + "source": [ + "# Define constant below:\n", + "\n", + "\n", + "def add_constant(x):\n", + " # This function adds a global constant to our input.\n", + " # Input: numeric value\n", + " # Output: numeric value\n", + " \n", + " # Your code here:\n", + " " + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Bonus Challenge - Applying Functions to Columns\n", + "\n", + "Read more about applying functions to either rows or columns [here](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.apply.html) and write a function that computes the maximum value for each row of `iris_numeric`" + ] + }, + { + "cell_type": "code", + "execution_count": 24, + "metadata": {}, + "outputs": [], + "source": [ + "# Your code here:\n", + "\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Compute the combined lengths for each row and the combined widths for each row using a function. Assign these values to new columns `total_length` and `total_width`." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 25, + "metadata": {}, + "outputs": [], + "source": [ + "# Your code here:\n", + "\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.6.6" + } + }, + "nbformat": 4, + "nbformat_minor": 2 +} From d10f8437f54e7bd644bf594f5b67a9108688b4b6 Mon Sep 17 00:00:00 2001 From: GildaRIA <114354041+GildaRIA@users.noreply.github.com> Date: Sun, 26 Mar 2023 23:10:33 -0600 Subject: [PATCH 3/4] Delete main.ipynb --- .../your-code/main.ipynb | 464 ------------------ 1 file changed, 464 deletions(-) delete mode 100644 lab-functional-programming/your-code/main.ipynb diff --git a/lab-functional-programming/your-code/main.ipynb b/lab-functional-programming/your-code/main.ipynb deleted file mode 100644 index 8017d6e..0000000 --- a/lab-functional-programming/your-code/main.ipynb +++ /dev/null @@ -1,464 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Before your start:\n", - "- Read the README.md file\n", - "- Comment as much as you can and use the resources in the README.md file\n", - "- Happy learning!" - ] - }, - { - "cell_type": "code", - "execution_count": 1, - "metadata": {}, - "outputs": [], - "source": [ - "import numpy as np\n", - "import pandas as pd" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Challenge 1 - Iterators, Generators and `yield`. \n", - "\n", - "In iterator in Python is an object that represents a stream of data. However, iterators contain a countable number of values. 
We traverse through the iterator and return one value at a time. All iterators support a `next` function that allows us to traverse through the iterator. We can create an iterator using the `iter` function that comes with the base package of Python. Below is an example of an iterator." - ] - }, - { - "cell_type": "code", - "execution_count": 2, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "1\n" - ] - } - ], - "source": [ - "# We first define our iterator:\n", - "\n", - "iterator = iter([1,2,3])\n", - "\n", - "# We can now iterate through the object using the next function\n", - "\n", - "print(next(iterator))" - ] - }, - { - "cell_type": "code", - "execution_count": 3, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "2\n" - ] - } - ], - "source": [ - "# We continue to iterate through the iterator.\n", - "\n", - "print(next(iterator))" - ] - }, - { - "cell_type": "code", - "execution_count": 4, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "3\n" - ] - } - ], - "source": [ - "print(next(iterator))" - ] - }, - { - "cell_type": "code", - "execution_count": 5, - "metadata": {}, - "outputs": [ - { - "ename": "StopIteration", - "evalue": "", - "output_type": "error", - "traceback": [ - "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m", - "\u001b[1;31mStopIteration\u001b[0m Traceback (most recent call last)", - "\u001b[1;32m\u001b[0m in \u001b[0;36m\u001b[1;34m()\u001b[0m\n\u001b[0;32m 1\u001b[0m \u001b[1;31m# After we have iterated through all elements, we will get a StopIteration Error\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 2\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m----> 3\u001b[1;33m 
\u001b[0mprint\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mnext\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0miterator\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m", - "\u001b[1;31mStopIteration\u001b[0m: " - ] - } - ], - "source": [ - "# After we have iterated through all elements, we will get a StopIteration Error\n", - "\n", - "print(next(iterator))" - ] - }, - { - "cell_type": "code", - "execution_count": 6, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "1\n", - "2\n", - "3\n" - ] - } - ], - "source": [ - "# We can also iterate through an iterator using a for loop like this:\n", - "# Note: we cannot go back directly in an iterator once we have traversed through the elements. \n", - "# This is why we are redefining the iterator below\n", - "\n", - "iterator = iter([1,2,3])\n", - "\n", - "for i in iterator:\n", - " print(i)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "In the cell below, write a function that takes an iterator and returns the first element in the iterator and returns the first element in the iterator that is divisible by 2. Assume that all iterators contain only numeric data. If we have not found a single element that is divisible by 2, return zero." - ] - }, - { - "cell_type": "code", - "execution_count": 8, - "metadata": {}, - "outputs": [], - "source": [ - "def divisible2(iterator):\n", - " # This function takes an iterable and returns the first element that is divisible by 2 and zero otherwise\n", - " # Input: Iterable\n", - " # Output: Integer\n", - " \n", - " # Sample Input: iter([1,2,3])\n", - " # Sample Output: 2\n", - " \n", - " # Your code here:\n", - " " - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Generators\n", - "\n", - "It is quite difficult to create your own iterator since you would have to implement a `next` function. Generators are functions that enable us to create iterators. 
The difference between a function and a generator is that instead of using `return`, we use `yield`. For example, below we have a function that returns an iterator containing the numbers 0 through n:" - ] - }, - { - "cell_type": "code", - "execution_count": 10, - "metadata": {}, - "outputs": [], - "source": [ - "def firstn(n):\n", - " number = 0\n", - " while number < n:\n", - " yield number\n", - " number = number + 1" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "If we pass 5 to the function, we will see that we have a iterator containing the numbers 0 through 4." - ] - }, - { - "cell_type": "code", - "execution_count": 12, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "0\n", - "1\n", - "2\n", - "3\n", - "4\n" - ] - } - ], - "source": [ - "iterator = firstn(5)\n", - "\n", - "for i in iterator:\n", - " print(i)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "In the cell below, create a generator that takes a number and returns an iterator containing all even numbers between 0 and the number you passed to the generator." - ] - }, - { - "cell_type": "code", - "execution_count": 13, - "metadata": {}, - "outputs": [], - "source": [ - "def even_iterator(n):\n", - " # This function produces an iterator containing all even numbers between 0 and n\n", - " # Input: integer\n", - " # Output: iterator\n", - " \n", - " # Sample Input: 5\n", - " # Sample Output: iter([0, 2, 4])\n", - " \n", - " # Your code here:\n", - " " - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Challenge 2 - Applying Functions to DataFrames\n", - "\n", - "In this challenge, we will look at how to transform cells or entire columns at once.\n", - "\n", - "First, let's load a dataset. We will download the famous Iris classification dataset in the cell below." 
- ] - }, - { - "cell_type": "code", - "execution_count": 14, - "metadata": {}, - "outputs": [], - "source": [ - "columns = ['sepal_length', 'sepal_width', 'petal_length','petal_width','iris_type']\n", - "iris = pd.read_csv(\"https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data\", names=columns)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Let's look at the dataset using the `head` function." - ] - }, - { - "cell_type": "code", - "execution_count": 15, - "metadata": {}, - "outputs": [], - "source": [ - "# Your code here:\n", - "\n" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Let's start off by using built-in functions. Try to apply the numpy mean function and describe what happens in the comments of the code." - ] - }, - { - "cell_type": "code", - "execution_count": 16, - "metadata": {}, - "outputs": [], - "source": [ - "# Your code here:\n", - "\n" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Next, we'll apply the standard deviation function in numpy (`np.std`). Describe what happened in the comments." - ] - }, - { - "cell_type": "code", - "execution_count": 17, - "metadata": {}, - "outputs": [], - "source": [ - "# Your code here:\n", - "\n" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "The measurements are in centimeters. Let's convert them all to inches. First, we will create a dataframe that contains only the numeric columns. Assign this new dataframe to `iris_numeric`." - ] - }, - { - "cell_type": "code", - "execution_count": 19, - "metadata": {}, - "outputs": [], - "source": [ - "# Your code here:\n", - "\n" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Next, we will write a function that converts centimeters to inches in the cell below. Recall that 1cm = 0.393701in." 
- ] - }, - { - "cell_type": "code", - "execution_count": 21, - "metadata": {}, - "outputs": [], - "source": [ - "def cm_to_in(x):\n", - " # This function takes in a numeric value in centimeters and converts it to inches\n", - " # Input: numeric value\n", - " # Output: float\n", - " \n", - " # Sample Input: 1.0\n", - " # Sample Output: 0.393701\n", - " \n", - " # Your code here:\n", - " " - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Now convert all columns in `iris_numeric` to inches in the cell below. We like to think of functional transformations as immutable. Therefore, save the transformed data in a dataframe called `iris_inch`." - ] - }, - { - "cell_type": "code", - "execution_count": 22, - "metadata": {}, - "outputs": [], - "source": [ - "# Your code here:\n", - "\n" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "We have just found that the original measurements were off by a constant. Define the global constant `error` and set it to 2. Write a function that uses the global constant and adds it to each cell in the dataframe. Apply this function to `iris_numeric` and save the result in `iris_constant`." 
- ] - }, - { - "cell_type": "code", - "execution_count": 23, - "metadata": {}, - "outputs": [], - "source": [ - "# Define constant below:\n", - "\n", - "\n", - "def add_constant(x):\n", - " # This function adds a global constant to our input.\n", - " # Input: numeric value\n", - " # Output: numeric value\n", - " \n", - " # Your code here:\n", - " " - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Bonus Challenge - Applying Functions to Columns\n", - "\n", - "Read more about applying functions to either rows or columns [here](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.apply.html) and write a function that computes the maximum value for each row of `iris_numeric`" - ] - }, - { - "cell_type": "code", - "execution_count": 24, - "metadata": {}, - "outputs": [], - "source": [ - "# Your code here:\n", - "\n" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Compute the combined lengths for each row and the combined widths for each row using a function. Assign these values to new columns `total_length` and `total_width`." 
- ] - }, - { - "cell_type": "code", - "execution_count": 25, - "metadata": {}, - "outputs": [], - "source": [ - "# Your code here:\n", - "\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.6.6" - } - }, - "nbformat": 4, - "nbformat_minor": 2 -} From 26e1b872db66dd2a76bcb1f73ec68c9233f06cd5 Mon Sep 17 00:00:00 2001 From: GildaRIA <114354041+GildaRIA@users.noreply.github.com> Date: Sun, 26 Mar 2023 23:10:59 -0600 Subject: [PATCH 4/4] Add files via upload --- .../your-code/main.ipynb | 805 ++++++++++++++++++ 1 file changed, 805 insertions(+) create mode 100644 lab-functional-programming/your-code/main.ipynb diff --git a/lab-functional-programming/your-code/main.ipynb b/lab-functional-programming/your-code/main.ipynb new file mode 100644 index 0000000..4a4d2dd --- /dev/null +++ b/lab-functional-programming/your-code/main.ipynb @@ -0,0 +1,805 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Before your start:\n", + "- Read the README.md file\n", + "- Comment as much as you can and use the resources in the README.md file\n", + "- Happy learning!" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Challenge 1 - Working with JSON files\n", + "\n", + "Import the pandas library" + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "metadata": {}, + "outputs": [], + "source": [ + "# Your import here:\n", + "import pandas as pd\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### After importing pandas, let's find a dataset. 
In this lesson we will be working with a NASA dataset.\n", + "\n", + "Run the code in the cell below to load the dataset containing information about asteroids that have landed on Earth. This piece of code opens the URL for the dataset and decodes the data using UTF-8." + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "metadata": {}, + "outputs": [], + "source": [ + "# Run this code\n", + "\n", + "from urllib.request import urlopen\n", + "import json\n", + "\n", + "response = urlopen(\"https://data.nasa.gov/resource/y77d-th95.json\")\n", + "json_data = response.read().decode('utf-8', 'replace')" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "In the next cell, load the data in `json_data` into a pandas dataframe. Name the dataframe `nasa`." + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "metadata": {}, + "outputs": [], + "source": [ + "# Your code here:\n", + "nasa = pd.read_json(json_data)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Now that we have loaded the data, let's examine it using the `head()` function." + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
<div>\n",
+ "<table>\n",
+ "  <thead>\n",
+ "    <tr><th></th><th>name</th><th>id</th><th>nametype</th><th>recclass</th><th>mass</th><th>fall</th><th>year</th><th>reclat</th><th>reclong</th><th>geolocation</th><th>:@computed_region_cbhk_fwbd</th><th>:@computed_region_nnqa_25f4</th></tr>\n",
+ "  </thead>\n",
+ "  <tbody>\n",
+ "    <tr><th>0</th><td>Aachen</td><td>1</td><td>Valid</td><td>L5</td><td>21.0</td><td>Fell</td><td>1880-01-01T00:00:00.000</td><td>50.77500</td><td>6.08333</td><td>{'type': 'Point', 'coordinates': [6.08333, 50....</td><td>NaN</td><td>NaN</td></tr>\n",
+ "    <tr><th>1</th><td>Aarhus</td><td>2</td><td>Valid</td><td>H6</td><td>720.0</td><td>Fell</td><td>1951-01-01T00:00:00.000</td><td>56.18333</td><td>10.23333</td><td>{'type': 'Point', 'coordinates': [10.23333, 56...</td><td>NaN</td><td>NaN</td></tr>\n",
+ "    <tr><th>2</th><td>Abee</td><td>6</td><td>Valid</td><td>EH4</td><td>107000.0</td><td>Fell</td><td>1952-01-01T00:00:00.000</td><td>54.21667</td><td>-113.00000</td><td>{'type': 'Point', 'coordinates': [-113, 54.216...</td><td>NaN</td><td>NaN</td></tr>\n",
+ "    <tr><th>3</th><td>Acapulco</td><td>10</td><td>Valid</td><td>Acapulcoite</td><td>1914.0</td><td>Fell</td><td>1976-01-01T00:00:00.000</td><td>16.88333</td><td>-99.90000</td><td>{'type': 'Point', 'coordinates': [-99.9, 16.88...</td><td>NaN</td><td>NaN</td></tr>\n",
+ "    <tr><th>4</th><td>Achiras</td><td>370</td><td>Valid</td><td>L6</td><td>780.0</td><td>Fell</td><td>1902-01-01T00:00:00.000</td><td>-33.16667</td><td>-64.95000</td><td>{'type': 'Point', 'coordinates': [-64.95, -33....</td><td>NaN</td><td>NaN</td></tr>\n",
+ "  </tbody>\n",
+ "</table>\n",
+ "</div>
" + ], + "text/plain": [ + " name id nametype recclass mass fall \\\n", + "0 Aachen 1 Valid L5 21.0 Fell \n", + "1 Aarhus 2 Valid H6 720.0 Fell \n", + "2 Abee 6 Valid EH4 107000.0 Fell \n", + "3 Acapulco 10 Valid Acapulcoite 1914.0 Fell \n", + "4 Achiras 370 Valid L6 780.0 Fell \n", + "\n", + " year reclat reclong \\\n", + "0 1880-01-01T00:00:00.000 50.77500 6.08333 \n", + "1 1951-01-01T00:00:00.000 56.18333 10.23333 \n", + "2 1952-01-01T00:00:00.000 54.21667 -113.00000 \n", + "3 1976-01-01T00:00:00.000 16.88333 -99.90000 \n", + "4 1902-01-01T00:00:00.000 -33.16667 -64.95000 \n", + "\n", + " geolocation \\\n", + "0 {'type': 'Point', 'coordinates': [6.08333, 50.... \n", + "1 {'type': 'Point', 'coordinates': [10.23333, 56... \n", + "2 {'type': 'Point', 'coordinates': [-113, 54.216... \n", + "3 {'type': 'Point', 'coordinates': [-99.9, 16.88... \n", + "4 {'type': 'Point', 'coordinates': [-64.95, -33.... \n", + "\n", + " :@computed_region_cbhk_fwbd :@computed_region_nnqa_25f4 \n", + "0 NaN NaN \n", + "1 NaN NaN \n", + "2 NaN NaN \n", + "3 NaN NaN \n", + "4 NaN NaN " + ] + }, + "execution_count": 7, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Your code here:\n", + "nasa.head()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### The `value_counts()` function is commonly used in pandas to find the frequency of every value in a column.\n", + "\n", + "In the cell below, use the `value_counts()` function to determine the frequency of all types of asteroid landings by applying the function to the `fall` column." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 11, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "Fell 996\n", + "Found 4\n", + "Name: fall, dtype: int64" + ] + }, + "execution_count": 11, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Your code here:\n", + "nasa['fall'].value_counts()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Finally, let's save the dataframe as a json file again. Since we downloaded the file from an online source, the goal of saving the dataframe is to have a local copy. Save the dataframe using the `orient=records` argument and name the file `nasa.json`." + ] + }, + { + "cell_type": "code", + "execution_count": 12, + "metadata": {}, + "outputs": [], + "source": [ + "# Your code here:\n", + "nasa.to_json('nasa.json', orient='records')" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Challenge 2 - Working with CSV and Other Separated Files\n", + "\n", + "CSV files are more commonly loaded into dataframes. In the cell below, load the file from the URL provided using the `read_csv()` function in pandas. Starting with version 0.19 of pandas, you can load a csv file into a dataframe directly from a URL without having to load the file first like we did with the JSON URL. The dataset we will be using contains information about NASA shuttles. \n", + "\n", + "In the cell below, we define the column names and the URL of the data. Following this cell, read the tst file to a variable called `shuttle`. Since the file does not contain the column names, you must add them yourself using the column names declared in `cols` using the `names` argument. Additionally, this tst file is space separated, so make sure you pass `sep=' '` to the function." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 13, + "metadata": {}, + "outputs": [], + "source": [ + "# Run this code:\n", + "\n", + "cols = ['time', 'rad_flow', 'fpv_close', 'fpv_open', 'high', 'bypass', 'bpv_close', 'bpv_open', 'class']\n", + "tst_url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/statlog/shuttle/shuttle.tst'" + ] + }, + { + "cell_type": "code", + "execution_count": 14, + "metadata": {}, + "outputs": [], + "source": [ + "# Your code here:\n", + "shuttle = pd.read_csv(tst_url, names=cols, sep=' ')" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Let's verify that this worked by looking at the `head()` function." + ] + }, + { + "cell_type": "code", + "execution_count": 16, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
<div>\n",
+ "<table>\n",
+ "  <thead>\n",
+ "    <tr><th></th><th>time</th><th>rad_flow</th><th>fpv_close</th><th>fpv_open</th><th>high</th><th>bypass</th><th>bpv_close</th><th>bpv_open</th><th>class</th></tr>\n",
+ "  </thead>\n",
+ "  <tbody>\n",
+ "    <tr><th>55</th><td>0</td><td>81</td><td>0</td><td>-6</td><td>11</td><td>25</td><td>88</td><td>64</td><td>4</td></tr>\n",
+ "    <tr><th>56</th><td>0</td><td>96</td><td>0</td><td>52</td><td>-4</td><td>40</td><td>44</td><td>4</td><td>4</td></tr>\n",
+ "    <tr><th>50</th><td>-1</td><td>89</td><td>-7</td><td>50</td><td>0</td><td>39</td><td>40</td><td>2</td><td>1</td></tr>\n",
+ "    <tr><th>53</th><td>9</td><td>79</td><td>0</td><td>42</td><td>-2</td><td>25</td><td>37</td><td>12</td><td>4</td></tr>\n",
+ "    <tr><th>55</th><td>2</td><td>82</td><td>0</td><td>54</td><td>-6</td><td>26</td><td>28</td><td>2</td><td>1</td></tr>\n",
+ "  </tbody>\n",
+ "</table>\n",
+ "</div>
" + ], + "text/plain": [ + " time rad_flow fpv_close fpv_open high bypass bpv_close bpv_open \\\n", + "55 0 81 0 -6 11 25 88 64 \n", + "56 0 96 0 52 -4 40 44 4 \n", + "50 -1 89 -7 50 0 39 40 2 \n", + "53 9 79 0 42 -2 25 37 12 \n", + "55 2 82 0 54 -6 26 28 2 \n", + "\n", + " class \n", + "55 4 \n", + "56 4 \n", + "50 1 \n", + "53 4 \n", + "55 1 " + ] + }, + "execution_count": 16, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Your code here:\n", + "shuttle.head()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "To make life easier for us, let's turn this dataframe into a comma separated file by saving it using the `to_csv()` function. Save `shuttle` into the file `shuttle.csv` and ensure the file is comma separated and that we are not saving the index column." + ] + }, + { + "cell_type": "code", + "execution_count": 17, + "metadata": {}, + "outputs": [], + "source": [ + "# Your code here:\n", + "shuttle.to_csv('shuttle')" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Challenge 3 - Working with Excel Files\n", + "\n", + "We can also use pandas to convert excel spreadsheets to dataframes. Let's use the `read_excel()` function. In this case, `astronauts.xls` is in the same folder that contains this notebook. Read this file into a variable called `astronaut`. \n", + "\n", + "Note: Make sure to install the `xlrd` library if it is not yet installed." + ] + }, + { + "cell_type": "code", + "execution_count": 20, + "metadata": {}, + "outputs": [], + "source": [ + "# Your code here:\n", + "import xlrd\n", + "astronaut = pd.read_excel('astronauts.xls')" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Use the `head()` function to inspect the dataframe." + ] + }, + { + "cell_type": "code", + "execution_count": 22, + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "
<div>\n",
+ "<table>\n",
+ "  <thead>\n",
+ "    <tr><th></th><th>Name</th><th>Year</th><th>Group</th><th>Status</th><th>Birth Date</th><th>Birth Place</th><th>Gender</th><th>Alma Mater</th><th>Undergraduate Major</th><th>Graduate Major</th><th>Military Rank</th><th>Military Branch</th><th>Space Flights</th><th>Space Flight (hr)</th><th>Space Walks</th><th>Space Walks (hr)</th><th>Missions</th><th>Death Date</th><th>Death Mission</th></tr>\n",
+ "  </thead>\n",
+ "  <tbody>\n",
+ "    <tr><th>0</th><td>Joseph M. Acaba</td><td>2004.0</td><td>19.0</td><td>Active</td><td>1967-05-17</td><td>Inglewood, CA</td><td>Male</td><td>University of California-Santa Barbara; Univer...</td><td>Geology</td><td>Geology</td><td>NaN</td><td>NaN</td><td>2</td><td>3307</td><td>2</td><td>13.0</td><td>STS-119 (Discovery), ISS-31/32 (Soyuz)</td><td>NaT</td><td>NaN</td></tr>\n",
+ "    <tr><th>1</th><td>Loren W. Acton</td><td>NaN</td><td>NaN</td><td>Retired</td><td>1936-03-07</td><td>Lewiston, MT</td><td>Male</td><td>Montana State University; University of Colorado</td><td>Engineering Physics</td><td>Solar Physics</td><td>NaN</td><td>NaN</td><td>1</td><td>190</td><td>0</td><td>0.0</td><td>STS 51-F (Challenger)</td><td>NaT</td><td>NaN</td></tr>\n",
+ "    <tr><th>2</th><td>James C. Adamson</td><td>1984.0</td><td>10.0</td><td>Retired</td><td>1946-03-03</td><td>Warsaw, NY</td><td>Male</td><td>US Military Academy; Princeton University</td><td>Engineering</td><td>Aerospace Engineering</td><td>Colonel</td><td>US Army (Retired)</td><td>2</td><td>334</td><td>0</td><td>0.0</td><td>STS-28 (Columbia), STS-43 (Atlantis)</td><td>NaT</td><td>NaN</td></tr>\n",
+ "    <tr><th>3</th><td>Thomas D. Akers</td><td>1987.0</td><td>12.0</td><td>Retired</td><td>1951-05-20</td><td>St. Louis, MO</td><td>Male</td><td>University of Missouri-Rolla</td><td>Applied Mathematics</td><td>Applied Mathematics</td><td>Colonel</td><td>US Air Force (Retired)</td><td>4</td><td>814</td><td>4</td><td>29.0</td><td>STS-41 (Discovery), STS-49 (Endeavor), STS-61 ...</td><td>NaT</td><td>NaN</td></tr>\n",
+ "    <tr><th>4</th><td>Buzz Aldrin</td><td>1963.0</td><td>3.0</td><td>Retired</td><td>1930-01-20</td><td>Montclair, NJ</td><td>Male</td><td>US Military Academy; MIT</td><td>Mechanical Engineering</td><td>Astronautics</td><td>Colonel</td><td>US Air Force (Retired)</td><td>2</td><td>289</td><td>2</td><td>8.0</td><td>Gemini 12, Apollo 11</td><td>NaT</td><td>NaN</td></tr>\n",
+ "  </tbody>\n",
+ "</table>\n",
+ "</div>
" + ], + "text/plain": [ + " Name Year Group Status Birth Date Birth Place Gender \\\n", + "0 Joseph M. Acaba 2004.0 19.0 Active 1967-05-17 Inglewood, CA Male \n", + "1 Loren W. Acton NaN NaN Retired 1936-03-07 Lewiston, MT Male \n", + "2 James C. Adamson 1984.0 10.0 Retired 1946-03-03 Warsaw, NY Male \n", + "3 Thomas D. Akers 1987.0 12.0 Retired 1951-05-20 St. Louis, MO Male \n", + "4 Buzz Aldrin 1963.0 3.0 Retired 1930-01-20 Montclair, NJ Male \n", + "\n", + " Alma Mater Undergraduate Major \\\n", + "0 University of California-Santa Barbara; Univer... Geology \n", + "1 Montana State University; University of Colorado Engineering Physics \n", + "2 US Military Academy; Princeton University Engineering \n", + "3 University of Missouri-Rolla Applied Mathematics \n", + "4 US Military Academy; MIT Mechanical Engineering \n", + "\n", + " Graduate Major Military Rank Military Branch Space Flights \\\n", + "0 Geology NaN NaN 2 \n", + "1 Solar Physics NaN NaN 1 \n", + "2 Aerospace Engineering Colonel US Army (Retired) 2 \n", + "3 Applied Mathematics Colonel US Air Force (Retired) 4 \n", + "4 Astronautics Colonel US Air Force (Retired) 2 \n", + "\n", + " Space Flight (hr) Space Walks Space Walks (hr) \\\n", + "0 3307 2 13.0 \n", + "1 190 0 0.0 \n", + "2 334 0 0.0 \n", + "3 814 4 29.0 \n", + "4 289 2 8.0 \n", + "\n", + " Missions Death Date Death Mission \n", + "0 STS-119 (Discovery), ISS-31/32 (Soyuz) NaT NaN \n", + "1 STS 51-F (Challenger) NaT NaN \n", + "2 STS-28 (Columbia), STS-43 (Atlantis) NaT NaN \n", + "3 STS-41 (Discovery), STS-49 (Endeavor), STS-61 ... NaT NaN \n", + "4 Gemini 12, Apollo 11 NaT NaN " + ] + }, + "execution_count": 22, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Your code here:\n", + "astronaut.head()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Use the `value_counts()` function to find the most popular undergraduate major among all astronauts." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 26, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "'Physics'" + ] + }, + "execution_count": 26, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Your code here:\n", + "astronaut['Undergraduate Major'].value_counts().idxmax()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Due to all the commas present in the cells of this file, let's save it as a tab separated csv file. In the cell below, save `astronaut` as a tab separated file using the `to_csv` function. Call the file `astronaut.csv` and remember to remove the index column." + ] + }, + { + "cell_type": "code", + "execution_count": 27, + "metadata": {}, + "outputs": [], + "source": [ + "# Your code here:\n", + "astronaut.to_csv('astronaut.csv', sep='\\t', index=False)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Bonus Challenge - Fertility Dataset\n", + "\n", + "Visit the following [URL](https://archive.ics.uci.edu/ml/datasets/Fertility) and retrieve the dataset as well as the column headers. Determine the correct separator and read the file into a variable called `fertility`. Examine the dataframe using the `head()` function." + ] + }, + { + "cell_type": "code", + "execution_count": 15, + "metadata": {}, + "outputs": [], + "source": [ + "# Your code here:\n", + "\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.9.12" + } + }, + "nbformat": 4, + "nbformat_minor": 4 +}
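
Note on the `orient=records` step in Challenge 1: when `orient` is not passed, `DataFrame.to_json` writes a dataframe in column orientation (one object per column, keyed by row label), whereas `orient='records'` writes a list with one object per row. The sketch below only imitates the two layouts with the standard `json` module so the difference is visible without pandas or a network connection. The two toy rows and the variable names `records_layout` and `columns_layout` are illustrative assumptions, not part of the notebook.

```python
import json

# Two toy rows standing in for the meteorite table
# (values copied from the head() output in the notebook).
rows = [
    {"name": "Aachen", "id": 1, "fall": "Fell"},
    {"name": "Aarhus", "id": 2, "fall": "Fell"},
]

# Record orientation: a JSON array holding one object per row.
records_layout = json.dumps(rows)

# Column orientation: one object per column, each keyed by the row label.
columns_layout = json.dumps(
    {key: {str(i): row[key] for i, row in enumerate(rows)} for key in rows[0]}
)

print(records_layout)
print(columns_layout)
```

The record layout is usually the easier one to hand to other tools, since each array element is a complete row.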