Skip to content

Created and manipulated Pandas DataFrames to analyze school and standardized test data

Notifications You must be signed in to change notification settings

Salllym/PyCitySchool_Analysis

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

14 Commits
 
 
 
 
 
 
 
 

Repository files navigation

PyCitySchool

Pandas is a powerful data analysis Python library and I've used the Pandas Framework to analyze datasets on schools and students. I've also used the Pandas library to make strategic decisions regarding future school budgets and priorities along with analyzing the district-wide standardized test results, specifically reading and math scores of students to showcase obvious trends in school performance.

  • Pandas and Jupyter notebook was used to write the code and create dataframes.

The final report includes the following:

District Summary

A high-level snapshot, in a DataFrame, of the district's key metrics, including the following:

  • Total schools
  • Total students
  • Total budget
  • Average math score
  • Average reading score
  • % passing math (the percentage of students who passed math)
  • % passing reading (the percentage of students who passed reading)
  • % overall passing (the percentage of students who passed math AND reading)
Total Schools Total Students Total Budget Average Math Score Average Reading Score % Passing Math % Passing Reading % Overall Passing
0 15 39170 $24,649,428.00 78.985371 81.87784 74.980853 85.805463 65.172326

School Summary

A DataFrame that summarizes key metrics about each school, including the following:

  • School name
  • School type
  • Total students
  • Total school budget
  • Per student budget
  • Average math score
  • Average reading score
  • % passing math (the percentage of students who passed math)
  • % passing reading (the percentage of students who passed reading)
  • % overall passing (the percentage of students who passed math AND reading)
School Type Total Students Total School Budget Per Student Budget Average Math Score Average Reading Score % Passing Math % Passing Reading % Overall Passing
Bailey High School District 4976 $3,124,928.00 $628.00 77.048432 81.033963 66.680064 81.933280 54.642283
Cabrera High School Charter 1858 $1,081,356.00 $582.00 83.061895 83.975780 94.133477 97.039828 91.334769
Figueroa High School District 2949 $1,884,411.00 $639.00 76.711767 81.158020 65.988471 80.739234 53.204476
Ford High School District 2739 $1,763,916.00 $644.00 77.102592 80.746258 68.309602 79.299014 54.289887
Griffin High School Charter 1468 $917,500.00 $625.00 83.351499 83.816757 93.392371 97.138965 90.599455

Highest-Performing Schools (by % Overall Passing)

A DataFrame that highlights the top 5 performing schools based on % Overall Passing. Include the following metrics:

  • School name
  • School type
  • Total students
  • Total school budget
  • Per student budget
  • Average math score
  • Average reading score
  • % passing math (the percentage of students who passed math)
  • % passing reading (the percentage of students who passed reading)
  • % overall passing (the percentage of students who passed math AND reading)
School Type Total Students Total School Budget Per Student Budget Average Math Score Average Reading Score % Passing Math % Passing Reading % Overall Passing
Cabrera High School Charter 1858 $1,081,356.00 $582.00 83.061895 83.975780 94.133477 97.039828 91.334769
Thomas High School Charter 1635 $1,043,130.00 $638.00 83.418349 83.848930 93.272171 97.308869 90.948012
Griffin High School Charter 1468 $917,500.00 $625.00 83.351499 83.816757 93.392371 97.138965 90.599455
Wilson High School Charter 2283 $1,319,574.00 $578.00 83.274201 83.989488 93.867718 96.539641 90.582567
Pena High School Charter 962 $585,858.00 $609.00 83.839917 84.044699 94.594595 95.945946 90.540541

Lowest-Performing Schools (by % Overall Passing)

A DataFrame that highlights the bottom 5 performing schools based on % Overall Passing. Include the following metrics:

  • School name
  • School type
  • Total students
  • Total school budget
  • Per student budget
  • Average math score
  • Average reading score
  • % passing math (the percentage of students who passed math)
  • % passing reading (the percentage of students who passed reading)
  • % overall passing (the percentage of students who passed math AND reading)
School Type Total Students Total School Budget Per Student Budget Average Math Score Average Reading Score % Passing Math % Passing Reading % Overall Passing
Rodriguez High School District 3999 $2,547,363.00 $637.00 76.842711 80.744686 66.366592 80.220055 52.988247
Figueroa High School District 2949 $1,884,411.00 $639.00 76.711767 81.158020 65.988471 80.739234 53.204476
Huang High School District 2917 $1,910,635.00 $655.00 76.629414 81.182722 65.683922 81.316421 53.513884
Hernandez High School District 4635 $3,022,020.00 $652.00 77.289752 80.934412 66.752967 80.862999 53.527508
Johnson High School District 4761 $3,094,650.00 $650.00 77.072464 80.966394 66.057551 81.222432 53.539172

Math Scores by Grade

A DataFrame that lists the average math score for students of each grade level (9th, 10th, 11th, 12th) at each school.

9th 10th 11th 12th
Bailey High School 77.083676 76.996772 77.515588 76.492218
Cabrera High School 83.094697 83.154506 82.765560 83.277487
Figueroa High School 76.403037 76.539974 76.884344 77.151369
Ford High School 77.361345 77.672316 76.918058 76.179963
Griffin High School 82.044010 84.229064 83.842105 83.356164

Reading Scores by Grade

A DataFrame that lists the average reading score for students of each grade level (9th, 10th, 11th, 12th) at each school.

9th 10th 11th 12th
Bailey High School 81.303155 80.907183 80.945643 80.912451
Cabrera High School 83.676136 84.253219 83.788382 84.287958
Figueroa High School 81.198598 81.408912 80.640339 81.384863
Ford High School 80.632653 81.262712 80.403642 80.662338
Griffin High School 83.369193 83.706897 84.288089 84.013699

Scores by School Spending

A table that breaks down school performance based on average spending ranges (per student). Use your judgment to create four bins with reasonable cutoff values to group school spending. Include the following metrics in the table:

  • Average math score
  • Average reading score
  • % passing math (the percentage of students who passed math)
  • % passing reading (the percentage of students who passed reading)
  • % overall passing (the percentage of students who passed math AND reading)
Average Math Score Average Reading Score % Passing Math % Passing Reading % Overall Passing
Spending Ranges (Per Student)
<$585 83.46 83.93 93.46 96.61 90.37
$585-630 81.90 83.16 87.13 92.72 81.42
$630-645 78.52 81.62 73.48 84.39 62.86
$645-680 77.00 81.03 66.16 81.13 53.53

Scores by School Size

A table that breaks down school performance based on school size (small, medium, or large).

Average Math Score Average Reading Score % Passing Math % Passing Reading % Overall Passing
School Size
Small (<1000) 83.821598 83.929843 93.550225 96.099437 89.883853
Medium (1000-2000) 83.374684 83.864438 93.599695 96.790680 90.621535
Large (2000-5000) 77.746417 81.344493 69.963361 82.766634 58.286003

Scores by School Type

A table that breaks down school performance based on type of school (district or charter).

Average Math Score Average Reading Score % Passing Math % Passing Reading % Overall Passing
School Type
Charter 83.473852 83.896421 93.620830 96.586489 90.432244
District 76.956733 80.966636 66.548453 80.799062 53.672208

Analysis

  • The dataset has a total of 15 schools and 39,170 students. The percent of students passing reading (85.80%) and their average reading score of 91.87 is higher than the percent of students passing math (74.98%) and their average math score of 78.98, which reveals that students are doing better in reading than math.

  • The school summary shows that from the 15 schools, 7 district schools have more students than the 8 charter schools which can mean that the district schools’ geographical coverage can be a reason for their larger student population/ size.

  • Based on the performance of spending ranges per student, students have a higher overall passing rate of 90.37 percent in the lower spending ranges (<$585).

  • Small and medium sized schools performed better than large sized schools on passing math. There's 93.55 percent in small sized schools and 93.59 percent in medium sized schools passing math while there's only 69.96 percent in large sized schools passing math.

  • The data shows that the Charter Schools are scoring better than the District Schools across all metrics (average, percent, and overall). Pena High School has the average math score (83.839917) with (94.59%) passing math and average reading score (84.044699) with (95.94%) passing reading while Huang High School has the average math score (76.629414) with (65.68%) passing math and average reading score (81.182722) with (81.31%) passing reading. Since Pena High School is a charter school with better scores and smaller student population while Huang High School is a district school with lower scores and larger student population, this can mean that schools with larger student population/ size negatively influences student achievements.

About

Created and manipulated Pandas DataFrames to analyze school and standardized test data

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published