Schools that spend $585-615 per student have higher reading, math, and overall passing rates than schools that spend more than $615 per student. The schools that spend less per student also tend to be smaller, with 2,000 students or fewer.
Schools with 2,000 students or fewer also have higher reading, math, and overall passing rates than larger schools with 2,000-5,000 students. Furthermore, the smaller the school, the less it spends per student.
Charter schools have higher reading, math, and overall passing rates than District schools. The top five schools by overall passing rate are all Charter schools, whereas the bottom five are all District schools.
In all 15 schools, the Average Reading Score is higher than the Average Math Score.
Lastly, for the top 5 performing schools, the percent passing math is about 20 percent lower than the percent passing reading.
# Dependencies
import pandas as pd
import numpy as np
# Paths to the CSV files
schools_complete = "Resources/schools_complete.csv"
students_complete = "Resources/students_complete.csv"
# Read each CSV file into a DataFrame
schools_complete_df = pd.read_csv(schools_complete)
students_complete_df = pd.read_csv(students_complete)
schools_complete_df.head()
students_complete_df.head()
# Total number of schools
total_schools = len(schools_complete_df['School ID'])
# Total number of students
total_students = schools_complete_df['size'].sum()
# Total budget
total_budget = schools_complete_df['budget'].sum()
# Average math score
average_math_score = round(students_complete_df['math_score'].mean(), 6)
# Average reading score
average_reading_score = round(students_complete_df['reading_score'].mean(), 5)
# Percentage of students passing math (a score of 70 or above counts as passing,
# the same threshold used in the per-school calculations)
students_passing_math = students_complete_df.loc[students_complete_df["math_score"] >= 70, :]
percent_passing_math = round(float(students_passing_math['math_score'].count() / total_students) * 100, 6)
# Percentage of students passing reading (same threshold)
students_passing_reading = students_complete_df.loc[students_complete_df["reading_score"] >= 70, :]
percent_passing_reading = round(float(students_passing_reading["reading_score"].count() / total_students) * 100, 6)
# Overall passing rate: the average of the math and reading passing percentages
overall_passing_rate = round((percent_passing_math + percent_passing_reading) / 2, 6)
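The passing percentages can also be computed directly from a boolean mask, since the mean of a boolean Series is the fraction of `True` values. The sketch below uses a small made-up frame (`toy_students` and its scores are illustrative, not the real data) and assumes a passing threshold of 70. It also shows the share of students who pass both subjects, which is a different measure from averaging the two percentages.

```python
import pandas as pd

# Toy data standing in for students_complete_df (values are made up).
toy_students = pd.DataFrame({
    "math_score": [65, 70, 88, 92],
    "reading_score": [71, 69, 90, 70],
})

PASSING_SCORE = 70  # assumed threshold

# Boolean masks: True where the student passes the subject.
passed_math = toy_students["math_score"] >= PASSING_SCORE
passed_reading = toy_students["reading_score"] >= PASSING_SCORE

# The mean of a boolean Series is the fraction of True values.
pct_math = passed_math.mean() * 100
pct_reading = passed_reading.mean() * 100

# Average of the two percentages (the definition used in this notebook).
pct_overall_avg = (pct_math + pct_reading) / 2

# Share of students passing both subjects (an alternative definition).
pct_both = (passed_math & passed_reading).mean() * 100

print(pct_math, pct_reading, pct_overall_avg, pct_both)
```

On the toy data the two "overall" measures differ, so it is worth stating which one a report uses.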
# Create a district summary DataFrame
summary_df = pd.DataFrame({"Total Schools": [total_schools],
                           "Total Students": [total_students],
                           "Total Budget": [total_budget],
                           "Average Math Score": [average_math_score],
                           "Average Reading Score": [average_reading_score],
                           "Percent Passing Math": [percent_passing_math],
                           "Percent Passing Reading": [percent_passing_reading],
                           "Percent Overall Passing Rate": [overall_passing_rate]})
# summary_df
# Put the columns in the right order
district_summary_df = pd.DataFrame(summary_df, columns=["Total Schools",
                                                        "Total Students",
                                                        "Total Budget",
                                                        "Average Math Score",
                                                        "Average Reading Score",
                                                        "Percent Passing Math",
                                                        "Percent Passing Reading",
                                                        "Percent Overall Passing Rate"])
# Format the student count with thousands separators and the budget as currency
district_summary_df["Total Students"] = district_summary_df["Total Students"].map('{:,}'.format)
district_summary_df["Total Budget"] = district_summary_df["Total Budget"].map('${:,.2f}'.format)
district_summary_df
# Rename the "name" column of the schools DataFrame to "school"
schools_complete_df = schools_complete_df.rename(columns={"name": "school"})
# schools_complete_df.columns
# Merge the two DataFrames on the "school" column
schools_complete_data_df = pd.merge(schools_complete_df, students_complete_df, on='school')
# testing
# schools_complete_data_df.head()
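As a sanity check on the merge, the sketch below joins two small made-up frames on `school` and asks pandas to verify that each school name appears only once on the schools side. The frames, names, and values are illustrative; `validate="one_to_many"` is an optional argument of `pd.merge` that raises an error if the left keys are not unique.

```python
import pandas as pd

# Toy stand-ins for the schools and students frames (values are made up).
toy_schools = pd.DataFrame({
    "school": ["Alpha High", "Beta High"],
    "type": ["Charter", "District"],
    "size": [2, 1],
    "budget": [1200, 650],
})
toy_students = pd.DataFrame({
    "name": ["Student A", "Student B", "Student C"],
    "school": ["Alpha High", "Alpha High", "Beta High"],
    "math_score": [80, 60, 75],
})

# One school row matches many student rows; validate raises if a school repeats.
merged = pd.merge(toy_schools, toy_students, on="school", validate="one_to_many")

# Every student row now carries its school's type, size, and budget.
print(merged)
```

After the merge, school-level columns such as `budget` are repeated on every student row, which matters when aggregating them per school.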
# Count the number of students in each school
student_count = schools_complete_data_df['school'].value_counts()
# testing
# student_count.head()
# Get each school's type as a string (unique() returns a one-element array per school)
school_type = schools_complete_data_df.groupby('school')['type'].unique()
school_type = school_type.str[0]
# testing
# school_type
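# Total budget for each school
# (the budget is repeated on every student row, so take the first value;
# count() would return the number of student rows instead)
budget_per_school = schools_complete_data_df.groupby('school')['budget'].first()
# budget_per_school.head()
# testing
# budget_per_school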
# Budget per student for each school
budget_per_student = schools_complete_df.set_index('school')['budget'] / schools_complete_df.set_index('school')['size']
# budget_per_student.head()
# testing
# budget_per_student
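Because the merged frame repeats each school's budget on every student row, a grouped `first()` recovers the school's budget, whereas a grouped `count()` gives the number of student rows. The sketch below illustrates the difference on a made-up frame (`toy_merged` and its values are hypothetical).

```python
import pandas as pd

# Toy merged frame: one row per student, school-level columns repeated.
toy_merged = pd.DataFrame({
    "school": ["Alpha High", "Alpha High", "Beta High"],
    "budget": [1200, 1200, 650],
    "size": [2, 2, 1],
})

grouped = toy_merged.groupby("school")

# first() returns the (repeated) school-level value once per school.
budget_first = grouped["budget"].first()

# count() returns how many student rows each school has.
row_count = grouped["budget"].count()

# Per-student budget from the school-level values.
per_student = budget_first / grouped["size"].first()

print(budget_first, row_count, per_student, sep="\n")
```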
# Average math and reading scores for each school
school_average_math = round(schools_complete_data_df.groupby('school')['math_score'].mean(), 2)
school_average_reading = round(schools_complete_data_df.groupby('school')['reading_score'].mean(), 2)
# DataFrames containing only passing scores (70 or above)
passing_df = schools_complete_data_df.loc[(schools_complete_data_df['math_score'] >= 70)
                                          & (schools_complete_data_df['reading_score'] >= 70)]
passing_math_df = schools_complete_data_df.loc[(schools_complete_data_df['math_score'] >= 70)]
passing_reading_df = schools_complete_data_df.loc[(schools_complete_data_df['reading_score'] >= 70)]
# testing
# passing_math_df.head()
# passing_reading_df.head()
# Percentage of students passing math and reading in each school
percent_passing_math = round((passing_math_df.groupby('school')['math_score'].count() / student_count) * 100, 1)
percent_passing_reading = round((passing_reading_df.groupby('school')['reading_score'].count() / student_count) * 100, 1)
# percent_passing_reading
# percent_passing_math
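The per-school Series above are the ingredients of a school summary table. The cell that builds `school_summary_df` is not shown here, so the following is only a sketch of how such a table might be assembled, using made-up data and the column names that later cells refer to. The name `school_summary_sketch` and all values are hypothetical.

```python
import pandas as pd

# Toy merged frame: one row per student (values are made up).
toy = pd.DataFrame({
    "school": ["Alpha High", "Alpha High", "Beta High", "Beta High"],
    "type": ["Charter", "Charter", "District", "District"],
    "math_score": [80, 60, 75, 90],
    "reading_score": [85, 72, 70, 95],
})

by_school = toy.groupby("school")

# Fraction of passing students per school, from boolean masks (threshold 70).
pct_math = (toy["math_score"] >= 70).groupby(toy["school"]).mean() * 100
pct_reading = (toy["reading_score"] >= 70).groupby(toy["school"]).mean() * 100

# All Series share the school name as index, so they align automatically.
school_summary_sketch = pd.DataFrame({
    "School Type": by_school["type"].first(),
    "Average Math Score": by_school["math_score"].mean(),
    "Average Reading Score": by_school["reading_score"].mean(),
    "% Passing Math": pct_math,
    "% Passing Reading": pct_reading,
})
school_summary_sketch["% Overall Passing Rate"] = (
    school_summary_sketch["% Passing Math"] + school_summary_sketch["% Passing Reading"]
) / 2

# Rank schools by overall passing rate, best first.
ranked = school_summary_sketch.sort_values("% Overall Passing Rate", ascending=False)
print(ranked)
```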
# The top-performing schools (only the top 5 are displayed)
top_performing_schools = school_summary_df.sort_values("% Overall Passing Rate", ascending=False, inplace=False)
top_performing_schools.head()
from pandas.api.types import CategoricalDtype
# Reset the grade order in the original students DataFrame "students_complete_df"
students_complete_df["grade"] = students_complete_df['grade'].astype(CategoricalDtype(["9th", "10th", "11th", "12th"]))
# students_complete_df
# Average math score by school and grade
math_scores_grade = round(students_complete_df.pivot_table(index="school", columns="grade", values="math_score"), 2)
math_scores_grade.index.name = None
math_scores_grade
# Reset the grade order in the original students DataFrame "students_complete_df"
students_complete_df['grade'] = students_complete_df['grade'].astype(CategoricalDtype(["9th", "10th", "11th", "12th"]))
# students_complete_df
# Average reading score by school and grade
reading_scores_grade = round(students_complete_df.pivot_table(index="school", columns="grade", values="reading_score"), 2)
reading_scores_grade.index.name = None
reading_scores_grade
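The reason for converting `grade` to a categorical type is column order: as plain strings the grades sort lexically ("10th" before "9th"), while a categorical with an explicit category list sorts in the listed order. The sketch below shows the effect on a made-up frame (the data is illustrative, and `ordered=True` and `observed=False` are choices made for this example).

```python
import pandas as pd
from pandas.api.types import CategoricalDtype

# Toy student rows, one per grade (values are made up).
toy = pd.DataFrame({
    "school": ["Alpha High"] * 4,
    "grade": ["10th", "9th", "12th", "11th"],
    "math_score": [70, 80, 90, 85],
})

# Plain strings sort lexically, which puts "9th" last.
lexical_columns = list(toy.pivot_table(index="school", columns="grade",
                                       values="math_score").columns)

# A categorical dtype with explicit categories fixes the order.
grade_order = CategoricalDtype(["9th", "10th", "11th", "12th"], ordered=True)
toy["grade"] = toy["grade"].astype(grade_order)
by_grade = toy.pivot_table(index="school", columns="grade",
                           values="math_score", observed=False)

print(lexical_columns)
print(list(by_grade.columns))
```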
# Repeat the above breakdown, but this time group schools based on school type (Charter vs. District)
scores_school_type = school_summary_df[["School Type", "Average Math Score",
                                        "Average Reading Score",
                                        "% Passing Math",
                                        "% Passing Reading",
                                        "% Overall Passing Rate"]]
scores_school_type = scores_school_type.groupby('School Type').mean()
scores_school_type
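The summary at the top also compares schools by spending range, but the cells for that breakdown are not shown here. One common way to do it is to bin the per-student budget with `pd.cut` and then group by the bin, as sketched below. The bin edges, labels, column name `Per Student Budget`, and all values are assumptions made for illustration, not the notebook's actual settings or results.

```python
import pandas as pd

# Toy school summary (values are made up).
toy_summary = pd.DataFrame({
    "Per Student Budget": [578.0, 600.0, 630.0, 652.0],
    "% Overall Passing Rate": [95.0, 94.0, 80.0, 74.0],
}, index=["School A", "School B", "School C", "School D"])

# Hypothetical bin edges and labels; intervals are closed on the right.
spending_bins = [0, 585, 615, 645, 675]
spending_labels = ["<$585", "$585-615", "$615-645", "$645-675"]

toy_summary["Spending Range"] = pd.cut(
    toy_summary["Per Student Budget"], bins=spending_bins, labels=spending_labels
)

# Average passing rate within each spending range.
by_spending = toy_summary.groupby("Spending Range", observed=True)[
    "% Overall Passing Rate"
].mean()
print(by_spending)
```

The same pattern applies to the school-size comparison by binning a size column instead.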