Data to help find reach, match, and safety schools

I downloaded this data from IPEDS for schools with certain Carnegie Classifications and grad rates above a certain cutoff. The result was data for 858 schools. Not all of the schools reported SAT or ACT scores (154 did not). They apparently decided to live in an alternative universe where test scores don’t matter (and objects fall up instead of down). Nevertheless, I did not want to exclude them from the data, so I chose to sort by grad rate instead of test scores. For the 704 schools that did report, the correlation between SAT (CR + Math) midpoint and grad rate was +.83, which is very high but not perfect. The correlation dropped to +.74 for the top 200 SAT schools and to +.59 for the top 50. I think this is an artifact of what statisticians call “truncated or restricted range”: correlations are not accurate reflections of an entire population of schools when you limit the range of scores on one of the variables.
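The restricted-range effect is easy to demonstrate with a quick simulation. The sketch below uses entirely synthetic data (not the IPEDS figures): it generates 858 hypothetical schools whose SAT midpoints and grad rates correlate around +.8 overall, then recomputes the correlation within only the top 200 SAT schools. The restricted correlation comes out noticeably lower, purely because the SAT range was narrowed.

```python
import numpy as np

# Synthetic illustration of restricted range -- these are NOT the real IPEDS numbers.
rng = np.random.default_rng(0)
n = 858

# Hypothetical SAT (CR + Math) midpoints and a grad-rate proxy with true r ~ 0.8.
sat = rng.normal(1200, 150, n)
grad = 0.8 * (sat - sat.mean()) / sat.std() + rng.normal(0.0, 0.6, n)

# Correlation across all 858 simulated schools.
full_r = np.corrcoef(sat, grad)[0, 1]

# Restrict to the top 200 schools by SAT and recompute.
top = np.argsort(sat)[-200:]
restricted_r = np.corrcoef(sat[top], grad[top])[0, 1]

print(f"full-sample r = {full_r:.2f}, top-200 r = {restricted_r:.2f}")
```

The drop happens because cutting off the bottom of the SAT distribution removes most of the between-school variance in SAT while leaving the noise in grad rates intact, which mirrors why the +.83 figure shrinks to +.74 and then +.59 as the sample is limited to ever-higher-scoring schools.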

US News publishes a statistic they call “under/overperformance”. It measures how far the actual grad rate falls below or above the grad rate predicted by their equation, which, I think, includes test scores, academic expenditures per student, and public versus private control. I believe underperformance reflects negatively on a school whether or not it has a large engineering/STEM program. Caltech, for example, should do a better job of supporting the perennial best freshman class in the country. They don’t deserve the students they get. It doesn’t matter that science and math are hard subjects in some respects. The faculty should devote more effort to their teaching and less to their research.
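The basic idea behind under/overperformance can be sketched in a few lines: fit a model predicting grad rate, then look at the residual (actual minus predicted), with negative residuals meaning underperformance. The example below is purely illustrative, with made-up numbers and a single predictor (SAT midpoint); the actual US News model uses more variables and is not public.

```python
import numpy as np

# Made-up data for six hypothetical schools; the real US News model
# uses more predictors (expenditures, public/private control, etc.).
sat_mid = np.array([1050, 1150, 1250, 1350, 1450, 1550])
grad_rate = np.array([62, 70, 79, 84, 93, 88])  # the last school lags its peers

# Simple linear fit: predicted grad rate as a function of SAT midpoint.
slope, intercept = np.polyfit(sat_mid, grad_rate, 1)
predicted = slope * sat_mid + intercept

# Residual: negative = underperformance, positive = overperformance.
residual = grad_rate - predicted

for s, r in zip(sat_mid, residual):
    print(f"SAT {s}: {r:+.1f}")
```

In this toy data the highest-SAT school gets a negative residual even though its raw grad rate is high, which is exactly the pattern behind the Caltech complaint: the grad rate is good in absolute terms but below what its entering class would predict.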

By the way, tech schools are not the only ones with lower than expected grad rates. Check out Grinnell, Illinois, NYU, Scripps, Oberlin, Bryn Mawr, Tulane, Binghamton, and so on.