How to calculate a university's "Peer Assessment" score

<p>The US News Peer Assessment score is an intuitive rating of a school by administrators at similar schools. I have used a statistical procedure called multiple regression to find a mathematical “solution” that mimics the subjective peer ratings very closely. It uses a combination of factors from US News and from the NRC ratings of faculty scholarship for the physics, English, and psychology departments (representing the sciences, humanities, and social sciences).</p>

<p>For those who care, the R-square is .94 and the Multiple R is .97, where 1 would be perfect correspondence to the actual US News Peer Assessment score.</p>

<p>The enigmatic formula is as follows:</p>

<p>estimated peer assessment = -1.19636
+ (.03042 * classes over 50)
+ (.07052 * NRC mean for physics, English, psych)
- (.00122 * financial rank)
+ (.0000005199697 * percent full-time faculty cubed)
- (.00002286 * classes over 50 cubed)
+ (.00254 * SAT 75th percentile)
+ (.0000007890567 * actual graduation rate cubed)
- (.16392 * NRC physics rating)
+ (.000000553739 * acceptance rate cubed)</p>

<p>It produces the following estimates of peer assessment scores which are compared with the actual US News Peer Assessment scores.</p>

<p>rank, school, estimated peer assessment, actual US News Peer Assessment score
1 Harvard 5.09 4.9
2 Massachu 5.00 4.9
3 Stanford 4.95 4.9
4 Yale Uni 4.87 4.8
5 Princeto 4.78 4.9
6 Cornell 4.70 4.6
7 Columbia 4.58 4.6
8 Pennsylv 4.54 4.5
9 Duke Uni 4.51 4.4
10 Cal Inst 4.51 4.7
11 Johns Ho 4.49 4.6
12 Cal—Berk 4.49 4.8
13 Brown Un 4.46 4.4
14 Virginia 4.44 4.3
15 Cal—Los 4.33 4.2
16 Columbia 4.31 4.6
17 Northwes 4.25 4.3
18 Michigan 4.24 4.5
19 Rice Uni 4.15 4.0
20 Illinois 4.13 4.0
21 Notre Da 4.11 3.9
22 Washingt 4.11 4.1
23 Dartmout 4.10 4.3
24 Carnegie 4.09 4.2
25 Vanderbi 3.99 4.0
26 Emory Un 3.93 4.0
27 North Ca 3.91 4.2
28 Georgia 3.91 4.0
29 Texas—Au 3.89 4.1
30 Wisconsi 3.88 4.1
31 Southern 3.87 4.0
32 Cal—San 3.86 3.8
33 U Washingt 3.85 3.9
34 Florida 3.85 3.6
35 Georgeto 3.81 4.0
36 Brandeis 3.81 3.6
37 Renssela 3.79 3.5
38 Cal—Irvi 3.76 3.6
39 New York 3.72 3.8
40 Pennsylv 3.72 3.8
41 Rocheste 3.71 3.4
42 Coll Wm Mary 3.69 3.7
43 Rutgers- 3.68 3.4
44 Case Wes 3.67 3.5
45 Tufts Un 3.64 3.6
46 Indiana 3.62 3.7
47 Cal—Sant 3.62 3.5
48 Maryland 3.61 3.6
49 Ohio Sta 3.61 3.7
50 Cal—Davi 3.60 3.8
51 Pittsbur 3.58 3.4
52 Texas A& 3.57 3.6
53 Purdue U 3.57 3.8
54 Minnesot 3.49 3.7
55 Boston C 3.48 3.6
56 Boston U 3.47 3.4
57 Colorado 3.43 3.5
58 Virginia 3.43 3.4
59 Massachu 3.43 3.3
60 Iowa 3.42 3.6
61 Clarkson 3.41 2.6
62 Cal—Sant 3.41 3.2
63 Missouri 3.41 2.7
64 Arizona 3.41 3.6
65 Missouri 3.40 3.3
66 Michigan 3.37 3.5
67 Iowa Sta 3.36 3.2
68 Georgia 3.35 3.5
69 Delaware 3.32 3.1
70 Worceste 3.29 2.8
71 Wake For 3.29 3.5
72 Miami (F 3.26 3.2
73 Cal—Rive 3.26 3.1
74 Kansas 3.25 3.4
75 Connecti 3.25 3.2
76 Oregon 3.24 3.3
77 Miami Un 3.23 3.3
78 Arizona 3.22 3.3
79 Lehigh U 3.20 3.2
80 George W 3.20 3.4
81 Tulane U 3.19 3.3
82 SUNY—Bin 3.18 3.0
83 Nebraska 3.16 3.2
84 Colorado 3.16 2.9
85 Syracuse 3.16 3.4
86 Oklahoma 3.14 3.0
87 SUNY—Sto 3.14 3.2
88 Brigham 3.12 2.9
89 Colorado 3.12 3.1
90 Tennesse 3.11 3.1
91 Kentucky 3.09 3.0
92 Clemson 3.09 3.1
93 Vermont 3.09 3.0
94 Universi 3.07 3.1
95 New Hamp 3.02 2.9
96 Michigan 3.01 2.7
97 Vermont 3.01 3.0
98 Howard U 3.01 2.9
99 Stevens 2.99 2.7
100 St. Loui 2.99 2.9
101 Southern 2.98 3.1
102 Clark Un 2.97 2.8
103 UC San D 2.96 2.7
104 Baylor U 2.95 3.2
105 Washingt St 2.95 3.0
106 Arkansas 2.94 2.8
107 Kansas S 2.93 2.9
108 South Ca 2.91 2.9
109 Auburn U 2.90 3.1
110 Alabama 2.89 3.0
111 Loyola U 2.89 2.9
112 Drexel U 2.89 2.9
113 American 2.87 2.9
114 Florida 2.86 3.0
115 Ohio Uni 2.80 3.0
116 Texas Ch 2.78 2.7
117 Denver 2.76 2.7
118 SUNY Col ESF 2.68 2.7
119 Catholic 2.66 2.8</p>

<p>Interesting. I always wanted to know what “peer assessment” really meant.</p>

<p>What is “NRC mean” in “NRC mean for psych, phys, english”?</p>

<p>Wow, that’s a really interesting model.</p>

<p>Oops, here is the corrected formula. I discovered an error after my time limit for editing expired. So sorry.</p>

<p>The NRC mean of physics, English, and psychology should have been squared. I put the correction in caps.</p>

<p>The NRC mean of physics, English, and psych is the average of the three ratings of faculty scholarship by the National Research Council.</p>

<p>The enigmatic formula is as follows:</p>

<p>estimated peer assessment = -1.19636
+ (.03042 * classes over 50)
+ (.07052 * NRC mean for physics, English, psych SQUARED)
- (.00122 * financial rank)
+ (.0000005199697 * percent full-time faculty cubed)
- (.00002286 * classes over 50 cubed)
+ (.00254 * SAT 75th percentile)
+ (.0000007890567 * actual graduation rate cubed)
- (.16392 * NRC physics rating)
+ (.000000553739 * acceptance rate cubed)</p>
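<p>For readers who want to plug in their own numbers, here is a minimal Python sketch of the corrected formula. Only the coefficients come from the post above; the argument names, and the scales they assume (e.g., percentages on a 0–100 scale), are my own guesses at what collegehelp used.</p>

<pre><code class="language-python">
# A sketch of the corrected formula. Coefficients are copied from the post above;
# argument names and scales are assumptions, not collegehelp's original labels.

def estimated_peer_assessment(classes_over_50, nrc_mean_phys_eng_psych,
                              financial_rank, pct_fulltime_faculty,
                              sat_75th_percentile, graduation_rate,
                              nrc_physics_rating, acceptance_rate):
    return (-1.19636
            + 0.03042 * classes_over_50
            + 0.07052 * nrc_mean_phys_eng_psych ** 2      # the corrected, squared term
            - 0.00122 * financial_rank
            + 0.0000005199697 * pct_fulltime_faculty ** 3
            - 0.00002286 * classes_over_50 ** 3
            + 0.00254 * sat_75th_percentile
            + 0.0000007890567 * graduation_rate ** 3
            - 0.16392 * nrc_physics_rating
            + 0.000000553739 * acceptance_rate ** 3)
</code></pre>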

<p>I wonder if Berkeley would have a better predicted PA to actual if you included all 35 NRC program rankings (either aggregately or via some average).</p>

<p>Good job, collegehelp.</p>

<p>How many years did you test? The equations can be unstable when there are too many variables. Back-testing with multiple regression can always produce a fit, whether or not there is a reliable relationship between the independent and dependent variables, particularly when you permit multiple combinations (linear, squared, cubed; including physics both in the mean with English and psychology and by itself; why these three fields…). It will be difficult to test multiple years, since the NRC data does not change, even though there will be slight variations in the USNews data.</p>
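<p>One way to probe the stability concern without a second year of data is to cross-validate the fixed nine-predictor equation across schools. The sketch below is only illustrative: <code>schools_df</code> and the column names are assumptions (the underlying table is not posted), and this check does not account for how the nine variables were selected in the first place.</p>

<pre><code class="language-python">
# Rough stability check: out-of-sample R-square for the fixed nine-predictor model.
# `schools_df` is an assumed pandas DataFrame (one row per school) with the columns
# named below plus a "peer_assessment" column; the names are mine, not collegehelp's.

from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

predictors = ["classes_over_50", "nrc_mean_sq", "financial_rank",
              "pct_fulltime_faculty_cubed", "classes_over_50_cubed",
              "sat_75th", "grad_rate_cubed", "nrc_physics", "accept_rate_cubed"]

X = schools_df[predictors]
y = schools_df["peer_assessment"]

# A large drop from the in-sample R-square of .94 would suggest the equation
# is tuned to the quirks of this particular year's data.
cv_r2 = cross_val_score(LinearRegression(), X, y, cv=10, scoring="r2")
print(cv_r2.mean(), cv_r2.std())
</code></pre>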

<p>afan,
I used the 2008 US News data and the most recent NRC data (which is maybe 10 years old). There were about 130 schools with complete data in my analysis and 9 predictor variables. For some reason, the physics NRC rating and the squared mean of the three NRC ratings each added significantly and separately to the overall model. There are several other good models that I found without re-using any variables like classes over 50 and NRC physics. I simply picked the model that had the highest R-square and in which each predictor variable contributed significantly to the equation. None of the predictor variables were insignificant.</p>
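<p>As a rough illustration of the fitting procedure described above (not collegehelp’s actual code), an ordinary least squares fit with <code>statsmodels</code> reports both the R-square and a p-value for each predictor, which is how one would verify that every variable contributes significantly. It reuses the assumed <code>schools_df</code> and <code>predictors</code> from the earlier sketch.</p>

<pre><code class="language-python">
# Fit the nine-predictor model and check each variable's contribution.
# `schools_df` and `predictors` are the assumed names from the sketch above.

import statsmodels.api as sm

X = sm.add_constant(schools_df[predictors])   # add the intercept term
ols = sm.OLS(schools_df["peer_assessment"], X).fit()

print(ols.rsquared)   # in-sample R-square (the post reports .94)
print(ols.pvalues)    # one p-value per predictor; all should be significant
</code></pre>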

<p>I actually found a parsimonious 3-variable model with an R-square of .9 using the NRC English/psych mean, SAT, and full-time faculty.</p>

<p>UCBChemEGrad-
I wanted to use the overall average but couldn’t find the overall averages. Barrons actually suggested that idea to me. So, I did what I could with limited time and simply averaged three very different ratings. Psych and English are very common and popular majors from different areas, the social sciences and the humanities. They are the linchpins of those areas, I think, although economists and historians might argue the point. Physics…well, it is not too popular, but it is very hard and it makes headlines.</p>

<p>Now we have a MODEL that confirms what some of us have been saying for the longest time about the Peer Assessment. No wonder some posters support the PA so darn much!</p>

<p>Let’s take a look at the “top 25” ranking, after adding a column for over/under assessment:</p>

<p>21 Notre Da 4.11 3.9 -0.21
1 Harvard 5.09 4.9 -0.19
19 Rice Uni 4.15 4.0 -0.15
14 Virginia 4.44 4.3 -0.14
15 UCLA 4.33 4.2 -0.13
20 Illinois 4.13 4.0 -0.13
9 Duke Uni 4.51 4.4 -0.11
6 Cornell 4.70 4.6 -0.1
2 Massachu 5.00 4.9 -0.1
4 Yale Uni 4.87 4.8 -0.07
13 Brown Un 4.46 4.4 -0.06
3 Stanford 4.95 4.9 -0.05
8 Pennsylv 4.54 4.5 -0.04
22 Washingt 4.11 4.1 -0.01
25 Vanderbi 3.99 4.0 0.01
7 Columbia 4.58 4.6 0.02
17 Northwes 4.25 4.3 0.05
11 Johns Ho 4.49 4.6 0.11
24 Carnegie 4.09 4.2 0.11
5 Princeto 4.78 4.9 0.12
10 Cal Inst 4.51 4.7 0.19
23 Dartmout 4.10 4.3 0.2
18 Michigan 4.24 4.5 0.26
16 Columbia 4.31 4.6 0.29
12 Cal—Berk 4.49 4.8 0.31</p>
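<p>For reference, the over/under column is just the residual, actual PA minus estimated PA, sorted from most under-assessed to most over-assessed. A minimal sketch, with two example rows copied from the table above:</p>

<pre><code class="language-python">
# Reproduce the over/under column: residual = actual PA - estimated PA.
# Only two rows are shown; the rest would come from the table above.

schools = [
    ("Harvard", 5.09, 4.9),
    ("Cal-Berk", 4.49, 4.8),
    # ... remaining schools from the table above
]

over_under = sorted(
    ((name, est, actual, round(actual - est, 2)) for name, est, actual in schools),
    key=lambda row: row[3],
)
for name, est, actual, diff in over_under:
    print(f"{name:10s} {est:.2f} {actual:.1f} {diff:+.2f}")
</code></pre>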

<p>So what are the leaders in over-assessment by “peers”?</p>

<p>12 Cal—Berk 4.49 4.8 0.31
16 Columbia 4.31 4.6 0.29
18 Michigan 4.24 4.5 0.26
23 Dartmout 4.10 4.3 0.2
10 Cal Inst 4.51 4.7 0.19
5 Princeto 4.78 4.9 0.12
24 Carnegie 4.09 4.2 0.11
11 Johns Ho 4.49 4.6 0.11</p>

<p>And the under-assessed ones?</p>

<p>21 Notre Da 4.11 3.9 -0.21
1 Harvard 5.09 4.9 -0.19
19 Rice Uni 4.15 4.0 -0.15
14 Virginia 4.44 4.3 -0.14
15 Cal—Los 4.33 4.2 -0.13
20 Illinois 4.13 4.0 -0.13
9 Duke Uni 4.51 4.4 -0.11</p>

<p>Funny how all the gyrations to find plausible explanations for the nature of the PA simply point to the same evidence one finds from merely comparing the “differences” between an objective and a subjective ranking. So, let’s all underscore (again) that the three schools most advantaged (and thus most over-ranked) by the questionable and manipulated PA are:</p>

<p>12 Cal—Berk 4.49 4.8 0.31
16 Columbia 4.31 4.6 0.29
18 Michigan 4.24 4.5 0.26

and were it a top 25 school
30 Wisconsin 3.88 4.1 0.22
**
Could this mean that the issue of the utter lack of validity of Berkeley’s and Michigan’s PA is now settled? Well, of course, until one uncovers another obscure and hardly relevant set of numbers that might magically erase the plain truth that there is simply no mathematical method to support scores based on the reputation of the reputation of … an intangible and illusory element.</p>

<p>** a la “all 35 NRC [GRADUATE] program rankings (either aggregately or via some average)” for the sole purpose of “better” predicting the UG PA</p>

<p>afan makes some excellent points about the way in which regression models can simply be just-so stories. </p>

<p>I still think that the best way to make the case that a given university is underrated is to speak affirmatively about what makes it a great university.</p>

<p>xiggi-
The over- and under-assessment margins are not very large in absolute magnitude. The largest error of the estimate is still rather small. It could very well be that the differences between the actuals and the estimates are due to something that has not been accounted for, such as the consistency and breadth of faculty productivity at Berkeley, Wisconsin, and Michigan.</p>

<p>afan-
I think you are incorrect about the number of variables inevitably leading to a good fit. The RATIO of predictor variables to the number of schools could cause a problem, but the ratio in my analysis was more than ten schools per predictor variable. And adding more variables will not, by itself, improve the fit if the additional variables carry redundant information. In my model, each variable adds significantly to the fit.</p>
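<p>The redundancy point can be checked directly: variance inflation factors flag predictors that mostly duplicate information already carried by the others. A sketch, again assuming the <code>schools_df</code> and <code>predictors</code> names used earlier:</p>

<pre><code class="language-python">
# Variance inflation factors (VIF) for each predictor; large values indicate
# redundancy with the other variables. Note that a variable and its own cube
# (e.g., classes over 50) will naturally show elevated VIFs.

import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

X = sm.add_constant(schools_df[predictors])
for i, name in enumerate(X.columns):
    if name == "const":
        continue   # the intercept's VIF is not meaningful
    print(f"{name}: {variance_inflation_factor(X.values, i):.1f}")
</code></pre>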

<p>I think the main point is that the Peer Assessment, and probably reputations in general, are “rational” and systematic, not random. The above formula is almost certainly not a fluke. If people were completely guessing, there is no way their guesses could be predicted almost perfectly. </p>

<p>There is real wisdom in the collective judgements of university administrators.</p>

<p>Is Columbia’s ranking #7 or #16?</p>

<p>CH, I realize that the margins are not large. However, there are plenty of schools that have extremely small margins (and I believe this is what accounts for your high correlation) and then a few that are confined to both ends of the spectrum. </p>

<p>As far as finding the elements that contribute to the over/under assessment, I thought that your efforts to incorporate additional data were meant to uncover those exact intangibles. Should we simply multiply the impact of “the consistency and breadth of faculty productivity” by factors of 200 to 500% to “better” predict the PA? </p>

<p>Lastly, before making further references to “the consistency and breadth of faculty productivity,” should we not again check the definition of the PA proposed and used by USNews: “The peer assessment survey allows the top academics we consult - presidents, provosts, and deans of admissions - to account for intangibles such as faculty dedication to teaching.”</p>

<p>Is dedication to teaching the same as faculty productivity? Please do tell how we go about measuring dedication to teaching at Berkeley and the other over-assessed schools. And resorting to replacing dedication to teaching with dedication to RESEARCH, or to using graduate school data, is surely NOT the answer! As far as I know, it’s easier to decipher a Camus novel than the explanations given to justify the PA of the schools in question, although the word “absurd” comes to mind in both exercises.</p>

<p>Mondo-
#16 should be Chicago. Good catch. Sorry about the error.</p>

<p>xiggi-
I have to agree with you about the quality of teaching. Scholarship and teaching are different. I wish there were a way to quantify or even rate the quality of teaching. But perhaps quality of teaching is reflected indirectly in factors such as graduation and retention rates, alumni giving, class size…I don’t know. Teaching is a missing piece of the puzzle. So is faculty advising.</p>

<p>I just want to comment that, as a not-particularly-analytically-minded parent trying to guide a teenager on his college visit list, I feel an increasing sense of disconnect between the assessments of excellence at the top state universities (whether it’s the PA or a tweaking of it using NRC stats) and all the news stories I’ve been reading about shrinking budgets, faculty flight, and funding uncertainty. There is also a difference between a high “reputation” rank based mostly on graduate-level programs and the word-of-mouth from other parents about “weed-out” classes at the UG level, crowded housing, and impacted programs. </p>

<p>If these high ratings are mostly based on graduate level programs, how can I evaluate the quality of the educational experience my future freshman will walk into? </p>

<p>How much should I rely on the PA, whether it’s usnwr or collegehelp’s, that I’m looking at for Cal/UCLA or Michigan or WI or ILL to assess whether the UG program is going to deliver on the promise implicit in that high ranking?</p>

<p>It is interesting to see how the various factors are weighted in this formula for calculating a peer assessment score. Perhaps it gives us a look at how various factors affect reputation in our unconscious minds.</p>

<p>selectivity (SAT) 52.7%
faculty scholarship (NRC) 28.7%
graduation rates 9.0%
percent full time faculty 5.4%
percent of classes with over 50 students 4.2%
acceptance rate - trace</p>
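<p>The post does not say how these percentage weights were derived. One common approach, which may or may not be what was done here, is to standardize each predictor and express its absolute coefficient as a share of the total; a sketch using the assumed <code>schools_df</code> and <code>predictors</code> from earlier:</p>

<pre><code class="language-python">
# Relative "importance" via standardized (beta) coefficients: z-score the
# predictors, refit, and report each |coefficient| as a percentage of the sum.
# This is one conventional method, not necessarily the one used in the post.

import statsmodels.api as sm

Z = (schools_df[predictors] - schools_df[predictors].mean()) / schools_df[predictors].std()
std_fit = sm.OLS(schools_df["peer_assessment"], sm.add_constant(Z)).fit()

betas = std_fit.params.drop("const").abs()
weights = 100 * betas / betas.sum()
print(weights.round(1))   # percentage weight attributed to each predictor
</code></pre>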

<p>I realize that certain folks have a problem with the PA of certain publics. But, using SAT scores and grad rates in such a model advantages the wealthy schools with wealthier student bodies, perhaps not purposely. Further, faculty don’t equal students, by which I mean that it doesn’t make much sense to me to use SAT scores and grad rates (both of which are correlated with wealthier students) as a proxy for reputation, or whatever PA purports to measure. MIT’s faculty and PA would be the same, even if it started admitting a bunch of sub-600 scorers.</p>

<p>You could easily come up with a formula that would prove Berkeley, Michigan, and Wisconsin are underranked. Their faculty quality by most academic measures far exceeds that of some of the allegedly underranked schools like Notre Dame. So does the depth and overall quality of campus resources, from labs to libraries. The use of SAT scores and graduation rates is about the same as taking the average family income of the students and using that instead. But it sounds better.</p>

<p>And Jazzymom–why do you think the Ivy schools come raiding the top state schools for professors? Because the state schools have some of the best out there. They are not going to Notre Dame to get top people. They go to UCB, UM, and Wisconsin. Think about it.</p>

<p>barrons,</p>

<p>Thanks for reviving this thread. Collegehelp does some wonderful stuff, but this is among the best. It shows that the PA is not, at all, a random reproduction of traditional biases, but instead is based on a series of metrics that, I believe, most would agree relate to university quality.</p>

<p>Since SAT correlates with income, there is little doubt that income could fit quite well in this regression, if the data were available. On the other hand, SAT also predicts college performance, likelihood of graduation, likelihood of going on to graduate or professional school, subsequent income… so it seems appropriate to keep it in. If the data were available, it would be interesting to know whether both SAT and income would help predict PA. There is data showing that both SAT and income predict college performance. That is, after controlling for SAT, students from higher-income families get higher grades. Depressing, but true.</p>

<p>Collegehelp-my point about the number of variables perhaps needs elaboration. The concern is that when you have a very large number of variables to choose from, it can be misleading to derive confidence from even an adjusted R2. If you use NRC rankings, and there are 35 departments, then there are 35 potential variables to include, in addition to the SAT, class size, etc. However, if you permit combinations of departmental rankings (physics and English), then the number of departmental figures becomes huge, particularly if you do not set a priori bounds on which combinations you will consider. At the extreme, if you consider all possible combinations of departmental rankings, including anywhere from 1 to all 35 departments and every possibility in between, the number of possible models becomes enormous. It is then highly likely that, by chance, at least one of these models will have a near-perfect retrospective fit with a data set containing only 130 observations. With that many models, it is highly likely that many of them will do so. If one then further increases the number of models by permitting powers (squares, cubes, etc.) for some variables, the problem only gets worse.</p>
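<p>This selection effect can be demonstrated with a small, self-contained simulation: 130 “schools,” a response that is pure noise, and a large pool of candidate predictors that are also pure noise. Picking the candidates that happen to look best still yields a respectable retrospective R-square, while the same equation predicts nothing on fresh data. The numbers below are illustrative only and have nothing to do with the actual PA data.</p>

<pre><code class="language-python">
# Selection effect demo: cherry-picking predictors from a large pool of noise
# inflates the in-sample fit even when there is no real relationship at all.

import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n_schools, n_candidates, n_keep = 130, 1000, 9

X_pool = rng.normal(size=(n_schools, n_candidates))   # pure-noise "predictors"
y = rng.normal(size=n_schools)                        # pure-noise "peer assessment"

# naive selection: keep the 9 candidates most correlated with y in this sample
corr = np.abs([np.corrcoef(X_pool[:, j], y)[0, 1] for j in range(n_candidates)])
keep = np.argsort(corr)[-n_keep:]

model = LinearRegression().fit(X_pool[:, keep], y)
print("retrospective R^2:", model.score(X_pool[:, keep], y))   # looks respectable

# the same equation applied to a brand-new batch of noise predicts nothing
X_new = rng.normal(size=(n_schools, n_candidates))
y_new = rng.normal(size=n_schools)
print("fresh-data R^2:   ", model.score(X_new[:, keep], y_new))  # near zero or negative
</code></pre>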

<p>The saving grace here is that your final model does not appear arbitrary. Instead, it includes reasonable variables with an unsurprising relative importance: an index of student ability, an index of faculty scholarly reputation (remember, this PA is a ranking of quality by university faculty evaluating other universities), and an outcome measure.</p>

<p>The class size variable had a small effect, and this might reflect something that the people who contribute to the PA ignore. Perhaps they just don’t care about class size. That said, note that the low weight given to this factor helps the large publics, which could not compete with most privates on class size.</p>

<p>I doubt that MIT’s faculty or PA would stay the same if it started admitting large numbers of students who were not as accomplished academically coming out of high school. Part of the appeal of MIT to the faculty is the students: they are stimulating, and they allow the faculty to teach a very talented student body. If this started to fade, MIT would become a less special place, and at least some of the faculty would find that they could get as much elsewhere.</p>

<p>Afan-
Thanks for the positive feedback. I really appreciate it. And thank you for sharing your observations.</p>