National Merit Cutoff Predictions Class of 2017

@CA1543 wrote: As someone has pointed out, there is compression at the top because the scale range is smaller, so if more students have to fit into fewer SI's, it may also tend to push up a state's SI cutoff because of the limited allotment each state gets.

Sorry, I thought the point of this adjusted % table was to take the extra people out of the top 1%. So yeah, if this table is still cramming more than 0.5% into the 99+ slots, then looking at historical % tables would be meaningless. It's not clear what is being attempted now.

@123field, I like your comments: "My gut tells me the CB and NMC do not want a big controversy when NMSFs are announced. As such, it is simplest to keep this year's SIs and the corresponding percentiles similar to the relationship of SIs and percentiles in previous years."

As @DoyleB tried to refute above (again), I will repeat - changing the percentiles a full percent (or half a percent) just based on the definition A / definition B change is factually incorrect.

If you’re concording SI’s based on percentiles, you need to adjust the SI by exactly 1 point. That’s it.

Here’s an example.

Say the new table with the new definition tells you that 214 is exactly 99.5%.

You go to the previous table, which has the old definition, and pretend it says that 224 is also exactly 99.5%.

If they were using the same definitions of percentiles, you would just concord 214 → 224.

But because the new definition is >=, while the old is >, you would concord 214 → 223. You just subtract 1 (the smallest discrete unit of the test) from the old score. That’s it. There’s no ambiguity.
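A tiny sketch makes the one-point argument concrete. The score counts here are made up purely for illustration, not real PSAT data:

```python
# Hypothetical score counts, NOT real PSAT data -- just to illustrate the
# old ">" definition vs the new ">=" definition of percentiles.
from collections import Counter

scores = Counter({228: 1, 224: 2, 214: 3, 205: 10, 200: 84})
total = sum(scores.values())

def pct_below(score):
    """Old definition: percent of testers scoring strictly below `score`."""
    return 100 * sum(c for s, c in scores.items() if s < score) / total

def pct_at_or_below(score):
    """New definition: percent of testers scoring at or below `score`."""
    return 100 * sum(c for s, c in scores.items() if s <= score) / total

# On an integer scale, "at or below 214" is the exact same set of students
# as "below 215", so the new-definition percentile of any score equals the
# old-definition percentile one point higher:
assert pct_at_or_below(214) == pct_below(215)
```

Since the two definitions only differ by which side of a single discrete score they count, concording across them can never move you more than one scale point.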

If you want to make your own percentile tables, go right ahead. Just don’t blame the changing definitions for full percentage differences at the top end.

Doyle and I don’t agree on a lot of things - but we agree on this!!

My fundamental and logical assessment of the cutoff was based on assumptions similar to @123field's comments. But @123field put it into words nicely and simply. Thank you.

Just looking back at the Walton data and thinking of Georgia specifically. Georgia, while not in the ranks of NJ, CA, etc., is typically a higher-scoring state. Last year there were only 10 states that had a higher cutoff. In regards to Walton, 2 years ago they had 25 SF but slipped last year to 16 (one likely scenario is that they had several kids just miss the 218 cutoff). This year their top 100 students averaged 216.6. As @theshadow had noted, the median score (50th highest or so) is likely less than that mean of 216.6 due to skew. Just for the sake of discussion, let's assume that Walton gets back to 25 SF, or that they have a good year and win 30 spots. Based on that 216.6 average, it's possible that 218ish might be that 30th score. That's identical to last year.

But now onto another thought. Based on just simple math, any higher scoring state (or school for that matter) should have more than 2 commended students for every SF. Regardless of the final cutoff, if Walton had 25-30 SFs, you should expect them to have at least 60+ commended. Based on the fact that Georgia’s cutoff is usually significantly above 99% I’d expect nearly every one of those Top 100 scores to be commended, possibly even more. Some were predicting 210 for a commended cutoff. I’m not seeing it. Is it really likely that the 100th top scorer at Walton got a 210? I guess anything is possible but based on that mean it seems unlikely. You guess that 100th score and that’s likely a good guess for commended in my eyes.
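The "more than 2 commended per SF" arithmetic can be sanity-checked with the national recognition counts. The commended figure below is an assumed round number, not an official count:

```python
# Approximate national recognition counts (assumed round figures, not
# official NMSC numbers -- the exact commended count varies by year).
semifinalists = 16_000
commended = 34_000  # assumed approximate national total

ratio = commended / semifinalists  # commended per semifinalist, nationally

# Applied naively to the Walton scenario discussed above:
for walton_sf in (25, 30):
    print(walton_sf, "SF ->", round(walton_sf * ratio), "commended (naive national ratio)")
```

The naive national ratio gives roughly 53-64 commended for 25-30 SFs; the post's "60+" is plausible because a state whose cutoff sits well above the national 99th percentile should have proportionally more students in the commended-to-SF gap.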

@thshadow and @DoyleB – I read through your comments and appreciate them. Was hoping we could try to develop a reasonable estimate of an SI concordance table or see if we can back into some of the predictions out there such as by Test Masters. Thanks for your analysis.

Trying to clarify a bit what I said above. I would like the Walton data from the Cobb report and the SI tables to be mutually possible, but I’m finding it hard. To my mind, Table 3 in the Cobb report is ambiguously titled: “Mean PSAT scores for the top 100 scores for 11th graders at each school.” http://www.cobbk12.org/news/2016/PSAT2015.pdf

Does that mean: “we took the scores from the top 100 students in each school, broke out their Math and ERW scores, and listed the mean of each subject here”? That is how most people are taking it. Mozart6023 summarizes by saying that at Walton “this year their top 100 students averaged 216.6” (or 1453 TS).

If that is correct, then it is very unlikely that the CB SI %ile table on page 11 of "Understanding your Scores" can be correct. The only way it could be is if Walton is having an anomalously successful year. That itself is possible (Applerouth tutoring, redoubled focus on PSAT, banner year) but I think it's unlikely.

But what if "Mean PSAT scores for the top 100 scores for 11th graders at each school" means: "we took the top 100 scores from the school in Math and listed them, and did the same for ERW"? In that scenario, I can see 20-30 NMSF coming out of Walton, and the CB SI %iles still being potentially accurate as published.

@DoyleB and @thshadow
This percentile definition issue reminds me of 7th grade pre-algebra, where the open circle means < X and the closed circle means <= X . . . So it sounds like the change was merely something similar, and so the best approximation for the continuous function is the smallest discrete jump (hence up one SI point). I see what you mean. So much for convenient explanations for the ridiculously high percentiles. I'm going to have to re-explore @pickmen's conclusion about the student population base. It's simply astounding to think that CB could possibly use the ENTIRE 11th grade "research sample" - what the hell would "representative" have to do with NM? Still having a hard time believing this . . .

@thshadow and @DoyleB I think your statement that the change in percentile definition only shifts the SI by one point is predicated on the assumption that behind the scenes CB distinguishes the multiple 98s, 99s, and 99+s as different numbers in each row, but I am not sure that's the case. Imagine if the SI% table were instead presented like this:

214 - 228 99.5%
205 - 213 99%
202 - 204 98%

and so on.

Then if you want to change the definition of percentiles, you would restate this table as

214 - 228 99%
205 - 213 98%
202 - 204 97%

If you take this approach, the SI% tables match the concordance tables reasonably well. This would also suggest that the cutoffs should not change much for the high scoring states, at least.
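A minimal sketch of that relabeling idea, using the hypothetical bands from the post (the "97" label for the tail is an assumption):

```python
# Hypothetical range-style SI table from the post: (low SI, high SI, label).
old_table = [(214, 228, "99.5"), (205, 213, "99"), (202, 204, "98")]

# Under the relabeling described above, each band simply inherits the label
# of the band below it; "97" is assumed for the new tail label.
shifted_labels = [label for _, _, label in old_table[1:]] + ["97"]
new_table = [(lo, hi, label) for (lo, hi, _), label in zip(old_table, shifted_labels)]

print(new_table)
# [(214, 228, '99'), (205, 213, '98'), (202, 204, '97')]
```

Note the contrast with the one-point-shift argument: if CB only stores rounded band labels rather than per-score percentiles, the restatement moves whole bands, not single points.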

@LadyMeowMeow – Hmm, it might be that they are separately averaging top scores in each category - but then the "Total" would be contrived, not the actual Total scores based on students' overall results. That would be a misleading report.

Second half of manifesto incoming…

@thshadow I’ve enjoyed running things past you so far, so let’s continue. I’d like you to help me with something I’m struggling with. We’ve discussed a lot of this before. Let’s use Georgia again.

Let’s say we are looking for some way to generate an SI table for this year that has GA’s cutoff at 216. That doesn’t seem to be an unreasonable number based on the anecdotes we’ve seen.

We know GA is typically in the 3rd or 4th from the bottom of the 99s, which in this year’s table is roughly a 207. Somehow we need to concord a 207 to a 216. We can’t just do it because we want to - we need valid reasons to shift the tables.

Note, however, that it's not just us who need to do that. CB needs to do it as well, because their own concordance tables suggest that a 218 last year corresponds to a 216 this year, not to a 207.

We have a few tools at our disposal to do that. One of them is the change in percentile definition. You and I both agree that this change allows us to shift the rank by only one. So this allows us to concord a 207 to a 208. That’s it.

The population that the SI table refers to appears to be a gold mine. But first of all, we need to decide what population the table refers to. If the table is "test takers", it helps us a little bit. This year roughly 1.72 million juniors took the test. Last year it was 1.57 million. So the population increased by almost 10%. That helps us a little - to be in the top 16000 last year required a student to be in the 99.0% bracket; this year they need to be in the 99.1% bracket. So that would allow me to concord roughly one point higher in the table, so when accompanied by the percentile change, my 207 could be a 209.
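The bracket arithmetic there is easy to check directly, using the approximate taker counts quoted above:

```python
# Percentile bracket needed to land in the top 16,000 scorers, given the
# approximate junior taker counts quoted in the post.
def top_n_percentile(n_top, n_takers):
    # Fraction of takers scoring below the n_top-th ranked student.
    return 100 * (1 - n_top / n_takers)

last_year = top_n_percentile(16_000, 1_570_000)
this_year = top_n_percentile(16_000, 1_720_000)
print(round(last_year, 1), round(this_year, 1))  # 99.0 99.1
```

A 10% larger pool moves the top-16,000 threshold by only about a tenth of a percentile, which is why this tool buys roughly one table point at most.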

If, however, the SI table refers to the “national” sample, then how much I gain depends on my assumptions of where the students fall in the distribution. Publishing a “national” SI table seems silly, but let’s go there anyway. If the phantom “non test takers” perform identically to the test takers, then the “national” and the “user” tables are identical, and I get no help at all - I can’t move up in the table.

If, on the other hand, the phantoms perform poorly, and none are in the top of the distribution, that helps me a lot. Half of the 99+s become 99s, the 99s become 98s, etc. I’m well on my way to getting the results that I want.

Unfortunately for me, and as you’ve pointed out before, I can’t do this. Why? Because we have the “national” and “user” tables for total score, and they are very similar at the top. Not identical, but close. CB has decided the phantoms perform just slightly worse than the test takers. Thus, the differences in population might allow me to move up a couple of ranks. So now my 207 could become a 210, maybe 211. But nowhere near 216.

So here we are. We can legitimately concord a 207 to a 209 (assuming SI refers to test takers), and maybe to a 210-211 if we assume SI refers to national. We’re not close yet - we can’t get to 216.

What’s left? Applerouth talked about a few possibilities. Maybe the sample population wasn’t representative. That seems unlikely though - they sampled 80000 kids. They do this all the time, and they do it for a living. But it’s a possibility.

Maybe they didn’t account for changes to number of questions, guessing penalty, test difficulty, etc. But CB scored the test. They knew the distribution they wanted to target. They could have changed the scoring tables to fit the distribution they wanted.

Maybe I’m wrong about where the GA cutoff should be, and additionally the concordance tables are way off in the same direction my “sniff test” meter is taking me. This would mean GA’s cutoff is around 208, and all the anecdotes across the country turn out to be red herrings. When the final concordance tables come out, the percentiles will fall substantially. But that seems pretty unlikely to me.

So now I'm stuck. I have no intellectually honest way to concord this year's SI table to CB's own concorded values, or to values that seem reasonable to me. Any suggestions?

And to follow up on @AnnMarie's point (#1948), when we concord a 37 to a 73, a 74, or a 75, do we instead make a range where 73 is a 37.0, a 74 is a 37.5, and a 75 is a 37.9? Perhaps I SHOULD have been doing concordance that way.
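One way to implement that fractional spread is a simple even interpolation. This is only a sketch: the 0-to-0.9 span is my assumption (to keep all three values inside the same integer bucket), so the middle value comes out 37.45 rather than the 37.5 suggested above:

```python
def spread(old_scores, new_int, span=0.9):
    """Evenly spread several old scores that all concord to one new integer
    score across the fractional range [new_int, new_int + span]."""
    k = len(old_scores)
    step = span / (k - 1) if k > 1 else 0.0
    return {s: round(new_int + i * step, 2) for i, s in enumerate(old_scores)}

print(spread([73, 74, 75], 37))  # {73: 37.0, 74: 37.45, 75: 37.9}
```

The exact spacing is arbitrary; the point is that a many-to-one concordance can be made monotone and invertible by going fractional.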

Working back and forth from step functions to continuous is turning out to be a challenge. Viewer “Discretion” advised! LOL

@CA1543 Yes, if they are separately averaging the top scores in each category, then the “Total” would be misleadingly high, a fabrication that represents component scores from more than 100 students.

The reason I think it’s even possible that they did this is that Table 2 has the same categories for all students: Math, ERW, and Total. In that case, the Total is NOT misleading, since every student’s component scores are represented, and the Total Mean Score is a real representation of the average student in the school. But if they had the columns all lined up from making Table 2 & then decided to “take the top 100” for Table 3, then the misleading total could have emerged.

I’m not saying anybody did anything nefarious – I’m just wondering if it could have happened.

(1) regarding this idea that all scaled scores are raised by an entire percentage point as a result of the change from definition 1 to definition 2 (which would, yes, be really, really weird): I think this is foreclosed by Compass's page 7 chart of the effect on math scores at various levels.

(2) Compass says that SAT report is changing to be like ACT report with respect to these definitions - what does ACT do?

(3) regarding the idea that SI percentiles in Understanding Scores report might be percentiles of actual test takers from October 2015, I thought that someone on CC said that CB tweet had stated that these were also from research study (maybe I’m remembering wrong?)

(4) if all percentiles in Understanding Scores reports and concordance tables come from the same research study, then they have to be consistent, unless CB made a mistake, right? Research studies might predict October 2015 scores incorrectly for various reasons, but even if so, that can only explain inconsistency between these statistics and the statistics of actual scores, not inconsistency between two different sets of data that rely on the same research studies, right?

Still thinking about Testmasters' predictions based on concordance: if I recall correctly, they claimed one of the issues they found when following the prelim chart strictly was too much compression in the 210-215 SI cutoff range, and the cutoffs they developed "smoothed" this out. What I think Testmasters was getting at was the perhaps unintended consequence of lumping 50k-plus testers into a very small numerical room in terms of SF and commended break-outs. In other words, if too many kids are standing on the same numerical property, so to speak, then too many of them - maybe a majority - will be SFs and few will be commended. The only way to create more numerical space and find the 1/3 to 2/3 ratio of SF to commended, I think, would be to break down each SI into fractions or deciles. But I don't see this happening, or maybe I'm not seeing it right. I would have figured CB or NM Corp would have seen a mile away the problem this type of compression would cause.

When I look at the Walton data, it seems reasonable to predict a GA cutoff of around 217. When looking at Walton's top 100 students, you are analyzing the data from the very end of the bell curve. So you will have a greater number of students below the mean than above the mean. So let's say you have the group below, with the SI calculated on the low side of what's possible given the TS. I will assume this because these students score better on the math section.
30 students 1430 SI 214
25 students 1440 SI 215
10 students 1450 SI 217
10 students 1460 (I don't calculate SI at this point because these students would most likely make the cutoff)
10 students 1470
5 students 1480
5 students 1490
5 students 1500
These are extremely rough estimates, but I think you can follow what I am saying. Given this scenario, there could be about 45 NMSF from that school this year with a cutoff of 217, and maybe 35 to 40 with a cutoff of 218. This would be historically high. However, it is not outlandish. I also predict that this year there will be even more clustering of NMSF in particular areas and schools. Given that this test was easier for the high-achieving population, their scores will be even higher relative to others. I believe this test would be easier to prep for given the lack of harder questions, so this difference could become even more pronounced over the years as test prep courses become even more familiar with the new PSAT and SAT format. I am curious to see if that comes to pass. What do you think?
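Tallying that rough distribution against candidate cutoffs is simple. The SIs for the 1460-and-up bands below are my added assumptions (the post only says those students would most likely make the cutoff, so they are placed above 217):

```python
# Rough Walton distribution from the post: (number of students, SI).
# SIs for the top five bands are assumed, chosen only to sit above 217.
groups = [
    (30, 214), (25, 215), (10, 217),
    (10, 218), (10, 219), (5, 220), (5, 221), (5, 222),
]

def at_or_above(cutoff):
    """Count students at or above a candidate SI cutoff."""
    return sum(n for n, si in groups if si >= cutoff)

print(at_or_above(217), at_or_above(218))  # 45 35
```

This reproduces the post's figures: about 45 NMSF at a 217 cutoff, and 35 at 218 (before counting any of the 215s who might concord higher).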

@CA1543 re: #1921:
“228-- 99+ 99+
227-- 99+ 99+
226-- 99+ 99+
225-- 99+ 99+
224-- 99+ 99+
223-- 99+ 99+ DC?
222-- 99+ 99+ DC?
221-- 99+ 99+ Cal? DC?
220-- 99+ 99+ Cal?
219-- 99+ 99+ Cal?
218-- 99+ 99+ NY/ TX / GA / Wash?
217-- 99+ 99 – NY/ TX / GA / Wash?
216-- 99+ 99 - NY/ TX / GA / Wash?
215-- 99+ 99
214-- 99+ 99
213-- 99 98
212-- 99 98
211-- 99 98
210-- 99 98
209-- 99 98
208-- 99 98
207-- 99 98
206-- 99 98
205-- 99 98
204-- 98 97
203 – 98 97
202 – 98 97
201 – 97 96
200 – 97 96
199 – 96
198 – 96”

The problem I have with this–at least with respect to GA–is that I believe GA's cutoff score for the testing years 2010 to 2013 has never fallen in the 99+% range, but rather solidly in the 99% range. For example, in the 2013 test year, GA's cutoff was 215. It ranked 14th out of 51 (states + DC), but was the 12th in the 99%. We know from another post that the range for 99% that year was 213-222, so GA was nowhere near the 99+%. For this reason, I think 218 is too high. Personally, just based on gut, I think GA's cutoff will range somewhere between 214 and 216, which makes me think Testmasters has it about right.

@GABaseballMom – yes, I understand your point, and a review of prior years' data is helpful of course. I am wondering, though, as many high scores are being reported, whether there will end up being a large number of students with an SI score on the cusp who might not make it this year because there are not enough NMSF spots to allocate - so the SI cutoff gets pushed a bit higher than in the past in some states while coming down a bit in others. I have no reason really to doubt that Test Masters looked at whatever data they could - the percentile info and past state performance as well - to come up with their predictions. They may not prove accurate with "precision", as the real test data may vary and states with similar cutoffs in the past could have some anomalies.

Let’s see what our CC stats experts can predict - I was just trying to convert some of the percentiles (as adjusted for the change in the definition if possible) for the 2015 SI scores into estimates for the states to get an idea of how it might correlate with predictions we’d seen published. I am no expert on this by any means.

@LadyMeowMeow I think someone pointed out earlier that CB would not report scores that way - meaning taking the top 100 Math scores and the top 100 ERW scores (but not necessarily from the same students) and then adding them together to get a mean score.

That being said (and I don't know how to do this), I'm guessing you could take schools from Cobb with about 200 test takers, look at their mean for all test takers, and then look at the mean for the top 100 to see if they make sense together. To be honest, I'm not sure this analysis would lead to a conclusion about whether your proposal is valid or not.

Yes, I agree. There are going to be more people on the cusp. I am definitely in the camp that would predict higher cutoffs, more in line with Test Masters'. I do not know what to make of the percentiles listed in PSAT Understanding Scores 2015. I am following everybody's analysis on this with a lot of interest. I am not predicting a particular cutoff for GA just based on Walton. I was just trying to say that the Walton data does not necessarily align only with a crazy-high cutoff like 220 or something.