<p>My D is using the Princeton Review practice tests. When we graded her first one, the instructions have you count the correct responses and subtract a quarter point for each incorrect one to come up with a Raw Score. </p>
<p>You then go to the Conversion Table to convert that Raw Score to a Scaled Score.</p>
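<p>The grading procedure described above can be sketched in a few lines. The conversion-table values here are made up purely for illustration; each real practice test ships its own table.</p>

```python
# Sketch of the raw-score calculation described above:
# raw score = correct answers minus a quarter point per wrong answer.

def raw_score(num_correct, num_incorrect):
    """Raw score per the described grading rule."""
    return num_correct - num_incorrect / 4

# Hypothetical slice of a conversion table (raw score -> scaled score range);
# real tables differ from booklet to booklet.
CONVERSION_TABLE = {
    50: (700, 740),
    51: (710, 750),
    52: (720, 760),
}

raw = round(raw_score(56, 16))   # 56 right, 16 wrong -> raw score of 52
low, high = CONVERSION_TABLE[raw]
print(f"Raw {raw} converts to a scaled range of {low}-{high}")
```

<p>Rounding before the table lookup mirrors how the printed tables work, since they are indexed by whole raw scores.</p>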
<p>Why are all those Scaled Scores (all 3 sections) shown as ranges (i.e., 710-750)?</p>
<p>Curves vary from test to test, so they want to give you the low end and the high end. Chances are that the score would be around the middle of the range.</p>
<p>However, it is not generally a good idea to use unofficial tests as indicators of how you will score on the real thing; most people recommend avoiding these tests altogether. Does your daughter have the Official SAT Study Guide?</p>
<p>Yes, Princeton Review is a great resource for the ACT and many AP tests. For the SAT, however, the official practice tests are the most valuable resource. The best way to prepare for the SAT is to practice with material as close to the real thing as possible.</p>
<p>Turtle, do you feel the Princeton practice tests for SAT are not as valid/valuable as they are for ACT? My D did 2-3 of them for ACT – got 31/32 composites and then got a 34 on the actual test.</p>
<p>We were hoping for a similar experience with SAT.</p>
<p>She doesn’t have the Official Guide you mentioned, but I could go grab it tomorrow. She plans to do another full practice test tomorrow, and another on Friday.</p>
<p>Today her PR score ranges were W 760-800, R 730-770, M 640-680.</p>
<p>Giving a score range is actually more realistic than a single score. The range reflects the standard error of measurement that is part of any scaled test score. If you look at your score report you will also see what the error range is.</p>
<p>This is why a 780 is the same as an 800 to most colleges (who understand measurement), even though cc’ers always want to take it again for a “perfect” score.</p>
<p>The fact that a score is within one standard error of another score does not make those scores the same. Retesting 800ers one hundred times and doing the same with 780ers would surely reveal that the 800ers do better on average as a group, thereby illustrating my point.</p>
<p>It doesn’t make them “the same,” it makes them “not statistically different.” Tomorrow the 780 could get an 800 and the 800 could get a 780, all due to chance.</p>
<p>All that retesting would reduce the standard error, and produce more reliable scores.</p>
<p>It’s also possible that many of the 780’s will end up as 800’s and vice versa.</p>
<p>By definition, those with the highest scores after 100 tests have done better than those with lower scores. However, I would not bet any money on predicting the final performance over 100 tests based on one score.</p>
<p>What is the point that you say this illustrates?</p>
<p>With the application of an arbitrary standard for “different” that is sufficiently large, yes.</p>
<p>And so could someone score 450 one day and 800 the next; would that render the scores the same? Again, you have to apply some arbitrary standard. Your standard could be large enough that 450 and 800 are “not statistically different,” but you wouldn’t have many supporters.</p>
<p>I agree with this, but only because the group of students who scored 800 includes those whose abilities are well beyond the capability of the SAT to distinguishably measure.</p>
<p>In general (away from the end points of the scale), a score range of ± 30 points is pretty typical, according to the CB. For example, a score of 650 would typically result in a score range of 620-680. A score difference of 20 points (e.g., 640 vs 660) between two people in this context is not significant.</p>
<p>^I completely agree with everything you said.</p>
<p>However, once you start scoring close to 800 (750-800), getting −1 (one more question wrong) begins to impact your score significantly. For example, a silly mistake on the math section could result in a 30-40 pt difference. </p>
<p>My point is this: the SAT produces less reliable scores as your scores increase. (Someone with a 1500 will usually get 1450-1550, whereas someone with a 2300 will usually get 2220-2380.) This is because as your scores increase, −1 becomes a larger score deduction.</p>
<p>No, I don’t think so; your range is too large. The people I know who score 2380+ almost never went below 2350 on practice tests, and the people who score ~2200 almost never hit 2300.</p>
<p>Though this error will vary slightly on each administration, it is pretty much consistent.</p>
<p>Reliability on each section of the test runs around .9, meaning any change in score beyond the standard error represents a real difference in performance.</p>
<p>Some definitions that the College Board uses:</p>
<p>Standard Error of the Difference (SED): The SED is a tool for assessing how much two test scores must differ before they indicate ability differences. To be confident that two scores indicate a true difference in ability, the scores must differ by at least the SED times 1.5. For example, SAT verbal and math scores must differ by 60 points (40 × 1.5) in order to indicate true differences of ability.</p>
<p>Standard Error of Measurement (SEM): The SEM is an index of the extent to which students’ obtained scores tend to vary from their true scores. It is expressed in score units of the test. Intervals extending one standard error above and below the true score (see below) for a test taker will include 68 percent of that test taker’s obtained scores. Similarly, intervals extending two standard errors above and below the true score will include 95 percent of the test taker’s obtained scores.</p>
<p>True Score (see Standard Error of Measurement): True score is a hypothetical concept indicating what an individual’s score on a test would be if there were no error introduced by the measuring process. It is thought of as the hypothetical average of an infinite number of obtained scores for a test taker with the effect of practice removed.</p>
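<p>A quick sketch of how those two definitions turn into numbers. The SEM of 30 matches the typical ±30-point band mentioned earlier in the thread, and the SED of 40 comes from the College Board example above (40 × 1.5 = 60); both figures are illustrative, not official per-administration values.</p>

```python
# Illustrative SEM/SED arithmetic based on the definitions quoted above.
SEM = 30  # assumed per-section standard error of measurement, in score points

def sem_interval(obtained, n_se=1):
    """Band of +/- n_se standard errors around an obtained score.
    One SE covers ~68% of retest scores, two SEs cover ~95%."""
    return obtained - n_se * SEM, obtained + n_se * SEM

def truly_different(score_a, score_b, sed=40):
    """Per the CB rule of thumb, scores must differ by at least 1.5 * SED
    before they indicate a true difference in ability."""
    return abs(score_a - score_b) >= 1.5 * sed

print(sem_interval(650))          # the 620-680 band quoted earlier
print(truly_different(640, 660))  # a 20-point gap: not significant
print(truly_different(700, 780))  # an 80-point gap: exceeds the 60-point bar
```

<p>This is also why a 780 vs. 800 comparison fails the test: the 20-point gap is well inside one standard error, let alone the 60-point SED threshold.</p>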
<p>The calculation of score, SEM, SED, and the supporting statistical characteristics are reviewed by colleges who would not use the scores if they felt the SEM and SED were arbitrary, rather than statistical. Because the population size is so large, the numbers usually vary little from administration to administration.</p>
<p>The ranges can’t be used to compare practice tests to actual tests. They only compare scaled scores earned on a given administration. It is always dangerous to generalize anecdotal data, though that seems to be one of the major reasons to post on CC.</p>