A nit for starters; doesn’t the OP mean “R-squared”, not R?
IMHO the correlation is to be expected for two aptitude tests taken 4 yrs apart and designed (originally at least, iirc) by the same firm, no?
The extent to which a school improves its students' ability to take such tests beyond what their SATs already show looks relatively minor, and hence not a deciding factor (at least among top-tier schools).
Lastly, esp. when measuring such small differences, I think having to use averaged data might really compromise the analysis. Paired scores for individual students would be much better.
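
To illustrate the averaging concern, here is a rough sketch on purely synthetic data (none of it comes from the OP's dataset). The school count, score scale, and noise levels are all made-up assumptions, and `simulate` and `pearson` are hypothetical helper names. It compares the correlation computed from per-student score pairs with the correlation computed from per-school averages of the same students.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient r for two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def simulate(n_schools=30, n_students=200, seed=0):
    """Return (r from student pairs, r from school averages) on synthetic data."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    pair_x, pair_y = [], []    # one entry per student
    avg_x, avg_y = [], []      # one entry per school
    for _ in range(n_schools):
        # Assumed: schools differ in the typical level of student they admit.
        school_level = rng.gauss(600, 50)
        xs, ys = [], []
        for _ in range(n_students):
            # Assumed: each student has a stable ability plus test-day noise.
            ability = school_level + rng.gauss(0, 60)
            xs.append(ability + rng.gauss(0, 80))  # earlier test score
            ys.append(ability + rng.gauss(0, 80))  # later test score
        pair_x.extend(xs)
        pair_y.extend(ys)
        avg_x.append(sum(xs) / n_students)
        avg_y.append(sum(ys) / n_students)
    return pearson(pair_x, pair_y), pearson(avg_x, avg_y)


if __name__ == "__main__":
    r_pairs, r_avg = simulate()
    print(f"student pairs:   r = {r_pairs:.2f}, r^2 = {r_pairs ** 2:.2f}")
    print(f"school averages: r = {r_avg:.2f}, r^2 = {r_avg ** 2:.2f}")
```

Under these assumptions the school-average correlation comes out much higher than the student-level one, because averaging cancels most of the individual noise. That is the sense in which averaged data can overstate how tightly the two tests track each other and can hide small per-student effects.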