I want to thank you for those links. I have only read the first paper in detail, but to me it has the same issues that I have seen in previous papers.
But first, some background. Most of my 30 years since college have been spent analyzing data, using regressions and simulations among other techniques. Before tools like MATLAB, SAS, and R became highly popular, I wrote statistical analysis code in the 90s that was widely used at the time, and I consulted for many companies, including a few Fortune 500 companies, on how to perform statistical analysis properly. I still do statistical analysis today, for a financial company.
When I have issues with academic papers, it often comes down to two things. First, many academics just love regressions. It is their hammer, and many see everything as a nail. While regressions are useful, the major problem with them is that they do not reveal the non-linear behavior that occurs in different portions of the data. When you partition the data and analyze a subset of it, the pattern a regression reveals on the subset is often the opposite of the pattern on the larger data set. Here is one simple example. If we were to regress post-graduate income on college GPA for all college students, we would find a statistically significant positive loading on GPA. However, if we only consider the students at Harvard, we would likely see the opposite effect (the lower GPA students are often more financially successful).
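To make the partitioning point concrete, here is a minimal sketch in Python on synthetic data. Every number in it is invented for illustration: the two groups, their sizes, the coefficients, and the noise levels are assumptions, not estimates from any real data set. It only shows that a pooled regression can report a positive slope while a subset built with a negative relationship sits hidden inside it.

```python
import numpy as np

rng = np.random.default_rng(0)

def ols_slope(x, y):
    """Slope b of the least-squares line y = a + b*x."""
    return np.polyfit(x, y, 1)[0]

# Hypothetical broad population: income (in thousands) rises with GPA.
# All coefficients below are made up for illustration.
n_broad = 5000
gpa_broad = rng.uniform(2.0, 4.0, n_broad)
income_broad = 30 + 10 * gpa_broad + rng.normal(0, 5, n_broad)

# Hypothetical elite subset: assumed negative relation between GPA and income.
n_elite = 200
gpa_elite = rng.uniform(3.0, 4.0, n_elite)
income_elite = 150 - 15 * gpa_elite + rng.normal(0, 5, n_elite)

# Pool everyone together, as a single regression on all students would.
gpa_all = np.concatenate([gpa_broad, gpa_elite])
income_all = np.concatenate([income_broad, income_elite])

pooled_slope = ols_slope(gpa_all, income_all)
subset_slope = ols_slope(gpa_elite, income_elite)

print(f"pooled slope: {pooled_slope:.2f}")
print(f"elite-subset slope: {subset_slope:.2f}")
```

The pooled fit says nothing about the subset; you only see the reversal if you fit the subset separately.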
This leads into my second criticism of the article, which is that they built the model using the wrong blueprints. Specifically, they performed their analysis on students who had already been partially selected on the basis of their SAT scores. The ideal way to do this test would be to require every student to take the SAT, ignore it for the purpose of admission, and then analyze GPA as a function of SAT score afterwards. Now, that might not be possible, at which point they should have said, in effect: this is the right way to do it, but we couldn’t, so here is the best we could do. Instead, they just ignored the partitioned data problem that I mentioned above and kept hammering until they got results which they presume are meaningful because they pass a t-test.
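One well-known way that selecting on SAT can distort the result is range restriction. The sketch below, again in Python on synthetic data, assumes standardized SAT and GPA scores with a true correlation of about 0.5, and a hypothetical admission index equal to SAT plus unrelated noise, with the top 20% admitted. None of these numbers come from the paper; they are assumptions chosen only to show the direction of the effect.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# Hypothetical standardized scores; GPA built to correlate ~0.5 with SAT.
sat = rng.normal(0, 1, n)
gpa = 0.5 * sat + rng.normal(0, np.sqrt(1 - 0.5**2), n)

# Assumed admission rule: SAT plus other factors (modeled as noise),
# so students are only partially selected on SAT. Top 20% are admitted.
admission_index = sat + rng.normal(0, 1, n)
admitted = admission_index > np.quantile(admission_index, 0.8)

# Correlation on everyone who took the test vs. on admitted students only.
r_full = np.corrcoef(sat, gpa)[0, 1]
r_admitted = np.corrcoef(sat[admitted], gpa[admitted])[0, 1]

print(f"correlation, all test takers: {r_full:.3f}")
print(f"correlation, admitted only:   {r_admitted:.3f}")
```

In this setup the SAT-GPA correlation among admitted students comes out weaker than in the full population, even though the underlying relationship is identical for everyone. An analysis run only on admitted students sees the second number and never the first.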