College admissions are not independent events

@csdad2 I think we can agree that the “probabilities” (if that is the right word) of admission to colleges are, in most cases, unknown, and perhaps unknowable from our point of view.

I think what you’re referring to is the concept of Bayesian probability, which is an interpretation of probability based on states of belief. In fact there is an entire Wikipedia article on interpretations of probability ([here](https://en.wikipedia.org/wiki/Probability_interpretation)).

Here’s another scenario (however unrealistic) where admissions to two colleges are provably independent:

Suppose I apply to two colleges P and Q (I’ll avoid A and B since they are overused). P and Q do not communicate in any way, and the amount of information each college has is constant. In particular, the amount of information P has does not depend on whether I applied to or got accepted by Q.

P and Q each have their own function of GPA, test scores, letters of recommendation, etc. that outputs a real number between 0 and 1 (representing the probability of my acceptance). Maybe they’re monotonic in GPA, but that doesn’t matter. Call these numbers p and q. Then college P simulates a truly random event that occurs with probability p, and accepts me iff the event occurs. Same with Q.

In this case, I don’t actually know p or q, and the two values are likely correlated across applicants (a strong application tends to raise both). But for me, with p and q fixed, Pr(Q|P) = q and Pr(P and Q) = pq, i.e. acceptances to P and Q are independent.
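Here is a minimal sketch of that scenario, using exact arithmetic rather than simulation. The two applicant profiles and their (p, q) values are made-up numbers for illustration only; nothing in this thread specifies them. For one applicant with fixed p and q, the joint probability factors as pq. For an applicant drawn at random from a mixed pool, which is closer to what an outside observer sees, it generally does not.

```python
from fractions import Fraction

# Hypothetical applicant profiles: (share of pool, p, q).
# These numbers are illustrative assumptions, not data.
POOL = [
    (Fraction(1, 10), Fraction(4, 5), Fraction(3, 5)),    # strong applicants
    (Fraction(9, 10), Fraction(1, 20), Fraction(1, 20)),  # everyone else
]


def joint_for_applicant(p, q):
    """Each college runs its own independent random draw, so the joint is p*q."""
    return p * q


def pool_probabilities(pool):
    """Return (Pr(P), Pr(Q), Pr(P and Q)) for an applicant drawn at random from the pool."""
    pr_p = sum(w * p for w, p, _ in pool)
    pr_q = sum(w * q for w, _, q in pool)
    pr_both = sum(w * joint_for_applicant(p, q) for w, p, q in pool)
    return pr_p, pr_q, pr_both


pr_p, pr_q, pr_both = pool_probabilities(POOL)
print("Pr(P) =", pr_p, " Pr(Q) =", pr_q)
print("Pr(P and Q) =", pr_both, " vs Pr(P)*Pr(Q) =", pr_p * pr_q)
print("Pr(Q | P) =", pr_both / pr_p)
```

With these assumed numbers, Pr(Q | P) over the pool comes out well above Pr(Q), even though every individual applicant's two outcomes are independent draws.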

Just saw this thread lol. @MITer94 I am familiar with Newcomb’s problem. Are you familiar with the Bayesian resolution which posits that probability is relative? I think that best concurs with how probability is applied in science and the real world.

Let me rephrase it: suppose I shuffle a deck and peek at the top card - it’s the Queen of hearts. Now I ask you, who haven’t seen the deck, what the probability is that the top card is a spade? Of course you say 1/4. I ask you what the probability is that the top card is the Queen of hearts? You’d say it’s 1/52. But to me the first probability is 0 and the second is 1.

Who’s right? Well, we both are - given the information that we respectively have. That just reflects the fact that there is more information available to me than to you, and so the random variable that is the card has less entropy for me than for you. The most puritanical devotees of frequentism might call this “credence” rather than “true” probability, but personally I prefer this interpretation of probability since in the real world, unlike in this contrived example, we often have no way of “peeking at the cards” beforehand. Philosophically speaking, probabilities don’t just “exist” in the real world; we assess them given the information available to us in order to model the world. In fact, if you believe we live in a deterministic world (or at least macroscopically deterministic; let’s leave aside quantum effects) the probability of anything is 0 or 1.
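A small sketch of the card example, assuming a standard 52-card deck and treating every card the observer still considers possible as equally likely. The helper names (`probability`, `entropy_bits`) are just for illustration.

```python
from fractions import Fraction
from math import log2

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["spades", "hearts", "diamonds", "clubs"]
DECK = [(rank, suit) for suit in SUITS for rank in RANKS]


def probability(event, possible_cards):
    """Pr(event) when each card in possible_cards is equally likely to be on top."""
    hits = sum(1 for card in possible_cards if event(card))
    return Fraction(hits, len(possible_cards))


def entropy_bits(possible_cards):
    """Shannon entropy (in bits) of a uniform distribution over possible_cards."""
    return log2(len(possible_cards))


def is_spade(card):
    return card[1] == "spades"


def is_queen_of_hearts(card):
    return card == ("Q", "hearts")


you = DECK                # you have not seen the deck: any card could be on top
me = [("Q", "hearts")]    # I peeked: only one card is possible

for name, view in (("you", you), ("me", me)):
    print(name,
          "Pr(spade) =", probability(is_spade, view),
          "Pr(Q of hearts) =", probability(is_queen_of_hearts, view),
          "entropy =", entropy_bits(view), "bits")
```

Both observers apply the same rule; only the set of cards they consider possible differs.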

Of course always answering “0 or 1 but I have no way to tell which” would make probability useless, since the whole point is to express how sure we are of something happening, which is necessarily related to how much information we have.

@rejectedlion2016 Yeah, it just serves to show the many different interpretations of what we call probability.

Same thing with independence: admissions (or any other set of events) seem to be independent when viewed from one perspective, and not independent when viewed from another.

No, you misunderstand me. The events are dependent by definition. That’s a mathematical fact. However, dependence doesn’t mean what one might intuitively think it means, and it does not imply causality at all. Correlation is not causation.

@rejectedlion2016 Yes, I know that two events being dependent does not imply causality. What I might have misunderstood was which probabilities we use when defining dependence.

For example, if we assigned probabilities of two events based on a Bayesian interpretation (i.e. state of belief) versus a frequentist interpretation, we might come up with two different conclusions. I didn’t say anything about causality. However this is mostly philosophical in nature, which I’d rather not get into too deeply.

Interesting discussion. I agree decisions by different schools are not independent events, but there does seem to be a high degree of noise, or whatever the right term would be, in admissions decisions for the schools admitting low percentages.

For example, of 2145 admits to the Stanford class of 2018, only 214 were also admitted to Harvard. I don’t know the corresponding numbers for Harvard but assume they’re broadly similar, as both schools have ~80% yield (of course the ~20% not going to Stanford or Harvard go to a variety of other schools). Assuming there is considerable overlap in the applicant pools, this would seem to say that even if the two schools identify competitive applicants in similar ways, they often reach different yes/no decisions within that competitive group. I suspect the same is true of decisions at MIT vs. Caltech, Yale vs. Princeton, etc.

Now there are some students where there’s an obvious reason why they were admitted to one school but not another similar school, such as having parents who are prominent, active alums at one school, which doesn’t particularly help at another. I also think, though, that at places with very low admission rates, those making decisions have to work pretty hard to differentiate among applicants, and then hard-to-predict factors start to come into play. For instance, the first application reader at one school really liked an applicant’s essays, where the first reader at a peer school didn’t.

@bluewater2015, even assuming complete overlap of the applicant pools, and discounting yield protection (yes, every school practices that), hooks, etc., mathematically your data just shows that Stanford admits have, on average, a 10% probability of getting into Harvard, compared with 5% for the entire pool.

Yes, that’s right, assuming everyone applied to both. My point was really that the two events, while not independent, don’t seem all that strongly related.

Not all that strongly related? :open_mouth: I’d say doubling your chances (p(H|S) = 2*p(H)) indicates a pretty strong relationship…
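For reference, the arithmetic behind that factor of two can be sketched as follows. It uses only the figures quoted in this thread (2145 Stanford admits, 214 cross-admits, a 5% pool-wide Harvard rate) and assumes, as stated above, that every Stanford admit also applied to Harvard.

```python
stanford_admits = 2145      # Stanford class of 2018 admits, as quoted in the thread
also_harvard = 214          # of those, also admitted to Harvard
harvard_rate_pool = 0.05    # pool-wide Harvard rate quoted in the thread

# Conditional rate p(H|S), assuming all Stanford admits applied to Harvard.
p_h_given_s = also_harvard / stanford_admits

# How many times more likely a Stanford admit is to get into Harvard
# than an applicant drawn from the whole pool.
lift = p_h_given_s / harvard_rate_pool

# Cross-admits we would expect if the two decisions were independent.
expected_if_independent = stanford_admits * harvard_rate_pool

print(f"p(H|S) = {p_h_given_s:.3f}")
print(f"lift over pool rate = {lift:.2f}x")
print(f"expected cross-admits under independence = {expected_if_independent:.0f}")
```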

It’s more than double really, as obviously not all of those 2145 Stanford admits applied to Harvard. And “not all that strongly related” is a subjective criterion at some level . . . of course it’s a statistically significant relationship.

A college’s admit rate is not a probability of admission. The applicants are not identical in attributes, like identically shaped playing cards in a deck. Applicants do not each have identical desirability to a school.

A college’s admit rate is like the average number of gazelles that lions in Africa kill on the savannah. A strong, fast lion will catch a gazelle almost every time, just like Malia Obama will get admitted into just about every school she applies to. A slow, weak lion will almost never catch a gazelle. College admit rates are the aggregated outcomes of slow lions and fast lions.

So is the likelihood of getting into Harvard related to the likelihood of getting into Stanford? Are you a slow lion or a fast lion? Are you Malia’s little sister?

@GMTplus7 I think that obscures the function of probability. Probability helps us to assign a confidence in the absence of complete information. Of course you wouldn’t dispute that we live in a deterministic world, but you would still say the probability of getting a heads when flipping a coin is 1/2, even though given all the information about the physics of the flip it’s deterministically either heads or tails.

It’s not possible for an individual to know for sure how adcoms will see his/her application. People are also notoriously incapable of avoiding bias when asked to judge themselves holistically. By using data about other applicants and similar students you can make predictions and assign a confidence to those predictions, which is really the value of probability.

@GMTplus7, while the very top students will have their pick of schools, there is also plenty of probability involved for the next tier or two of students. Hypothetically, let’s say Harvard ranks all the students from 1 to 40,000 based on their “desirability” to the school. Let’s say Harvard then shuffles all the applications and ranks them again. The order is going to change depending on many factors, like “phase of the moon”. While it’s unlikely for an applicant ranked 10,000th to move up to 1,000th the second time around, they may very well end up 8,000th or 12,000th, which I agree wouldn’t make a difference. On the other hand, someone ranked 1,800th may end up 2,200th or vice versa, which could be the difference between an acceptance and a rejection, assuming the school admits 2,000.
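The re-ranking thought experiment can be sketched as a small simulation. Everything numeric here is an assumption made for illustration: desirability is drawn from a standard normal distribution, each read adds independent Gaussian "reader noise" with standard deviation `noise_sd`, and the school admits the top `n_admit` by noisy score. The function names are hypothetical too.

```python
import random


def admitted_set(true_scores, noise_sd, n_admit, rng):
    """Rank applicants by true score plus fresh reader noise; admit the top n_admit."""
    noisy = [(score + rng.gauss(0.0, noise_sd), i) for i, score in enumerate(true_scores)]
    noisy.sort(reverse=True)
    return {i for _, i in noisy[:n_admit]}


def rerank_overlap(n_applicants=40_000, n_admit=2_000, noise_sd=0.3, seed=0):
    """Fraction of first-round admits who are admitted again after an independent re-read."""
    rng = random.Random(seed)
    true_scores = [rng.gauss(0.0, 1.0) for _ in range(n_applicants)]
    first = admitted_set(true_scores, noise_sd, n_admit, rng)
    second = admitted_set(true_scores, noise_sd, n_admit, rng)
    return len(first & second) / n_admit


print("share of admits who survive a re-read:", rerank_overlap())
```

Raising `noise_sd` lowers the overlap, and setting it to zero reproduces the same admitted class both times, so the result depends entirely on how noisy you assume the reading to be.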

Of course, there’s probability. All I’m saying is that **the school’s admit rate is not the probability for a given student.** Too many people treat them as the same thing.

Malia Obama’s probability of getting into far-reach schools might be 85-100% (maybe Caltech doesn’t think she’s a fit, or some small LAC doesn’t want the security headaches), even though these schools’ admit rates are in the single digits.

Forrest Gump’s probability of admission into all the same schools may be 0-2%.
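To make the distinction concrete, here is a rough sketch with invented numbers. The three groups, their sizes, and their individual probabilities are all hypothetical; the point is only that a single-digit aggregate admit rate can come from a pool in which no one actually has that probability.

```python
# Hypothetical groups: (label, number of applicants, individual probability of admission).
# All figures are made up for illustration.
GROUPS = [
    ("fast lions", 1_000, 0.90),
    ("middle of the pack", 9_000, 0.10),
    ("slow lions", 30_000, 0.01),
]


def aggregate_admit_rate(groups):
    """Expected number of admits divided by total applicants."""
    total_applicants = sum(n for _, n, _ in groups)
    expected_admits = sum(n * p for _, n, p in groups)
    return expected_admits / total_applicants


rate = aggregate_admit_rate(GROUPS)
print(f"aggregate admit rate: {rate:.2%}")
for label, n, p in GROUPS:
    print(f"  {label}: {n} applicants, individual probability {p:.0%}")
```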

But Harvard doesn’t do that. It doesn’t put all 40 thousand (or however many they get) applicants in rank order like in a nice histogram.

It has separate buckets of applicants. The development case candidates are in a separate bucket from the recruited athlete bucket. “POTUS for a dad” is a very exclusive bucket.