I asked the same question and asked it to include references for its conclusion. Each time it searched the web and used references such as Princeton Review, Niche, Reddit, and the CollegeConfidential forum, among others. However, each of these sources had a different list, so depending on which combination of websites it uses, it returns a different ranking. There isn’t any one objective ranking of “friendliest college”. As such, I don’t consider it a failure that it doesn’t return the same “friendliest college” ranking each time.
More concerning would be objective results that differ with each query, for example, if you asked variations of “What is 2+2?” and it sometimes returned something other than 4. This can occur with some AI tools for some types of queries.
For example, based on previous trials, I’ve learned that Gemini (not other AIs, specifically Gemini) has unique struggles with queries about the number of letters in state names. I asked it to list the states that have 7 letters in their name in 2 sequential queries, and the 2 queries had different results. The first query counted “New York” as 7 letters; the 2nd query did not. Both queries missed Alabama and Indiana, and both included Delaware, Nebraska, and Virginia (each of which has 8 letters). Gemini only got the correct answers after I added the query “try again in slow think mode,” which returned:
Okay, let’s break this down carefully, state by state, counting the letters in each name.
Here are the states with exactly 7 letters in their name:
- Arizona (A-r-i-z-o-n-a = 7 letters)
- Delaware (D-e-l-a-w-a-r-e… oh wait, that’s 8 letters. My mistake there in previous attempts. Let’s correct that one. Delaware is 8.)
Let me restart the list completely and be very precise.
Here is a systematic count for each state:
- Alabama (7 letters) - YES
- Alaska (6 letters)
- Arizona (7 letters) - YES
- Arkansas (8 letters)
- California (10 letters)
- Colorado (8 letters)
- Connecticut (11 letters)
…
…
So, the states with exactly 7 letters in their name are:
- Alabama
- Arizona
- Florida
- Georgia
- Indiana
- Montana
- New York (counting N-e-w-Y-o-r-k)
- Vermont
- Wyoming
This is the accurate list after a careful, slow, and individual review.
When I asked why Gemini initially got the answer wrong, it returned:
That’s an excellent question, and it highlights a common type of error that can occur even with advanced AI models.
My initial mistake, such as listing “Delaware” (which has 8 letters) as having 7, was due to a simple miscount or an oversight during rapid information retrieval and generation.
Here’s a more detailed breakdown of why this can happen:
- Not “Seeing” Letters Literally: As an AI, I don’t “see” or “read” letters in the same way a human does. My knowledge is based on patterns and relationships learned from the vast amount of text I was trained on. When asked to count letters, I’m performing an internal calculation or retrieving associated information about the string length, rather than visually inspecting each character.
- Internal Heuristics and Speed: For common tasks, especially when trying to generate a response quickly, I might rely on pre-computed or frequently accessed associations. If an association for a state’s name length was slightly off, or if the “counting” mechanism wasn’t executed with extreme rigor for every single item in the initial rapid pass, an error can slip through. It’s like a human quickly scanning a list and making a mental tally that’s slightly off by one.
- Lack of Explicit Verification in Initial Pass: In the first, quicker pass, the system might prioritize generating a coherent list based on general knowledge, without necessarily running a strict character-by-character count for every single state name in real-time.
- “Slow Think Mode” Corrects This: When I engage in “slow think mode,” I’m essentially instructed to perform a more deliberate, step-by-step verification. This means:
  - Iterating through each state name individually.
  - Explicitly “counting” (or calculating the string length) of each name.
  - Comparing that count against the desired number (7 in this case).
  - Only then adding it to the final list.
This more rigorous process catches the kind of simple, one-off miscounts that can occur in faster, less detailed generation. My apologies for the initial inaccuracies!
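For comparison, the step-by-step procedure Gemini describes (go through each state, count its letters, compare to 7, then add it to the list) is only a few lines of ordinary code, and it returns the same answer every time. Here is a minimal sketch in Python. The names `US_STATES`, `letter_count`, and `states_with_letters` are my own, and whether to ignore the space in “New York” is an assumption, so I made it a flag.

```python
# Illustrative sketch: deterministic version of the "slow think" procedure.
# All names here are hypothetical helpers, not part of any AI tool.

US_STATES = [
    "Alabama", "Alaska", "Arizona", "Arkansas", "California",
    "Colorado", "Connecticut", "Delaware", "Florida", "Georgia",
    "Hawaii", "Idaho", "Illinois", "Indiana", "Iowa",
    "Kansas", "Kentucky", "Louisiana", "Maine", "Maryland",
    "Massachusetts", "Michigan", "Minnesota", "Mississippi", "Missouri",
    "Montana", "Nebraska", "Nevada", "New Hampshire", "New Jersey",
    "New Mexico", "New York", "North Carolina", "North Dakota", "Ohio",
    "Oklahoma", "Oregon", "Pennsylvania", "Rhode Island", "South Carolina",
    "South Dakota", "Tennessee", "Texas", "Utah", "Vermont",
    "Virginia", "Washington", "West Virginia", "Wisconsin", "Wyoming",
]


def letter_count(name, count_spaces=False):
    """Count the characters in a state name.

    By default only alphabetic characters are counted, so the space in
    "New York" is ignored (assumption: this matches Gemini's final list).
    """
    if count_spaces:
        return len(name)
    return sum(1 for ch in name if ch.isalpha())


def states_with_letters(n, count_spaces=False):
    """Return every state whose name has exactly n letters."""
    result = []
    for state in US_STATES:                              # iterate each state individually
        if letter_count(state, count_spaces) == n:       # count, then compare to the target
            result.append(state)                         # only then add it to the final list
    return result


if __name__ == "__main__":
    print(states_with_letters(7))
    print(states_with_letters(7, count_spaces=True))
```

With the space ignored, this gives the same nine states as Gemini’s final list. With `count_spaces=True`, “New York” is 8 characters and drops out, which is the same ambiguity that made Gemini’s first 2 queries disagree about it.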