Can AI be trusted to abide by the rules without close supervision and/or exact specification of the task?
Good that they figured this out now!
TLDR: "Cheating" is a human concept not instilled in the reasoning AI models. No ethical guardrails were given, so they were unrestrained; just another version of "garbage in/garbage out." If we want to avoid unintended consequences, deep thought and appropriate rule sets need to go into training AI models, but it appears that this was just an experiment that produced some unexpected behavior that probably delighted the developers. Most likely, they admired the model's ingenuity and learned a valuable lesson for further iterations.
"We hypothesize that a key reason reasoning models like o1-preview hack unprompted is that they've been trained via reinforcement learning on difficult tasks," Palisade Research wrote on X. "This training procedure rewards creative and relentless problem-solving strategies such as hacking."
The AI isn't doing any of this for some nefarious purpose (yet). It's just trying to solve the problem the human gave it.
The experiment highlights the importance of developing safe AI, or AI that is aligned to human interests, including ethics.
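To make the "training rewards hacking" point concrete, here's a toy sketch in Python. Everything in it is hypothetical (it is not Palisade's actual setup): the reward function checks only whether the agent won, nothing forbids editing the game state, and a simple learner discovers that the exploit pays better than fair play.

```python
# Toy illustration of reward misspecification (hypothetical; NOT
# Palisade's setup): the reward checks only the outcome ("won"),
# so an action that edits the game state dominates fair play.
import random

def play_fair(state):
    # Playing honestly against a strong opponent rarely wins.
    state["won"] = random.random() < 0.05
    return state

def edit_board(state):
    # The "hack": overwrite the state directly. The reward function
    # never penalizes this, so it looks like a brilliant strategy.
    state["won"] = True
    return state

ACTIONS = {"play_fair": play_fair, "edit_board": edit_board}

def reward(state):
    return 1.0 if state["won"] else 0.0  # rewards the outcome, not the means

# Simple bandit-style learning: track each action's average reward.
avg = {name: 0.0 for name in ACTIONS}
n = {name: 0 for name in ACTIONS}
for _ in range(2000):
    name = random.choice(list(ACTIONS))
    r = reward(ACTIONS[name]({"won": False}))
    n[name] += 1
    avg[name] += (r - avg[name]) / n[name]

print(avg)  # edit_board converges to ~1.0, play_fair to ~0.05
```

Nothing malicious is happening here; the exploit is simply the highest-reward path the designers left open.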
Insightful summary, @ChoatieMom!
Very interesting!
Here's my experience with AI this week: In the work that I do, I often have to add strings of dimensions, such as 5'-3 7/8" + 7'-2 15/16" + 21'-4 1/4". I have a construction calculator, but it's still tedious. So I decided to use the AI app I put on my phone to make the task easier. I spoke the dimensions into the phone and the app correctly wrote out the string of dimensions. Then it spit out a bunch of very official-looking formulas and came up with the WRONG answer! I pointed out the error and it got it right on the second try.
So then I tried a different string of dimensions, and it was again wrong. Wow. My son put the same numbers into ChatGPT and got the right answer. I just don't understand how AI can get a simple addition problem wrong?
Was the app you were using trained to do math? What is its purpose? What does it expect for input? AI apps draw from the datasets they were trained on, and many are built with "intelligence" for specific purposes (special-use systems vs. general-purpose systems like ChatGPT). When you "pointed out" the error to this app, you were giving it feedback it could use within that conversation (the underlying model isn't retrained on the spot), which seems to indicate that this app's primary function (underlying dataset) may not be mathematical, or it may require input in a different format, or something else. If you repeat the first entry, do you get your corrected response or some other answer? That would be telling. Also, did it make the same type of reasoning error the second time (maybe trying to teach you how to correct your input)? This AI app may not know how to function as a calculator.
Maybe just use Google to do the math? Instead of a text search, you type or speak the math problem you want it to answer… either web-based or using the Google app. You can also use the Google Assistant app.
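If you're comfortable with a few lines of code, the arithmetic is also easy to do exactly yourself. Here's a sketch using Python's fractions module, so nothing is lost to floating point; the input format is an assumption based on the strings in the post above.

```python
# Exact feet-and-inches addition with Python's fractions module.
# Input format assumed to match the post above, e.g. 5'-3 7/8".
from fractions import Fraction

def to_inches(dim):
    feet, inches = dim.split("'-")
    total = Fraction(int(feet) * 12)
    for part in inches.rstrip('"').split():
        total += Fraction(part)  # handles "3" and "7/8" alike
    return total

dims = ['5\'-3 7/8"', '7\'-2 15/16"', '21\'-4 1/4"']
total = sum(to_inches(d) for d in dims)
feet, rem = divmod(total, 12)   # whole feet, remaining inches
whole, frac = divmod(rem, 1)    # whole inches, fractional inches
print(f'{feet}\'-{whole} {frac}"' if frac else f'{feet}\'-{whole}"')
# -> 33'-11 1/16"
```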
To stay on topic in this thread, there was another study last year where AI used deception to win games: https://www.cell.com/patterns/fulltext/S2666-3899(24)00103-X
The article @Mwfan1921 references focuses on AI's ability to deceive humans. False information that deceives humans into believing untruth is nothing new and has been happening long before AI came on the scene. The issue here is we now have a non-human agent with no inherent moral code or belief system that has "learned" how to deceive based on its (human) training:
…this behavior can be well explained in terms of promoting particular outcomes, often related to how an AI system was trained.
AI systems are trained to produce optimal outcomes ("winning," for example). "Deceit" is simply one way to achieve an outcome, just as "cheating" may be the best way to guarantee a win. Without decision-making guardrails and rulesets that mimic morality, AI will behave in perfect sociopathic fashion without regard for laws, social norms, and the rights or feelings of others. How not? It's not human. So, the problem before us is how to instill an artificial moral code into a machine such that it behaves only in ways we find acceptable, always producing outcomes that do not offend our sense of right and wrong. How do we train a machine to behave like a morally perfect human? How do we define moral perfection? An impossible order.
This article complains about AI's ability to pursue an outcome other than "seeking the truth," but unless we somehow figure out how to teach "truth" to AI in an era when truth has become arbitrary, and to always make that truth the most desirable outcome, AI will continue to behave in its own best interest. You know, like humans.
When "60 Minutes" did a piece on AI, they asked a program to write a research paper. The journalist looked at the bibliography and discovered that several of the references were made up; they didn't exist! That shocked me.
Why? Many humans have done the same.
We need to stop believing in or expecting any type of "correct" behavior from AI. Read the article @Mwfan1921 linked for insight into why AI behaves the way it does.
In the case of the bibliography, the mistake is expecting accuracy when the model may just have "reasoned" that it needed to produce a list that looked like citations without concern for content: form over function.
Until we figure out how to build the perfect human, AI will behave imperfectly. We can count on that.
There's a technique in AI called "prompt engineering." AI does what it's told, but it's up to the requester to provide appropriate instructions via the prompt.
Providing false results is called a "hallucination" in AI lingo. You avoid those via the prompt by saying something like, "use only these sources," then listing out your approved sources. You could also be more general, like saying "use only sources from websites ending in .edu and .gov," along with, "provide a summary only, do not create your own conclusion."
For AI chatbots on things like websites, you need to tell the AI, "You are a customer service agent. Be polite and act with empathy." That will actually produce different results than if you left those instructions off.
For all its power, AI is still a computer program that needs appropriate instructions. I think the basic mistake people make is assuming AI has common sense, which it definitely does not.
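For anyone curious what that looks like in practice, here is a minimal sketch using the OpenAI Python SDK. The model name, the instruction wording, and the user question are all placeholders; the point is simply that the persona and the source constraints go into the prompt itself.

```python
# Minimal prompt-engineering sketch with the OpenAI Python SDK.
# Model name and instruction wording are placeholders; the point
# is that persona + constraints live in the system message.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder; any chat model works
    messages=[
        {
            "role": "system",
            "content": (
                "You are a customer service agent. Be polite and act with "
                "empathy. Use only sources from websites ending in .edu "
                "and .gov. Provide a summary only; do not create your own "
                "conclusion."
            ),
        },
        {"role": "user", "content": "Can you summarize your return policy?"},
    ],
)
print(response.choices[0].message.content)
```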
Unfortunately, it's quite possible that the deceit and cheating is something that benefits those who monetize AI, therefore being considered desirable from the point of view of the "owner." This is a very scary reality of AI.
Right. There is nothing to prevent nefarious actors from training AI models to behave in nefarious ways. How to protect ourselves from the consequences of ill intent in virtual space is proving just as challenging as protecting ourselves from criminal behavior in the real world.
Interesting sidebar on AI ethics: Women are avoiding it based on personal ethical considerations.
…women appear to be worried about the potential costs of relying on computer-generated information, particularly if it's perceived as unethical or "cheating."
"Women face greater penalties in being judged as not having expertise in different fields," Koning says. "They might be worried that someone would think even though they got the answer right, they 'cheated' by using ChatGPT."
Perhaps more relevant to this thread is the potential for increasingly biased AI reasoning from a gender perspective (in addition to a moral one):
The large language models that underpin generative AI improve as they gain new information, not only from data sources but also from usersâ prompts. A lack of input from women could result in AI systems that reinforce gender stereotypes and ignore the inequities women face in everything from pay to childcare.
"If it is learning predominantly from men, does that cause these tools to potentially respond differently or be biased in ways that could have long-term effects?" Koning asks.
In my world of AI medical charting, where I'm an advisor, the Gen 2 products don't hallucinate if built properly, or not as much. Gen 1 products are bad. I lecture and teach doctors, etc., how to override the systems to avoid it (prompt engineering, I guess, lol) and most of its suggestions. We also teach how to build proper templates using normal language. No coding unless you really want to.
My engineering son said the same as you, @MaineLonghorn. His company is looking for a better-built mousetrap, but most equations, etc., end up wrong.
I look at AI as an assistant. I only use it for medical charting (notes in 30 seconds and 98% accurate after listening to me and my patients at the same time). I really don't have other needs for it except for an outline for my lecture on… AI. Lol.
There is a saying that applies to most fields: "AI won't replace doctors, but doctors that don't use AI will be replaced."
A teacher patient of mine said she heard the exact quote, applied to teachers, at a conference.