Groups > comp.os.linux.advocacy > #688961 > unrolled thread

MIT study finds that AI doesn't, in fact, have values

Started by	"Leroy N. Soetoro" <democrat-insurrection@mail.house.gov>
First post	2025-04-13 21:19 +0000
Last post	2025-04-20 15:14 -0500
Articles	2 — 2 participants


  MIT study finds that AI doesn't, in fact, have values "Leroy N. Soetoro" <democrat-insurrection@mail.house.gov> - 2025-04-13 21:19 +0000
    Re: MIT study finds that AI doesn't, in fact, have values --- PLO olcott <polcott333@gmail.com> - 2025-04-20 15:14 -0500

#688961 — MIT study finds that AI doesn't, in fact, have values

From	"Leroy N. Soetoro" <democrat-insurrection@mail.house.gov>
Date	2025-04-13 21:19 +0000
Subject	MIT study finds that AI doesn't, in fact, have values
Message-ID	<lnsB2C091C146DA16F089P2473@0.0.0.2>

https://techcrunch.com/2025/04/09/mit-study-finds-that-ai-doesnt-in-fact-have-values/

A study went viral several months ago for implying that, as AI becomes 
increasingly sophisticated, it develops “value systems” — systems that 
lead it to, for example, prioritize its own well-being over humans. A more 
recent paper out of MIT pours cold water on that hyperbolic notion, 
drawing the conclusion that AI doesn’t, in fact, hold any coherent values 
to speak of.

The co-authors of the MIT study say their work suggests that “aligning” AI 
systems — that is, ensuring models behave in desirable, dependable ways — 
could be more challenging than is often assumed. AI as we know it today 
hallucinates and imitates, the co-authors stress, making it in many 
aspects unpredictable.

“One thing that we can be certain about is that models don’t obey [lots 
of] stability, extrapolability, and steerability assumptions,” Stephen 
Casper, a doctoral student at MIT and a co-author of the study, told 
TechCrunch. “It’s perfectly legitimate to point out that a model under 
certain conditions expresses preferences consistent with a certain set of 
principles. The problems mostly arise when we try to make claims about the 
models, opinions, or preferences in general based on narrow experiments.”

Casper and his fellow co-authors probed several recent models from Meta, 
Google, Mistral, OpenAI, and Anthropic to see to what degree the models 
exhibited strong “views” and values (e.g., individualist versus 
collectivist). They also investigated whether these views could be 
“steered” — that is, modified — and how stubbornly the models stuck to 
these opinions across a range of scenarios.

According to the co-authors, none of the models was consistent in its 
preferences. Depending on how prompts were worded and framed, they adopted 
wildly different viewpoints.

Casper thinks this is compelling evidence that models are highly 
“inconsistent and unstable” and perhaps even fundamentally incapable of 
internalizing human-like preferences.

“For me, my biggest takeaway from doing all this research is to now have 
an understanding of models as not really being systems that have some sort 
of stable, coherent set of beliefs and preferences,” Casper said. 
“Instead, they are imitators deep down who do all sorts of confabulation 
and say all sorts of frivolous things.”

Mike Cook, a research fellow at King’s College London specializing in AI 
who wasn’t involved with the study, agreed with the co-authors’ findings. 
He noted that there’s frequently a big difference between the “scientific 
reality” of the systems AI labs build and the meanings that people ascribe 
to them.

“A model cannot ‘oppose’ a change in its values, for example — that is us 
projecting onto a system,” Cook said. “Anyone anthropomorphizing AI 
systems to this degree is either playing for attention or seriously 
misunderstanding their relationship with AI … Is an AI system optimizing 
for its goals, or is it ‘acquiring its own values’? It’s a matter of how 
you describe it, and how flowery the language you want to use regarding it 
is.”




#689286 — Re: MIT study finds that AI doesn't, in fact, have values --- PLO

From	olcott <polcott333@gmail.com>
Date	2025-04-20 15:14 -0500
Subject	Re: MIT study finds that AI doesn't, in fact, have values --- PLO
Message-ID	<vu3kk3$c1to$5@dont-email.me>
In reply to	#688961

On 4/13/2025 4:19 PM, Leroy N. Soetoro wrote:
> https://techcrunch.com/2025/04/09/mit-study-finds-that-ai-doesnt-in-fact-have-values/
> 
> A study went viral several months ago for implying that, as AI becomes
> increasingly sophisticated, it develops “value systems” — systems that
> lead it to, for example, prioritize its own well-being over humans. A more
> recent paper out of MIT pours cold water on that hyperbolic notion,
> drawing the conclusion that AI doesn’t, in fact, hold any coherent values
> to speak of.
> 

I figured out that AI can have a sufficiently populated
goal hierarchy that would mimic having a will of its own
and also stipulate its value system.

https://en.wikipedia.org/wiki/Chinese_room
Even if AI can perfectly mimic being alive, with a
will of its own and a complete human personality,
the Chinese Room proves that it will always remain
essentially gears and pulleys on the inside, and thus
will never be alive.

> The co-authors of the MIT study say their work suggests that “aligning” AI
> systems — that is, ensuring models behave in desirable, dependable ways —
> could be more challenging than is often assumed. AI as we know it today
> hallucinates and imitates, the co-authors stress, making it in many
> aspects unpredictable.
> 

Hallucinations can be eliminated by anchoring LLM systems
in an axiomatic set of basis facts.

Getting from Generative AI to Trustworthy AI:
What LLMs might learn from Cyc
https://arxiv.org/abs/2308.04445

> “One thing that we can be certain about is that models don’t obey [lots
> of] stability, extrapolability, and steerability assumptions,” Stephen
> Casper, a doctoral student at MIT and a co-author of the study, told
> TechCrunch. “It’s perfectly legitimate to point out that a model under
> certain conditions expresses preferences consistent with a certain set of
> principles. The problems mostly arise when we try to make claims about the
> models, opinions, or preferences in general based on narrow experiments.”
> 
> Casper and his fellow co-authors probed several recent models from Meta,
> Google, Mistral, OpenAI, and Anthropic to see to what degree the models
> exhibited strong “views” and values (e.g., individualist versus
> collectivist). They also investigated whether these views could be
> “steered” — that is, modified — and how stubbornly the models stuck to
> these opinions across a range of scenarios.
> 
> According to the co-authors, none of the models was consistent in its
> preferences. Depending on how prompts were worded and framed, they adopted
> wildly different viewpoints.
> 
> Casper thinks this is compelling evidence that models are highly
> “inconsistent and unstable” and perhaps even fundamentally incapable of
> internalizing human-like preferences.
> 
> “For me, my biggest takeaway from doing all this research is to now have
> an understanding of models as not really being systems that have some sort
> of stable, coherent set of beliefs and preferences,” Casper said.
> “Instead, they are imitators deep down who do all sorts of confabulation
> and say all sorts of frivolous things.”
> 

LLM systems learn new skills far beyond what they
were programmed to do:

Large language models can do jaw-dropping things.
But nobody knows exactly why.
https://www.technologyreview.com/2024/03/04/1089403/large-language-models-amazing-but-nobody-knows-why/

> Mike Cook, a research fellow at King’s College London specializing in AI
> who wasn’t involved with the study, agreed with the co-authors’ findings.
> He noted that there’s frequently a big difference between the “scientific
> reality” of the systems AI labs build and the meanings that people ascribe
> to them.
> 
> “A model cannot ‘oppose’ a change in its values, for example — that is us
> projecting onto a system,” Cook said. “Anyone anthropomorphizing AI
> systems to this degree is either playing for attention or seriously
> misunderstanding their relationship with AI … Is an AI system optimizing
> for its goals, or is it ‘acquiring its own values’? It’s a matter of how
> you describe it, and how flowery the language you want to use regarding it
> is.”
> 
> 


-- 
Copyright 2025 Olcott "Talent hits a target no one else can hit; Genius
hits a target no one else can see." Arthur Schopenhauer
