Re: Trying to teach ChatGPT algebra

Message-ID	<63fa9a51@news.ausics.net>
From	not@telling.you.invalid (Computer Nerd Kev)
Subject	Re: Trying to teach ChatGPT algebra
Newsgroups	comp.misc
References	<k4nvo9Fk1erU1@mid.individual.net> <k4o1ccFkl0rU1@mid.individual.net> <k4pf6gFrekeU1@mid.individual.net> <k4qh10F1sn4U1@mid.individual.net> <k55ilaFn5esU1@mid.individual.net>
Date	2023-02-26 09:31 +1000
Organization	Ausics - https://www.ausics.net


Sylvia Else <sylvia@email.invalid> wrote:
>> I've pretty much hit a wall with this experiment. Even within the same 
>> session, getting ChatGPT to recognise that it's made a mistake does not 
>> mean it won't make the same mistake again.
>> 
>> It's like trying to teach a dumb student something that is beyond them. 
>> Even when you think they've finally got it, it turns out that they haven't.
>> 
>> And this is just with easy stuff. I have no hope that it would ever 
>> learn to apply more complicated manipulations correctly.
>> 
>> Perhaps my whole approach is misconceived.
> 
> On further research[*] I think that last comment is correct. One is not 
> actually teaching it anything during one of these sessions. One is 
> merely adding to the text that it will use as input to its neural 
> network to determine the next word to output. I wondered why its outputs 
> come as a slowish sequence of words, separated in time by significant 
> intervals. I believe this is because during those intervals it is 
> determining the next most probable word to follow the previous words in 
> the session (both the user's inputs and AI's previous output).
> 
> So it can sometimes appear to be following instructions, but it's not 
> really doing that, and the more complicated the instruction, the less 
> likely the answer is to be correct.

The article below suggests that, in theory, your principle of
teaching these AIs a new task via prompts is valid. It's called
"in-context learning". However, as I understand it, you need to
teach the AI by example rather than with explanations. The teaching
process is probably still a long way from being as easy as you were
hoping for, but it is theoretically possible in the right
circumstances, and apparently sometimes easier than training a
dedicated neural network from scratch.
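
To make "teach by example" concrete, here is a rough sketch of how
a prompt made of worked examples might be put together. The
function name, the "Q:"/"A:" layout and the algebra examples are
all my own assumptions for illustration, not anything taken from
the article or from ChatGPT's documentation.

```python
def build_few_shot_prompt(examples, query):
    """Format worked (problem, answer) pairs followed by an unanswered query."""
    lines = []
    for problem, answer in examples:
        lines.append(f"Q: {problem}")
        lines.append(f"A: {answer}")
    # The final question is left open for the model to complete.
    lines.append(f"Q: {query}")
    lines.append("A:")
    return "\n".join(lines)


# Hypothetical worked examples, chosen only for illustration.
examples = [
    ("Solve for x: x + 3 = 7", "x = 4"),
    ("Solve for x: 2x = 10", "x = 5"),
]
prompt = build_few_shot_prompt(examples, "Solve for x: x - 2 = 9")
print(prompt)
```

The point is only that the prompt contains solved cases and no
explanation of the rules. Whether a given model then answers the
last question correctly is another matter.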

Solving a machine-learning mystery
 by Adam Zewe, February 7, 2023
 - https://news.mit.edu/2023/large-language-models-in-context-learning-0207
"Large language models like OpenAI's GPT-3 are massive neural 
 networks that can generate human-like text, from poetry to 
 programming code. Trained using troves of internet data, these 
 machine-learning models take a small bit of input text and then 
 predict the text that is likely to come next.

 But that's not all these models can do. Researchers are exploring a 
 curious phenomenon known as in-context learning, in which a large 
 language model learns to accomplish a task after seeing only a few 
 examples -- despite the fact that it wasn't trained for that task. 
 For instance, someone could feed the model several example 
 sentences and their sentiments (positive or negative), then prompt 
 it with a new sentence, and the model can give the correct 
 sentiment.

 Typically, a machine-learning model like GPT-3 would need to be 
 retrained with new data for this new task. During this training 
 process, the model updates its parameters as it processes new 
 information to learn the task. But with in-context learning, the 
 model's parameters aren't updated, so it seems like the model 
 learns a new task without learning anything at all.

 Scientists from MIT, Google Research, and Stanford University are 
 striving to unravel this mystery. They studied models that are very 
 similar to large language models to see how they can learn without 
 updating parameters.

 The researchers' theoretical results show that these massive neural 
 network models are capable of containing smaller, simpler linear 
 models buried inside them. The large model could then implement a 
 simple learning algorithm to train this smaller, linear model to 
 complete a new task, using only information already contained 
 within the larger model. Its parameters remain fixed." ...

Research paper (not light reading):
https://arxiv.org/pdf/2211.15661.pdf
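
As a toy illustration of the "smaller linear model trained from
the examples" idea described in the article, the sketch below fits
y = w*x + b to a handful of example pairs using plain gradient
descent. This is my own simplification, not the construction in
the paper: the function name, step count, learning rate and data
are all assumptions. The code itself plays the part of the fixed
"large model"; only the small model's w and b change.

```python
def fit_linear_in_context(examples, steps=2000, lr=0.05):
    """Fit y = w*x + b to (x, y) pairs by gradient descent on squared error."""
    w, b = 0.0, 0.0
    n = len(examples)
    for _ in range(steps):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in examples) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in examples) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Hypothetical "in-context" examples, generated from y = 2x + 1.
examples = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
w, b = fit_linear_in_context(examples)
prediction = w * 4.0 + b  # answer for a new input the examples did not cover
print(round(w, 3), round(b, 3), round(prediction, 3))
```

Nothing in the function is rewritten between tasks. Give it a
different set of examples and it fits a different small model,
which is roughly the sense in which the article says the large
model's parameters remain fixed.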

-- 
__          __
#_ < |\| |< _#

