Berkeley Talks: Improving AI Truth with ChatGPT Developer John Schulman



Read the transcript.

Follow Berkeley Talks, a Berkeley News podcast that features lectures and conversations at UC Berkeley.


Berkeley graduate John Schulman co-founded OpenAI and led the reinforcement learning team that developed ChatGPT. (UC Berkeley photo by Jim Block)

In Berkeley Talks episode 166, John Schulman, a UC Berkeley alumnus and lead developer of ChatGPT, discusses how AI language models sometimes make things up (often convincingly) and offers solutions for addressing this problem.

According to Schulman, language models exhibit pattern-completion behavior and are trained to maximize the likelihood of their responses. Part of the reason they generate untruthful content, which he calls “hallucinations,” is that the models are unaware that they are allowed to say “I don’t know” or express uncertainty. Telling the chatbot that it is allowed to do so, he says, will partially solve the problem.

“Sometimes the model is reluctant to challenge its assumptions, sometimes it is caught lying. And that means continuing to lie, so I think there’s a class of problems covered there.

“And another class of hallucinations is just guessing wrong. The model doesn’t remember, and so on. It has to guess a little bit, and sometimes it guesses wrong.”

“So is it even possible to solve this problem?” Schulman asks.

He thinks so, and says reinforcement learning is part of the solution.

Schulman’s talk on April 19 was part of a series of public talks by the world’s leading experts on artificial intelligence at Berkeley this spring.

Read a Berkeley News Q&A with Schulman, in which he discusses why he chose Berkeley for graduate school, the allure of towel-folding robots, and what he sees as the future of artificial intelligence.

Watch the video of the conversation below.

On April 19, John Schulman, a Berkeley alumnus and lead developer of ChatGPT, gave a lecture titled “Reinforcement Learning from Human Feedback: Progress and Challenges.”




