Can an AI model be taught to be good?



Imagine you have a 6-year-old and, like everyone, you want to teach your 6-year-old to be good. Then it dawns on you that the 6-year-old is, in fact, clearly a genius. By the time they are 15, they will be able to completely tear apart anything you taught them that was wrong. Whatever you teach them, they will question. So one question is: is there a set of core values you can give to a model that survives as something good even when the model can, and does, critique those values more effectively than you can? Can those values survive in the world, and survive in the model? I think there are a lot of theoretically interesting questions there.

– I mean, I guess that's the question, right? If the models are as smart as or smarter than humans, will this kind of training be effective? I think there's a long-standing fear in the AI safety community that at some point these models will start to develop their own goals, which may be at odds with human goals.

– I think that's an open question, and I'm very unsure here. On the one hand, one might think that if the 15-year-old were really smart, they would immediately realize that this is all completely made up and rubbish. On the other hand, part of me thinks, though it's not clear to me that this is true, that it's the only possible equilibrium that can be reached. It's not sufficient, but I feel it's necessary. I feel like I would be dropping the ball if I didn't try to explain to AI models what is good.


