This AI model learns language by watching videos. Here's how it works:


According to the researchers, their AI model, DenseAV, learns word meanings and sound locations just by watching videos, without any human input or text.

In their paper, researchers from MIT, Microsoft, Oxford and Google explained that DenseAV achieves this using only self-supervision from video.




