If someone offers to create an AI video recreation of your wedding, say no. This is a tough lesson I learned when I started trying to recreate my memories with Google's Gemini Veo model. What began as a fun exercise ended in disgust.
I grew up in the era before digital capture. We took photos and videos, but most were squirreled away in boxes that were dragged out only for special occasions. Things like my kids' births and their early years were captured on film and 8mm videotape.
When we got married in 1991, we didn't even have a videographer (a little cost issue), so the record of that day is entirely in analog photos.
In general, there was no social media to capture and share the amazing moments in my life. I can't point someone to Instagram, Facebook, or TikTok and say, “Don't believe me? Check this link.”
But I have a decent memory, and I wondered if I could combine it with a bit of AI magic to bring these moments back to life.
Memory Machine
I recently signed up for the three-month Google Vertex AI Studio trial, which includes access to 300 Veo credits. Veo 3 is a remarkable Gemini model that lets you generate synchronized audio and video from a single prompt.
For my test, I chose my early career and some memorable moments from my 20s in Manhattan. These are 100% true stories that happened to me, but I have no visual records of them.
In the first one, I described a young, thin, glasses-wearing man with curly hair (yes, I used to have a head of curly hair) who meets a famous, Tony Award-winning comedian on Broadway in Times Square. The comedian was Jackie Mason (please ask your grandparents), and I wanted his autograph. He stopped, but as I spoke to him, a pigeon pooped on my head, and he inexplicably began quizzing me about which TV he should buy. Mason, apparently unaware, carried on calmly.
For the prompt, I painted the scene in broad strokes, describing my business attire, the year (1989), and how Mason looked, with his curly hair and “cherubic face.” I included the little dialogue I could remember and the action of me touching my head and realizing what had happened. Then I gave Veo 3 the prompt.
A few minutes later, I received a decent recreation of the scene with the pigeon. The guy didn't look like me, and the Jackie Mason character bore only a passing resemblance to the iconic comedian in his earlier days.
Still, I was encouraged and searched my memory for another memorable moment from my 20s.
I settled on the time I tried to impress my first boss with my technical skills. His laser printer (yes, kids, those existed in the 1980s) was running low on toner, and I remembered that you can remove the cartridge from the printer and shake it to extend its life. So that's what I did, but the cartridge panel was stuck, and I ended up showering myself and the office with black toner.
In my prompt, I described a wood-paneled office around 1986 and included brief descriptions of me and of my bald, middle-aged boss sitting at his desk. The dialogue included my boss laughing and me saying, “Sorry,” and explaining as best I could.
The results this time were even better. Neither character looked like his real-world counterpart, but the printer, desk, and office were all eerily close to my memory, and the moment the toner went everywhere was well done.
If I could open my brain and show people my memories of that moment, it might look like this. It's impressive.
A union too far
Imagining a lifetime of memories reconstructed with AI, I racked my brain for another core recollection. Then it hit me: my wedding.
The fact that we don't have a wedding video has always bothered us, especially my wife. What if I could create one using AI? (I know, the foreshadowing is heavy-handed.)
It wouldn't be enough to simply describe my wedding to Veo 3 and get an AI wedding video featuring people who don't look like us. But I knew you could use source material to guide the AI. I have lots of pictures from my wedding 34 years ago. I grabbed a scanned image of my wife and me right after the ceremony, heading back up the aisle. I chose it because it not only showed us clearly but also featured some of the wedding party and guests.
I wrote this prompt hoping to get a montage of that long-ago wedding (though only 8 seconds in duration).
“We need a wedding video montage based on this wedding photo. The video should look like it was shot on HD-quality VHS tape and include 2 seconds of the ceremony, 2 seconds of everyone dancing, 2 seconds of the bride and groom feeding each other wedding cake, 2 seconds of the bouquet toss, and 2 seconds of the bride and groom leaving in a limo.”
Ambitious, I know, but I thought I could fit it all in by giving the model details about the duration of each scene.
Immediately, I hit a speed bump. My Veo 3 trial didn't let me include a source image. If you want to start with a photo, you need to drop back to Veo 2. That wasn't a big deal, though, because, as the prompt shows, there wasn't much dialogue.
It took Veo 2 a few more minutes to spit out some videos. They all start with the base image, but to put it plainly, they are very wrong.
In each video, the thread of consistency snaps almost instantly, and my wife and I transform into other people. At one point, I'm dancing with a cake. At another, I can't seem to let go of the bouquet my wife is supposed to throw. We feed each other cake and sort of dance together.
The videos are scary because they don't just look a bit right; they're also very wrong. These are worse than false memories. They are an aggressive distortion of one of the most important moments in my life. I showed my wife the videos. She looked alarmed and told me they would give her nightmares.
It was hard to argue, but I reminded her that the models would improve and so would future results. She remained unmoved and looked at me as if I had sold one of our children.
What I did is no different from what people do when they animate photographs of deceased relatives in MyHeritage. However the image begins, everything after that first millisecond is false or, even worse, a corruption of memory. If you spent time with that person when they were alive, that's the real memory. What the AI creates is guesswork, and even when it's good, it's fake. They never moved in that way at that particular moment.
In the case of my wedding memories, I realize that it's better to leave them in the gray-matter movie projector in my head.
As for the Veo 3 creations of my other memories, there's no base image to corrupt. There, the AI isn't so much recreating my memories as serving as a storytelling tool, another way to tell an interesting anecdote. That person isn't me, that guy isn't my old boss, and that's not Jackie Mason, but you get the point of the story. And for that, AI serves its purpose.
