AI system identifies fake videos beyond face swaps and altered speech

Overview: Existing deepfake detection methods focus primarily on identifying face-manipulated videos. Most of them cannot perform inference unless a face is detected in the video. Credit: *arXiv* (2024). DOI: 10.48550/arxiv.2412.12278

In an age when manipulated videos can spread disinformation, bully people, and cause harm, researchers at UC Riverside have created a powerful new system to expose these fakes.

Amit Roy-Chowdhury, a professor of electrical and computer engineering, and Rohit Kundu, a doctoral candidate, both from UCR's Marlan and Rosemary Bourns College of Engineering, teamed up with Google scientists to develop an artificial intelligence model that detects video tampering even when the manipulations go far beyond face swaps and altered speech. The paper is published on the arXiv preprint server.

Roy-Chowdhury is also co-director of the UC Riverside Artificial Intelligence Research and Education (RAISE) Institute, a new interdisciplinary research center at UCR.

The new system, called the Universal Network for Identifying Tampered and synthEtic videos (UNITE), detects forgeries by examining not just faces but full video frames, including backgrounds and motion patterns. This analysis makes it one of the first tools capable of identifying synthetic or doctored videos without relying on facial content.

“Deepfakes have evolved,” Kundu said. “They're not just about face swaps anymore. People are now creating entirely fake videos, from faces to backgrounds, using powerful generative models. Our system is built to catch all of that.”

UNITE's development comes as text-to-video and image-to-video generation tools have become widely available online. These AI platforms allow almost anyone to produce highly convincing videos, posing serious risks to individuals, institutions, and democracy itself.

“It's scary how accessible these tools have become,” Kundu said. “Anyone with moderate skills can bypass safety filters and generate realistic videos of public figures saying things they never said.”

Kundu explained that earlier deepfake detectors focused almost entirely on facial cues.

“If there's no face in the frame, many detectors simply don't work,” he said. “But disinformation can come in many forms. Altering a scene's background can distort the truth just as easily.”

To address this, UNITE uses a transformer-based deep learning model to analyze video clips. It detects subtle spatial and temporal inconsistencies, cues that previous systems often missed. The model builds on a foundational AI framework known as SigLIP, which extracts features that are not bound to a particular person or object.

A new training method called “attention-diversity loss” prompts the system to monitor multiple visual regions in each frame, preventing it from focusing solely on faces.
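The idea of discouraging a model from fixing all of its attention on one region can be illustrated with a small sketch. This is not the formulation used in the paper, which the article does not describe. It is a hypothetical penalty, assumed here for illustration, that measures how much the attention maps of different heads overlap. The function names, the four-region frame, and the use of cosine similarity are all assumptions.

```python
import math
from itertools import combinations


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors of attention weights."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def attention_overlap_penalty(head_maps):
    """Mean pairwise cosine similarity between attention heads.

    head_maps holds one list of attention weights per head, all over the
    same frame regions. A value near 1 means every head looks at the same
    place; a value near 0 means the heads cover different regions.
    Illustrative only: an assumed stand-in, not the paper's loss.
    """
    pairs = list(combinations(head_maps, 2))
    if not pairs:
        return 0.0
    return sum(cosine_similarity(a, b) for a, b in pairs) / len(pairs)


# Hypothetical frame split into four regions: [face, background, hands, sky].
all_on_face = [[0.97, 0.01, 0.01, 0.01], [0.97, 0.01, 0.01, 0.01]]
spread_out = [[0.97, 0.01, 0.01, 0.01], [0.01, 0.97, 0.01, 0.01]]

collapsed_penalty = attention_overlap_penalty(all_on_face)
diverse_penalty = attention_overlap_penalty(spread_out)
print(collapsed_penalty > diverse_penalty)
```

Adding a term like this to a training objective would, under these assumptions, penalize the case where both heads stare at the face and reward the case where one of them also inspects the background.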

The result is a universal detector that can flag a range of forgeries, from simple face swaps to complex, fully synthetic videos generated without any real footage.

“It's one model that handles all these scenarios,” Kundu said. “That's what makes it universal.”

The researchers presented their findings at the 2025 Conference on Computer Vision and Pattern Recognition (CVPR) in Nashville, Tennessee.

Co-authors include Google researchers Hao Xiong, Vishal Mohanty and Athula Balachandra.

Kundu's collaboration with Google provided access to the expansive datasets and computing resources needed to train the model on a wide range of synthetic content, including videos generated from text and still images.

Although still in development, UNITE could soon play an important role in defending against video disinformation. Potential users include social media platforms, fact-checkers, and newsrooms working to prevent manipulated videos from going viral.

“People deserve to know whether what they're seeing is real,” Kundu said. “And as AI gets better at faking reality, we have to get better at revealing the truth.”

More information:
Rohit Kundu et al., Towards a Universal Synthetic Video Detector: From Face or Background Manipulations to Fully AI-Generated Content, arXiv (2024). DOI: 10.48550/arxiv.2412.12278

Journal information:
arXiv

Provided by the University of California – Riverside

Citation: AI system identifies fake videos beyond face swaps and altered speech (July 25, 2025), retrieved from https://techxplore.com/news/2025-07-scientists-tool-fake-videos.html

This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no part may be reproduced without written permission. The content is provided for information purposes only.
