Researchers from TikTok owner ByteDance have demoed a new AI system, OmniHuman-1, that can generate perhaps the most realistic deepfake videos to date.
Deepfake-generating AI is a commodity. There's no shortage of apps that can insert someone into a photo or make a person appear to say something they didn't actually say. But most deepfakes, and video deepfakes in particular, fail to clear the uncanny valley: there's usually some tell, some obvious sign that AI was involved somewhere.
Not so with OmniHuman-1, at least judging by the cherry-picked samples the ByteDance team released.
Here’s a fictional Taylor Swift performance:
Here’s a TED Talk that never took place:
And here’s a deepfaked Einstein lecture:
According to the ByteDance researchers, OmniHuman-1 needs only a single reference image and an audio track, such as speech or vocals, to generate a video. The output's aspect ratio is adjustable, as is the subject's "body proportion," i.e., how much of their body appears in the fake clip.
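ByteDance hasn't released code, weights, or an API for OmniHuman-1, but the inputs the researchers describe map to a simple interface. Here's a purely hypothetical sketch of what driving such a model might look like; the `omnihuman` package and every name and parameter in it are invented for illustration:

```python
# Purely hypothetical sketch: ByteDance has not released OmniHuman-1,
# so the "omnihuman" package, class, and every parameter below are
# invented to illustrate the inputs the researchers describe.
from omnihuman import OmniHuman  # hypothetical package

model = OmniHuman.from_pretrained("omnihuman-1")  # hypothetical checkpoint

video = model.generate(
    reference_image="speaker.png",  # the single reference photo
    audio="speech.wav",             # driving audio: speech or vocals
    aspect_ratio=(9, 16),           # output aspect ratio is adjustable
    body_proportion="half_body",    # how much of the subject is shown
)
video.save("deepfake.mp4")
```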
OmniHuman-1 can also edit existing videos, even modifying the movements of a person’s limbs. It’s truly astonishing how convincing the result can be:
Granted, OmniHuman-1 isn’t perfect. The ByteDance team says that “low-quality” reference images won’t yield the best videos, and the system seems to struggle with certain poses. Note the weird gestures with the wine glass in this video:
Still, OmniHuman-1 is easily head and shoulders above previous deepfake techniques, and it may well be a sign of things to come. While ByteDance hasn't released the system, it rarely takes the AI community long to replicate models like these.
The implications are worrisome.
Last year, political deepfakes spread like wildfire around the globe. On election day in Taiwan, a Chinese Communist Party-affiliated group posted AI-generated, misleading audio of a politician throwing his support behind a pro-China candidate. In Moldova, deepfake videos depicted the country’s president, Maia Sandu, resigning. And in South Africa, a deepfake of rapper Eminem supporting a South African opposition party circulated ahead of the country’s election.
Deepfakes are also increasingly being used to carry out financial crimes. Consumers are being duped by deepfakes of celebrities offering fraudulent investment opportunities, while corporations are being swindled out of millions by deepfake impersonators. According to Deloitte, AI-generated content contributed to more than $12 billion in fraud losses in 2023, a figure that could reach $40 billion in the U.S. by 2027.
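Those two endpoints imply a steep growth curve. A quick back-of-the-envelope check (the 2023 and 2027 figures are Deloitte's; the arithmetic below is ours):

```python
# Back-of-the-envelope check on the Deloitte figures cited above:
# growing from $12B (2023) to $40B (2027) implies this compound
# annual growth rate over the four intervening years.
start, end, years = 12e9, 40e9, 4
cagr = (end / start) ** (1 / years) - 1
print(f"Implied fraud-loss growth: {cagr:.1%} per year")  # ~35.1% per year
```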
Last February, hundreds in the AI community signed an open letter calling for strict deepfake regulation. In the absence of a law criminalizing deepfakes at the federal level in the U.S., more than 10 states have enacted statutes against AI-aided impersonation. California’s law — currently stalled — would be the first to empower judges to order the posters of deepfakes to take them down or potentially face monetary penalties.
Unfortunately, deepfakes are hard to detect. While some social networks and search engines have taken steps to limit their spread, the volume of deepfake content online continues to grow at an alarming rate.
In a May 2024 survey from ID verification firm Jumio, 60% of people said they encountered a deepfake in the past year. Seventy-two percent of respondents said they worry on a day-to-day basis about being fooled by a deepfake, while a majority supported legislation to address the proliferation of AI-generated fakes.
Kyle Wiggers is a senior reporter at TechCrunch with a special interest in artificial intelligence. His writing has appeared in VentureBeat and Digital Trends, as well as a range of gadget blogs including Android Police, Android Authority, Droid-Life, and XDA-Developers. He lives in Brooklyn with his partner, a piano educator, and dabbles in piano himself occasionally, if mostly unsuccessfully.