A few weeks ago, Canada-based startup Dessa unveiled RealTalk, a deep learning system that turns text inputs into life-like speech in the style of a real person, replicating their voice convincingly enough to pass for the real thing.
Here’s one they made of Joe Rogan – can you tell whether it’s fake audio or is that really Joe Rogan talking?
Pretty funny, but also absolutely terrifying considering how easily this sort of technology could be abused.
Well, Samsung are now all in on this sort of technology too, and they are syncing audio up with actual video, so not only could you fake audio of someone, you could also make it look as though they physically said those things on camera.
Here’s one of Rasputin singing along to Beyonce’s ‘Halo’:
OK, it’s not perfect yet, but they’ll get there. If they can pull that quality off with an old photo of Rasputin, imagine what they could do with a clear photo of you and me.
Samsung researchers have been particularly adept at this. The latest example comes from a partnership between Samsung’s AI research center in the UK and Imperial College London, which created an AI that syncs an audio clip with facial movements derived from little more than a photograph of someone.
Samsung also made this video of Albert Einstein, but with audio of things he actually said:
All of that is done via the creation of what’s called a generative adversarial network, or GAN for short. “The videos generated using this model do not only produce lip movements that are synchronized with the audio but also exhibit characteristic facial expressions such as blinks, brow raises, etc.,” the researchers write in an academic paper about their work, available here. “Our improved model works on ‘in-the-wild’ unseen faces and is capable of capturing the emotion of the speaker and reflecting it in the facial expression.”
Impressive and exciting stuff, but like I said, terrifying as well. At some point this sort of technology is going to become an app that’s readily available for everyone to use. Not only will the media abuse it, so will everyday people. At worst you won’t be able to believe anything that you see and hear with your own eyes and ears. Surely someone will somehow have to regulate all of this, no?
Another step towards SKYNET, I suppose – it’s only a matter of time.