Every time a realistic-looking deepfake video makes the rounds (and lately, it feels like there is always one), there are warnings that the technology has advanced to the point that these videos generated by artificial intelligence will be used in disinformation and other attacks.
Typically, deepfake videos are generated by putting a person’s face onto the body of someone else, and the facial movements are manipulated to fit the original video using artificial intelligence. The technology isn’t sophisticated enough yet that people can’t tell the generated videos aren’t real, but the technology is improving rapidly, creating more opportunities for malicious actors to co-opt these applications for their own purposes, said Dr. Mark S. Sherman, technical director, cybersecurity foundations, CERT division, at Carnegie Mellon University Software Engineering Institute.
The risk isn’t hypothetical. Back in 2019, an executive at a United Kingdom-based energy company received a phone call from his boss in Germany instructing him to wire €200,000 to a Hungarian supplier within the hour. The call had actually been deepfake audio, insurance company Euler Hermes Group SA said. The fake audio had imitated the boss’s voice, tonality, punctuation, and even the German accent.
While it was bad that the company lost money, the damage wasn’t catastrophic. And that is what Sherman worries about. Currently, generating deepfake videos requires a good deal of technical expertise, time, processing power, and data, so it is still out of reach of the average user. Typically, transferring a person’s face onto the video of another person involves collecting thousands of pictures of both people, encoding the images using a deep learning neural network, and calculating features. Transferring the face of a person onto a video of another person could easily wind up involving 175 million parameters and millions of updates, Sherman said.
To speed up this process, AI researchers have been exploring shortcuts such as transfer learning, which takes a neural network trained on one type of data and applies it to a different dataset. This means the neural network needs to learn only a subset of pixels, making the work less time-consuming and resource-intensive. Transfer learning is a common technique in image recognition, so it is logical that it can be used to create deepfakes, Sherman said.
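To make the idea of transfer learning concrete, here is a minimal sketch in plain Python. It is a toy illustration, not how any deepfake tool works: the function `pretrained_features` is a hypothetical stand-in for a network that was already trained on other data, and its weights are made up for the example. The point is only that the reused part stays frozen while a much smaller part is trained on the new data.

```python
def pretrained_features(x):
    """Stand-in for a frozen, already-trained network (hypothetical weights).

    These weights are assumed to come from training on a different dataset
    and are never updated below.
    """
    return [0.5 * x[0] + 0.5 * x[1], x[0] - x[1]]


def predict(w, b, x):
    """Apply the small trainable layer on top of the frozen features."""
    f = pretrained_features(x)
    return w[0] * f[0] + w[1] * f[1] + b


def train_head(samples, labels, epochs=500, lr=0.1):
    """Fit only the small linear layer (3 numbers) with gradient descent."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            f = pretrained_features(x)
            err = predict(w, b, x) - y
            # Only w and b change; the feature extractor stays frozen.
            w[0] -= lr * err * f[0]
            w[1] -= lr * err * f[1]
            b -= lr * err
    return w, b


if __name__ == "__main__":
    # Toy "new dataset": the target is simply x0 - x1.
    samples = [[0, 0], [1, 0], [0, 1], [1, 1]]
    labels = [0, 1, -1, 0]
    w, b = train_head(samples, labels)
    print(w, b, predict(w, b, [2, 1]))
```

In a real system the frozen part would hold millions of parameters and the retrained part would be a small fraction of that, which is where the savings in time and computing power come from.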
While it’s true that technology is advancing rapidly, it is still not at the point where people can no longer tell real videos apart from fake ones. There is time for enterprises to make plans on how to proceed to protect against deepfakes.