Fooled by Faces: Protect Yourself from AI Face-Swapping Scams

GoPlus Security
7 min read · Jun 14, 2024


1. What Is “AI Face Swapping”?

AI face swapping typically refers to the use of deepfake technology, a deep-learning technique, to replace one person’s face with another’s. The technology analyzes the facial features of a target individual and applies them to another video or image, producing seemingly realistic fake videos or pictures.

While face-swapping technology has its place in entertainment and film production, it also poses significant risks. Specifically, Deepfakes have garnered widespread attention for their potential use in creating child sexual abuse material, celebrity pornographic videos, revenge porn, fake news, hoaxes, bullying, and financial fraud.

2. Development of “AI Face Swapping”

In 2017, Supasorn Suwajanakorn and colleagues at the University of Washington introduced the “Synthesizing Obama” project [1], presented at SIGGRAPH 2017. In simple terms, the technique altered video clips of Barack Obama so that he appeared to be saying words from a new audio track unrelated to the original video. It combined recurrent neural networks (RNNs) with mouth-shape synthesis, achieving a remarkably convincing result in which the audio and the mouth movements in the video matched seamlessly.

An overview of the Synthesizing Obama technique
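To make the idea concrete, audio-driven mouth synthesis of this kind can be caricatured as a sequence model that maps per-frame audio features to mouth shapes. The sketch below is purely illustrative and is not the published system (which adds texture synthesis and compositing stages); the MFCC dimension, landmark count, and LSTM layout are assumptions chosen for the example.

```python
import torch
import torch.nn as nn

class AudioToMouth(nn.Module):
    """Illustrative audio-to-mouth-shape model: an LSTM maps a sequence of
    audio features (e.g., 13 MFCC coefficients per frame) to 2D mouth
    landmark coordinates for each video frame."""

    def __init__(self, audio_dim=13, hidden_dim=128, num_landmarks=18):
        super().__init__()
        self.lstm = nn.LSTM(audio_dim, hidden_dim, num_layers=2, batch_first=True)
        self.head = nn.Linear(hidden_dim, num_landmarks * 2)  # (x, y) per landmark

    def forward(self, audio_feats):              # (batch, time, audio_dim)
        hidden, _ = self.lstm(audio_feats)       # (batch, time, hidden_dim)
        coords = self.head(hidden)               # (batch, time, num_landmarks * 2)
        return coords.reshape(coords.shape[0], coords.shape[1], -1, 2)

# Example: 2 seconds of audio sampled at 100 feature frames per second.
model = AudioToMouth()
mouth_shapes = model(torch.randn(1, 200, 13))    # -> (1, 200, 18, 2)
```

Training a model along these lines on many hours of footage of a single speaker, then rendering the predicted mouth region back into real video frames, is the basic recipe behind audio-driven reenactment.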

Another similar project is Face2Face [2], proposed by Professor Justus Thies, currently at TU Darmstadt in Germany. Face2Face, published at CVPR 2016, modifies facial video clips of a target person (destination) to mimic, in real time, the facial expressions of another person (source). On the strength of this work, Justus Thies was received by German Chancellor Angela Merkel in 2019 to discuss the risks and challenges of media manipulation [3].

The arrow identifies Justus Thies.

These projects show that the deep-learning foundations of deepfake technology were already in place. So where did the term “deepfake” actually originate?

The answer is: Reddit!

Indeed, the term “deepfakes” emerged around late 2017, coined by a Reddit user named “deepfakes.” He created the r/deepfakes subreddit, where users shared the deepfake videos they had made; the most popular were swaps of celebrities’ faces onto the bodies of actors in pornographic videos [4].

Around the same time, deepfake videos and communities sprang up like bamboo shoots after the rain. People shared their own videos on forums, and many visual-effects professionals layered complex post-production effects and super-resolution techniques onto deepfake videos, achieving genuinely photo-realistic results. These videos spread widely on user-generated-content platforms such as YouTube; according to research statistics, deepfake-related videos on YouTube, Instagram, and TikTok have accumulated more than 1 billion views in total.

Behind these videos, the most successful open-source deepfake software cannot be overlooked: the DeepFaceLab series [5,6], developed by Ivan Perov, Daiheng Gao (who is also the AI advisor for GoPlus, focusing on PoH forgery prevention), and others. The series has accumulated more than 66,000 stars on GitHub, placing it in the top 0.0001% of GitHub repositories, and was named one of the top 10 AI open-source projects of 2020 alongside TensorFlow and PyTorch.

DeepFaceLab schematic (early 2020)

The key advantage of open-source deepfake tools such as DeepFaceLab and FaceSwap over academic techniques is their simplicity and user-friendliness: no technical background is required to use them.

Unlike typical GitHub projects, DeepFaceLab ships one-click batch scripts for Windows and Linux, allowing users to train and run face-swap models with essentially no setup. This dramatically lowers the barrier to creating deepfake videos and has been key to their proliferation.
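Under the hood, tools in this family descend from the classic shared-encoder, per-identity-decoder autoencoder recipe: one encoder is trained together with a decoder for person A and a decoder for person B on a reconstruction objective, and the swap is performed by decoding A’s encoded face with B’s decoder. The sketch below is a minimal, illustrative version of that idea rather than DeepFaceLab’s actual models (its SAEHD and related architectures are far more sophisticated); the layer sizes and the 64×64 input resolution are assumptions.

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Shared encoder: compresses a 64x64 face crop into a latent code."""
    def __init__(self, latent_dim=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.1),    # 64 -> 32
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.LeakyReLU(0.1),  # 32 -> 16
            nn.Conv2d(128, 256, 4, stride=2, padding=1), nn.LeakyReLU(0.1), # 16 -> 8
            nn.Flatten(),
            nn.Linear(256 * 8 * 8, latent_dim),
        )

    def forward(self, x):
        return self.net(x)

class Decoder(nn.Module):
    """Per-identity decoder: reconstructs a face crop from the latent code."""
    def __init__(self, latent_dim=256):
        super().__init__()
        self.fc = nn.Linear(latent_dim, 256 * 8 * 8)
        self.net = nn.Sequential(
            nn.ConvTranspose2d(256, 128, 4, stride=2, padding=1), nn.LeakyReLU(0.1),
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.1),
            nn.ConvTranspose2d(64, 3, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, z):
        h = self.fc(z).view(-1, 256, 8, 8)
        return self.net(h)

encoder = Encoder()
decoder_a, decoder_b = Decoder(), Decoder()
l1 = nn.L1Loss()

def training_step(faces_a, faces_b):
    """Each decoder only ever learns to reconstruct its own identity."""
    loss_a = l1(decoder_a(encoder(faces_a)), faces_a)
    loss_b = l1(decoder_b(encoder(faces_b)), faces_b)
    return loss_a + loss_b

def swap_a_to_b(faces_a):
    """Inference-time swap: encode person A's face, decode with B's decoder."""
    return decoder_b(encoder(faces_a))
```

Because the shared encoder is forced to capture identity-agnostic structure (pose, expression, lighting) while each decoder memorizes one identity’s appearance, decoding A’s code with B’s decoder yields B’s face wearing A’s expression. Packaging this pipeline behind one-click scripts is what put it within reach of non-experts.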

3. The Harms of “AI Face Swapping”

Case in Traditional Finance

  • In the traditional finance domain, face-swapping technology has been used to commit fraud. According to reports [7,8], criminals use illegally obtained personal information and computer algorithms to imitate the faces and voices of victims’ relatives, colleagues, or public officials, and then impersonate them. After gaining the victims’ trust, the criminals follow pre-prepared scripts to push bank-card transfers, fake investment and wealth-management products, and rebate scams, and further lower the victims’ guard through video calls and a barrage of voice messages. Victims often cannot detect the deception in time and end up transferring funds, after which the criminals disappear without a trace.

China’s financial regulatory department warns of the need to be vigilant about “AI face-swapping” fraud.

Case in Web3

  • The recent surge of attention to “AI face swapping” began with a case in which a user’s digital currency on OKX was stolen after AI face swapping was used to change the account’s password and keys, resulting in a loss of over $3 million [9]. The news broke on June 3, 2024 and has raised widespread concern among Web3 users about the security of facial-recognition authentication.
  • Previously, Web3 users paid little attention to the facial-verification step in Proof-of-Human checks, since it was rarely invoked. This large-scale theft enabled by face swapping, however, has led many to question the reliability of the KYC and facial-verification technologies used by major exchanges such as OKX and Binance.
  • It also raises the question of whether a fair and impartial third-party institution exists to verify and assess the security of these exchanges.

A user suffers losses of digital currency due to AI face-swapping fraud.

4. New Technological Developments in 2024 and Warnings to Users

  • In 2024, shortly after OpenAI’s Sora appeared, EMO [10] (from Daiheng Gao’s team at Alibaba DAMO Academy) emerged, signaling that long-duration, audio-driven face animation based on diffusion models had matured.

This means that malicious actors can now create high-quality forged face videos from a single photograph, good enough to fool the verification channels of exchanges.

Furthermore, with advances in voice-cloning technology (Bert-VITS2 [11]), criminals can deliberately collect a user’s audio and use such models to reproduce the spoken phrases required for exchange verification.

5. GoPlus in Action

Since establishing a partnership with Daiheng Gao in March 2024, GoPlus has continuously supported Daiheng’s research in the field of video face forgery and has funded related laboratories (the USTC Cybersecurity Research Institute, Stanford, and others).

In the image domain, Daiheng previously took second place in the Deepfake Detection Challenge hosted by Meta in 2019. In the video domain, deepfake detection currently deserves even greater attention.
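As a rough illustration of what frame-level detection can look like (this is not GoPlus’s production system), a common baseline is to fine-tune a pretrained image backbone as a binary real/fake classifier on face crops and aggregate the per-frame scores over a video. The sketch below assumes PyTorch, torchvision, and a standard ResNet-18 backbone; the aggregation rule and threshold are placeholder choices.

```python
import torch
import torch.nn as nn
from torchvision import models

class FrameDeepfakeDetector(nn.Module):
    """Binary real/fake classifier over individual face crops."""

    def __init__(self):
        super().__init__()
        # ImageNet-pretrained backbone; the classification head is replaced
        # with a single logit for the "fake" class.
        self.backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
        self.backbone.fc = nn.Linear(self.backbone.fc.in_features, 1)

    def forward(self, x):            # x: (num_frames, 3, 224, 224) face crops
        return self.backbone(x)      # raw logits; apply sigmoid for probabilities

def video_is_fake(model, frames, threshold=0.5):
    """Simple video-level decision: average per-frame fake probabilities."""
    model.eval()
    with torch.no_grad():
        probs = torch.sigmoid(model(frames)).squeeze(1)   # (num_frames,)
    return probs.mean().item() > threshold
```

In practice, detectors of this kind are trained on labeled datasets of real and manipulated faces and are combined with other signals (liveness checks, challenge-response prompts, audio analysis), since a single frame-level classifier is easy for newer generators to evade.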

User asset security is a lifeline. GoPlus is willing to work together with users to provide state-of-the-art solutions for exchanges, KYC, and other scenarios, using the most advanced AI technologies to minimize risk as much as possible.

References:

[1] Suwajanakorn, Supasorn, Steven M. Seitz, and Ira Kemelmacher-Shlizerman. “Synthesizing Obama: Learning Lip Sync from Audio.” ACM Transactions on Graphics (ToG) 36.4 (2017): 1–13.

[2] Thies, Justus, et al. “Face2Face: Real-Time Face Capture and Reenactment of RGB Videos.” Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016.

[3] Cabinet Meeting: Synthetic Media — Danger or Opportunity?

[4] Cole, Samantha (24 January 2018). “We Are Truly Fucked: Everyone Is Making AI-Generated Fake Porn Now.” Vice. Archived from the original on 7 September 2019. Retrieved 4 May 2019.

[5] DeepFaceLab, https://github.com/iperov/DeepFaceLab

[6] DeepFaceLive, https://github.com/iperov/DeepFaceLive

[7] How to Guard Against “AI Face-Swapping” Fraud: Warnings from Financial Regulators (in Chinese), https://www.gov.cn/lianbo/bumen/202310/content_6907773.htm

[8] In Depth: You Can Know a Face but Not the Heart Behind It. Beware of “AI Face-Swapping” Financial Fraud (in Chinese), https://m.21jingji.com/article/20231020/herald/1f23a5a3959ab7cc544e0c0d93ed6886.html

[9] AI Face Swapping Bypasses OKX’s Review System; User Loses $3 Million (in Chinese), https://x.com/BroLeonAus/status/1797553316404371967

[10] EMO: Emote Portrait Alive — Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions, https://humanaigc.github.io/emote-portrait-alive/

[11] Bert-VITS2, https://github.com/fishaudio/Bert-VITS2
