4 Session 2: Current state of deepfakes and deepfake detection
technology
In this session, panellists examined the current state and evolution of deepfake detection
technology and examples of innovative technologies used for detecting deepfakes in video,
audio, images, and text. Discussions focused on the application and performance of detection
technologies, trends in different regions, and their potential to support compliance with
relevant policies and regulations planned or already in place.
Moderator: Sam Gregory, Director, WITNESS
Speakers
• Peter Eisert, Professor, Visual Computing, Humboldt University Berlin, Fraunhofer Heinrich
Hertz Institute.
• Touradj Ebrahimi, Professor at EPFL, Executive Chairman of RayShaper SA, and Chair of
the Joint Photographic Experts Group (JPEG).
• Emma Brown, Co-Founder, DeepMedia, and Rijul Gupta, Co-Founder and CEO,
DeepMedia.
• Li Wenyu, Director of Intellectual Property and Innovation Development Center, China
Academy of Information and Communications Technology (CAICT).
• Jonghyun Woo, CEO of DualAuth and President of the Passwordless Alliance.
• Wang Ce, Project Manager, China Mobile Research Institute.
Deepfake generation techniques are becoming more sophisticated, making it increasingly
difficult to distinguish between real and fake content. Deepfake detection technologies must
therefore be continuously updated and upgraded to maintain accuracy: deepfake generation
and detection will remain a long-term, dynamic contest.
In his presentation, Peter Eisert from the Fraunhofer Heinrich Hertz Institute discussed the
different methods for deepfake generation and detection.
Multiple methods for generating deepfakes by manipulating material such as photos, video,
and audio are accessible to the public, including FaceSwap, Jigger, Deepfake Studio, and
others. They allow anybody to modify faces in video sequences quickly and simply, producing
realistic results with little to no effort. In addition, easy access to large-scale public databases
and rapid advances in deep learning techniques, notably generative adversarial networks
(GANs), have resulted in highly realistic fake content.
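To make the adversarial setup concrete, the following is a minimal sketch of a GAN training
loop in Python (PyTorch). It is illustrative only: the network sizes are toy values and the
"real" data is random noise standing in for genuine images; it is not any system presented
at the workshop.

# Minimal GAN sketch (PyTorch). Toy network sizes and random
# stand-in data; illustrative only, not a workshop system.
import torch
import torch.nn as nn

latent_dim, img_dim = 64, 28 * 28  # assumed toy dimensions

generator = nn.Sequential(
    nn.Linear(latent_dim, 256), nn.ReLU(),
    nn.Linear(256, img_dim), nn.Tanh(),
)
discriminator = nn.Sequential(
    nn.Linear(img_dim, 256), nn.LeakyReLU(0.2),
    nn.Linear(256, 1),  # single logit: real vs generated
)

opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

for step in range(100):  # toy loop; real training needs real images
    real = torch.rand(32, img_dim) * 2 - 1   # stand-in for real images
    noise = torch.randn(32, latent_dim)
    fake = generator(noise)

    # Discriminator: learn to separate real from generated samples.
    loss_d = bce(discriminator(real), torch.ones(32, 1)) + \
             bce(discriminator(fake.detach()), torch.zeros(32, 1))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    # Generator: learn to fool the discriminator into scoring fakes as real.
    loss_g = bce(discriminator(fake), torch.ones(32, 1))
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()

It is this competition, with the generator improving until the discriminator can no longer
separate its output from real data, that drives generated content toward realism.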
Deepfake detection is typically framed as a binary classification problem in which classifiers
are trained to distinguish between authentic and manipulated media (videos or photos). This
approach requires a large dataset of real and fake videos or photos to train the classification
models.
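As a concrete illustration of this framing, the sketch below trains a binary classifier in
Python (PyTorch). The tiny network and the random tensors standing in for labelled real and
fake frames are placeholder assumptions; an actual detector would be trained on a large
curated corpus of authentic and manipulated media.

# Deepfake detection as binary classification (PyTorch sketch).
# Random tensors stand in for labelled real/fake frames.
import torch
import torch.nn as nn

model = nn.Sequential(          # tiny stand-in for a CNN backbone
    nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(16, 1),           # single logit: fake (1) vs real (0)
)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.BCEWithLogitsLoss()

for step in range(100):
    frames = torch.randn(8, 3, 224, 224)          # stand-in frames
    labels = torch.randint(0, 2, (8, 1)).float()  # 1 = fake, 0 = real
    loss = loss_fn(model(frames), labels)
    opt.zero_grad(); loss.backward(); opt.step()

# At inference, threshold the sigmoid of the logit to get a decision.
prob_fake = torch.sigmoid(model(torch.randn(1, 3, 224, 224)))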
Blind deepfake detection relies on deficiencies or inconsistencies in synthetic images. These
detection tools rely on cues such as the following (a rough sketch of one such cue appears
after the list):
i) detectable artifacts (e.g., blending, blurring);
ii) inconsistent noise patterns;
iii) temporal inconsistencies (e.g., appearance, geometry, pose, motion);
iv) semantic inconsistencies (e.g., eye blinking, illumination).
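As an example of cue (ii), the following sketch compares high-frequency noise energy between
a face crop and the background: manipulated faces are often resampled or blended, which can
suppress or alter sensor noise in the edited region. The region split, threshold values, and
function names here are illustrative assumptions, not a production method from any speaker.

# Hedged sketch of one "blind" cue: inconsistent high-frequency
# noise between a face region and the background. Thresholds and
# region splits are illustrative assumptions only.
import numpy as np

def highfreq_energy(patch: np.ndarray) -> float:
    """Mean energy of the high-frequency band of a grayscale patch."""
    spectrum = np.abs(np.fft.fftshift(np.fft.fft2(patch)))
    h, w = spectrum.shape
    cy, cx = h // 2, w // 2
    mask = np.ones_like(spectrum, dtype=bool)
    # Exclude the low-frequency centre of the shifted spectrum.
    mask[cy - h // 8: cy + h // 8, cx - w // 8: cx + w // 8] = False
    return float(spectrum[mask].mean())

def noise_inconsistency(face: np.ndarray, background: np.ndarray) -> float:
    """Ratio of high-frequency energy; values far from 1 are suspicious."""
    return highfreq_energy(face) / (highfreq_energy(background) + 1e-8)

# Toy usage with random patches standing in for cropped image regions:
face = np.random.rand(64, 64)
bg = np.random.rand(64, 64)
score = noise_inconsistency(face, bg)
print("suspicious" if score < 0.5 or score > 2.0 else "plausibly consistent")

In practice such hand-crafted cues are combined, and they remain fragile: as generators improve,
individual artifacts disappear, which is why blind detection must be continually retuned.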