Page 18 - Detecting deepfakes and generative AI: Report on standards for AI watermarking and multimedia authenticity workshop



                      Figure 5: Anatomy of a deepfake attack

                      Source: DeepMedia.AI

                      Li Wenyu, Director of the Intellectual Property and Innovation Development Centre at CAICT,
                      gave a presentation on performance evaluation metrics for deepfake detection models. These
                      metrics measure model output, help ensure the quality and reliability of the models, and
                      guide their optimisation and improvement so that they remain effective in real-world
                      applications.

                      Accuracy (ACC), area under the curve (AUC), and average precision (AP) are usually used for
                      assessment:

                      •    ACC is the most intuitive performance metric, reflecting the proportion of samples
                           correctly predicted by the model. It is the ratio of the number of true-positive and
                           true-negative examples to the total number of samples. It is a basic measure of how
                           good a classification model is.
                      •    AUC measures the performance of a binary classification system as a value between 0
                           and 1, with values closer to 1 indicating better model performance. It is the area
                           under the receiver operating characteristic (ROC) curve, which plots the true-positive
                           rate against the false-positive rate at different thresholds.
                      •    AP is an important performance evaluation metric in object detection. It summarises
                           precision across recall levels for one category; averaging AP over all categories
                           gives the mean AP. A higher value means that the model has better detection
                           accuracy across multiple categories.
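
The three metrics above can be illustrated with a minimal sketch. It is not taken from the presentation: the function names, the 0.5 decision threshold, the toy labels and scores (1 = deepfake, 0 = authentic), and the choice of a rank-based AUC and a ranking-based AP for a single category are all assumptions made for illustration. It also assumes that both classes are present and that no two samples share a score.

```python
from typing import Sequence


def accuracy(labels: Sequence[int], scores: Sequence[float], threshold: float = 0.5) -> float:
    """ACC = (true positives + true negatives) / all samples."""
    # A sample is predicted "fake" (1) when its score reaches the assumed threshold.
    correct = sum(int(s >= threshold) == y for y, s in zip(labels, scores))
    return correct / len(labels)


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve, computed as the probability that a random
    positive sample scores higher than a random negative one (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AP for one category: mean of the precision values measured at each
    rank where a positive sample is retrieved (scores sorted descending)."""
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    hits = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank  # precision at this rank
    return total / hits


# Hypothetical detector outputs for six samples (illustrative values only).
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]

acc = accuracy(labels, scores)
auc = roc_auc(labels, scores)
ap = average_precision(labels, scores)
print(f"ACC={acc:.3f} AUC={auc:.3f} AP={ap:.3f}")
```

On this toy data ACC depends on the chosen threshold, whereas AUC and AP depend only on how the scores rank the samples, which is one reason the three metrics are often reported together.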

                      The diversity of information forms and the complexity of content today pose a great challenge
                      for deepfake detection. Detection technologies need to be able to handle various types of data
                      and accurately identify subtle traces of a deepfake. There is a need to promote international
                      cooperation and global dialogue on technical standards for deepfake detection technology
                      based on respect for cultural diversity, transparency, safety, and security. The following areas
                      were identified as having potential for standardization in ITU:

                      i)   Standardization of active defence and traceability – for example, embedding imperceptible
                           watermarks or proof information in multimedia content for content traceability, using
                           blockchain technology to ensure the transparency and tamperproofing of the testing
                           process, and real-time monitoring and tracking using Internet of Things devices.



