Article

A defensive framework for deepfake detection under adversarial settings using temporal and spatial features

Publisher

SPRINGER
DOI: 10.1007/s10207-023-00695-x

Keywords

Deepfakes; Attention mechanism; Optical flow; Adversarial machine learning; Cross-dataset generalization

Abstract

Advances in artificial intelligence have produced numerous image manipulation and processing tools and, with them, a surge of activity in digital forensics. Hackers and cybercriminals use these techniques to create counterfeit images and videos by introducing perturbations into facial traits. We propose a novel defensive framework that employs temporally and spatially aware features to identify deepfakes efficiently. The framework trains a self-attention VGG16 neural model on the facial landmarks in a video to obtain spatial attributes, then generates optical flow feature vectors from the spatial vectors to capture temporal characteristics. Because deepfake detection systems must also generalize across datasets, we built a custom dataset comprising samples from FaceForensics, Celeb-DF, and YouTube videos. Experimental analysis shows that the system achieves a detection accuracy of 98.4%. We evaluate the robustness of the proposed framework under various adversarial settings using the Adversarial Robustness Toolbox, Foolbox, and CleverHans; under these diverse holistic conditions the method classifies real and fake videos with an accuracy of 74.27%. We also conduct an extensive empirical investigation of the framework's cross-dataset generalization capacity.
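The abstract sketches a two-stream design: spatial features from an attention-augmented VGG16 over facial landmarks, fused with temporal features derived from optical flow. The following is a minimal sketch of that idea in TensorFlow/Keras and OpenCV; the single-head attention layer, the flow statistics used as the temporal descriptor, the fusion by concatenation, and all hyperparameters are illustrative assumptions, not the authors' exact architecture.

```python
# Minimal two-stream sketch, assuming TensorFlow/Keras and OpenCV.
# The attention block, flow statistics, and fusion strategy are assumptions;
# the paper's exact architecture is not reproduced here.
import cv2
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, Model
from tensorflow.keras.applications import VGG16

def spatial_stream(input_shape=(224, 224, 3)):
    """VGG16 backbone with a simple self-attention block over its feature map."""
    # weights=None keeps the sketch runnable offline; in practice the
    # backbone would be pretrained and fine-tuned on facial-landmark crops.
    base = VGG16(include_top=False, weights=None, input_shape=input_shape)
    feat = base.output                                # (7, 7, 512) feature map
    seq = layers.Reshape((49, 512))(feat)             # 49 spatial positions
    att = layers.Attention()([seq, seq])              # query = key = value
    vec = layers.GlobalAveragePooling1D()(att)        # 512-d spatial vector
    return Model(base.input, vec, name="spatial_stream")

def optical_flow_features(prev_frame, next_frame):
    """Dense Farneback optical flow summarized as a fixed-length vector."""
    g1 = cv2.cvtColor(prev_frame, cv2.COLOR_BGR2GRAY)
    g2 = cv2.cvtColor(next_frame, cv2.COLOR_BGR2GRAY)
    flow = cv2.calcOpticalFlowFarneback(g1, g2, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    mag, ang = cv2.cartToPolar(flow[..., 0], flow[..., 1])
    # Coarse motion statistics stand in for the temporal descriptor.
    return np.array([mag.mean(), mag.std(), ang.mean(), ang.std()],
                    dtype=np.float32)

# Fuse the two streams and classify real vs. fake frames.
spatial = spatial_stream()
temporal_in = layers.Input(shape=(4,), name="flow_stats")
merged = layers.Concatenate()([spatial.output, temporal_in])
out = layers.Dense(1, activation="sigmoid")(merged)
detector = Model([spatial.input, temporal_in], out)
detector.compile(optimizer="adam", loss="binary_crossentropy",
                 metrics=["accuracy"])
```

The reported robustness figures come from attacking the detector with standard adversarial tooling. The sketch below shows one such check with the Adversarial Robustness Toolbox (ART), running FGSM against a stand-in two-class classifier; the toy model, random data, and eps value are placeholders, and the abstract does not specify which attacks were run with each toolkit.

```python
# Hedged sketch of an adversarial robustness check with the Adversarial
# Robustness Toolbox (ART). The toy model, random data, and eps value are
# placeholders, not the authors' evaluation setup.
import numpy as np
import tensorflow as tf
from art.estimators.classification import TensorFlowV2Classifier
from art.attacks.evasion import FastGradientMethod

# Stand-in two-class frame classifier (real vs. fake).
toy = tf.keras.Sequential([
    tf.keras.Input(shape=(224, 224, 3)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(2, activation="softmax"),
])
loss = tf.keras.losses.CategoricalCrossentropy()
clf = TensorFlowV2Classifier(model=toy, nb_classes=2,
                             input_shape=(224, 224, 3), loss_object=loss)

x = np.random.rand(8, 224, 224, 3).astype(np.float32)   # placeholder frames
y = tf.keras.utils.to_categorical(np.random.randint(0, 2, 8), 2)

attack = FastGradientMethod(estimator=clf, eps=0.03)     # FGSM perturbation
x_adv = attack.generate(x=x)
clean_acc = (clf.predict(x).argmax(1) == y.argmax(1)).mean()
adv_acc = (clf.predict(x_adv).argmax(1) == y.argmax(1)).mean()
print(f"clean accuracy: {clean_acc:.2f}, adversarial accuracy: {adv_acc:.2f}")
```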
