Continual Audio Deepfake Detection via Universal Adversarial Perturbation - Zhuanzhi Papers


Forgery detection · Deepfake detection · Deepfake · Adversarial · Perturbation


Wangjie Li, Lin Li, Qingyang Hong

The rapid advancement of speech synthesis and voice conversion technologies has raised significant security concerns in multimedia forensics. Although current detection models demonstrate impressive performance, they struggle to maintain effectiveness against constantly evolving deepfake attacks. Additionally, continually fine-tuning these models using historical training data incurs substantial computational and storage costs. To address these limitations, we propose a novel framework that incorporates Universal Adversarial Perturbation (UAP) into audio deepfake detection, enabling models to retain knowledge of historical spoofing distribution without direct access to past data. Our method integrates UAP seamlessly with pre-trained self-supervised audio models during fine-tuning. Extensive experiments validate the effectiveness of our approach, showcasing its potential as an efficient solution for continual learning in audio deepfake detection.
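The abstract does not say how the perturbation is computed or how it is combined with the self-supervised model during fine-tuning, so the following is only a toy illustration of the general idea behind a universal adversarial perturbation: one shared, norm-bounded vector optimized over a whole dataset against a fixed classifier. Everything in it is an assumption made for illustration and not the authors' method: the synthetic 8-dimensional features, the fixed linear detector `w`, `b`, the budget `eps`, and the use of plain sign-gradient ascent in place of any particular published UAP algorithm.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def mean_bce(X, y, w, b):
    """Mean binary cross-entropy of a fixed linear classifier on (X, y)."""
    p = np.clip(sigmoid(X @ w + b), 1e-12, 1.0 - 1e-12)
    return float(-(y * np.log(p) + (1.0 - y) * np.log(1.0 - p)).mean())

def universal_perturbation(X, y, w, b, eps=0.5, step=0.05, n_iters=100):
    """Find one perturbation shared by all inputs that raises the mean loss.

    Sign-gradient ascent with an L-infinity budget eps (illustrative choice).
    """
    delta = np.zeros(X.shape[1])
    for _ in range(n_iters):
        p = sigmoid((X + delta) @ w + b)
        # Gradient of the mean loss w.r.t. the shared delta: mean_i (p_i - y_i) * w
        grad = ((p - y)[:, None] * w[None, :]).mean(axis=0)
        # Ascent step, then projection back onto the L-infinity ball
        delta = np.clip(delta + step * np.sign(grad), -eps, eps)
    return delta

# Hypothetical toy data: two Gaussian clusters standing in for
# "bona fide" (label 0) and "spoofed" (label 1) feature vectors.
rng = np.random.default_rng(0)
n, d = 200, 8
X = np.vstack([rng.normal(-1.0, 1.0, size=(n, d)),
               rng.normal(+1.0, 1.0, size=(n, d))])
y = np.concatenate([np.zeros(n), np.ones(n)])

# Hypothetical fixed detector; a real system would use a trained network.
w, b = np.ones(d), 0.0
eps = 0.5

delta = universal_perturbation(X, y, w, b, eps=eps)
print("loss on clean inputs:    ", mean_bce(X, y, w, b))
print("loss on perturbed inputs:", mean_bce(X + delta, y, w, b))
```

The property the sketch shows is that a single `delta`, computed once, degrades the detector across the whole dataset and so carries information about the data and decision boundary it was computed from. How the paper exploits that property to stand in for historical spoofing data is not stated in the abstract.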

