与 " 单发语音转换 " 的对对立相互信息学习脱钩 (Speech Representation Disentanglement with Adversarial Mutual Information Learning for One-shot Voice Conversion) - 专知论文

会员服务 ·

0

INFORMS · 互信息 · Learning · Performer · 表示 ·

2022 年 8 月 18 日

Speech Representation Disentanglement with Adversarial Mutual Information Learning for One-shot Voice Conversion

翻译：与 " 单发语音转换 " 的对对立相互信息学习脱钩

SiCheng Yang,Methawee Tantrawenith,Haolin Zhuang,Zhiyong Wu,Aolan Sun,Jianzong Wang,ning cheng,Huaizhen Tang,Xintao Zhao,Jie Wang,Helen Meng

from arxiv, 5 pages,5 figures,INTERSPEECH 2022

One-shot voice conversion (VC) with only a single target speaker's speech for reference has become a hot research topic. Existing works generally disentangle timbre, while information about pitch, rhythm and content is still mixed together. To perform one-shot VC effectively with further disentangling these speech components, we employ random resampling for pitch and content encoder and use the variational contrastive log-ratio upper bound of mutual information and gradient reversal layer based adversarial mutual information learning to ensure the different parts of the latent space containing only the desired disentangled representation during training. Experiments on the VCTK dataset show the model achieves state-of-the-art performance for one-shot VC in terms of naturalness and intellgibility. In addition, we can transfer characteristics of one-shot VC on timbre, pitch and rhythm separately by speech representation disentanglement. Our code, pre-trained models and demo are available at https://im1eon.github.io/IS2022-SRDVC/.

翻译：仅用一个目标演讲者演讲作为参考的单发语音转换(VC)已成为一个热门的研究课题。现有的工作一般不协调,而关于音调、节奏和内容的信息仍然混杂在一起。要有效地执行一发VC,进一步分解这些语音组成部分,我们就对音调和内容编码器进行随机抽查,并使用基于相互信息和渐变逆转层的反向对比对式对调相互学习,以确保潜在空间的不同部分,其中仅包含培训期间所希望的分解代表。VCTK数据集的实验显示,该模型在自然性和不易感性方面达到了一发VC的最先进的性能。此外,我们还可以通过语音表达解析,将一发VC的特征分别转换成字调、音调和节奏。我们的代码、预先训练的模型和演示可在https://im1eon.github.io/IS2022-SRVC/上查阅。

0

相关内容

INFORMS

《计算机信息》杂志发表高质量的论文，扩大了运筹学和计算的范围，寻求有关理论、方法、实验、系统和应用方面的原创研究论文、新颖的调查和教程论文，以及描述新的和有用的软件工具的论文。官网链接：https://pubsonline.informs.org/journal/ijoc

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

【Google】深度学习对抗鲁棒性，43页ppt

专知会员服务

45+阅读 · 2020年10月31日

【深度学习表格检测、信息提取和结构化】《Table Detection, Information Extraction and Structuring using Deep Learning》by Vihar Kurama

专知会员服务

38+阅读 · 2020年1月23日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

基于Metasurface的THz慢波器件研究

国家自然科学基金

0+阅读 · 2013年12月31日

Calderon问题和边界刚性问题

国家自然科学基金

0+阅读 · 2013年12月31日

句子语境下的语音加工的神经机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

特异性抗菌NO修饰复合纳米材料的制备及分子机制研究

国家自然科学基金

0+阅读 · 2009年12月31日

气溶胶对夜间大气化学过程和NOx清除率影响研究

国家自然科学基金

0+阅读 · 2009年12月31日

Zero-Shot Policy Transfer with Disentangled Task Representation of Meta-Reinforcement Learning

Arxiv

0+阅读 · 2022年10月1日

Adversarial Robustness of Representation Learning for Knowledge Graphs

Arxiv

10+阅读 · 2022年9月30日

Adversarial Mutual Information for Text Generation

Adversarial Mutual Information for Text Generation

Arxiv

13+阅读 · 2020年6月30日

Dissecting Contextual Word Embeddings: Architecture and Representation

Dissecting Contextual Word Embeddings: Architecture and Representation

Arxiv

22+阅读 · 2018年8月27日

Diverse Image-to-Image Translation via Disentangled Representations

Diverse Image-to-Image Translation via Disentangled Representations

Arxiv

13+阅读 · 2018年8月2日

VIP会员

文章信息

相关主题

相关VIP内容

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

【Google】深度学习对抗鲁棒性，43页ppt

专知会员服务

45+阅读 · 2020年10月31日

【深度学习表格检测、信息提取和结构化】《Table Detection, Information Extraction and Structuring using Deep Learning》by Vihar Kurama

专知会员服务

38+阅读 · 2020年1月23日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

热门VIP内容

开通专知VIP会员享更多权益服务

《利用人工智能对军事行动进行建模》

《利用人工智能学习、优化与推演美国海军作战部队的战略布局与分散（续文）》

机器人、无人机与实时影像：应对城市爆炸威胁的三大技术方案

《指挥官意图消息中关键概念自动提取》最新47页

相关资讯

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

相关论文

Zero-Shot Policy Transfer with Disentangled Task Representation of Meta-Reinforcement Learning

Arxiv

0+阅读 · 2022年10月1日

Adversarial Robustness of Representation Learning for Knowledge Graphs

Arxiv

10+阅读 · 2022年9月30日

Adversarial Mutual Information for Text Generation

Adversarial Mutual Information for Text Generation

Arxiv

13+阅读 · 2020年6月30日

Dissecting Contextual Word Embeddings: Architecture and Representation

Dissecting Contextual Word Embeddings: Architecture and Representation

Arxiv

22+阅读 · 2018年8月27日

Diverse Image-to-Image Translation via Disentangled Representations

Diverse Image-to-Image Translation via Disentangled Representations

Arxiv

13+阅读 · 2018年8月2日

相关基金

基于Metasurface的THz慢波器件研究

国家自然科学基金

0+阅读 · 2013年12月31日

Calderon问题和边界刚性问题

国家自然科学基金

0+阅读 · 2013年12月31日

句子语境下的语音加工的神经机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

特异性抗菌NO修饰复合纳米材料的制备及分子机制研究

国家自然科学基金

0+阅读 · 2009年12月31日

气溶胶对夜间大气化学过程和NOx清除率影响研究

国家自然科学基金

0+阅读 · 2009年12月31日

微信扫码咨询专知VIP会员