Deep generative models have recently achieved impressive performance in speech synthesis and music generation. However, compared to the generation of those domain-specific sounds, the generation of general sounds (such as car horns, dog barking, and gun shots) has received less attention, despite their wide range of potential applications. In our previous work, sounds were generated in the time domain using SampleRNN. However, it is difficult to capture long-range dependencies within sound recordings using this method. In this work, we propose to generate sounds conditioned on sound classes via neural discrete time-frequency representation learning. This offers an advantage in modelling long-range dependencies and retaining local fine-grained structure within a sound clip. We evaluate our proposed approach on the UrbanSound8K dataset, compared against a SampleRNN baseline, using performance metrics that measure the quality and diversity of the generated sound samples. Experimental results show that our proposed method offers significantly better performance in diversity and comparable performance in quality, as compared to the baseline method.
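To make the "neural discrete time-frequency representation learning" referred to above more concrete, the sketch below shows a minimal VQ-VAE-style vector-quantization layer applied to continuous encoder features of a spectrogram: each feature vector is snapped to its nearest codebook entry, yielding the discrete codes over which a conditional generative model can operate. This is an illustrative sketch only; the class name, codebook size, feature dimensions, and the encoder producing `z_e` are assumptions and do not reflect the paper's actual implementation.

```python
import torch
import torch.nn as nn

class VectorQuantizer(nn.Module):
    """Minimal vector-quantization layer (VQ-VAE style): maps each encoder
    output vector to its nearest codebook entry, producing discrete codes."""
    def __init__(self, num_codes=512, code_dim=64):
        super().__init__()
        self.codebook = nn.Embedding(num_codes, code_dim)
        self.codebook.weight.data.uniform_(-1.0 / num_codes, 1.0 / num_codes)

    def forward(self, z_e):
        # z_e: (batch, time, code_dim) continuous features from a
        # (hypothetical) spectrogram encoder
        flat = z_e.reshape(-1, z_e.shape[-1])                       # (B*T, D)
        # Squared Euclidean distance from each feature to every codebook vector
        dists = (flat.pow(2).sum(1, keepdim=True)
                 - 2 * flat @ self.codebook.weight.t()
                 + self.codebook.weight.pow(2).sum(1))              # (B*T, K)
        indices = dists.argmin(dim=1)                               # discrete codes
        z_q = self.codebook(indices).view_as(z_e)                   # quantized vectors
        # Straight-through estimator: copy gradients from z_q back to z_e
        z_q = z_e + (z_q - z_e).detach()
        return z_q, indices.view(z_e.shape[:-1])

# Usage: quantize (dummy) encoder features of a mel-spectrogram clip
quantizer = VectorQuantizer()
z_e = torch.randn(8, 100, 64)         # hypothetical encoder output
z_q, codes = quantizer(z_e)
print(z_q.shape, codes.shape)         # (8, 100, 64) and (8, 100)
```

Modelling the resulting code sequence (rather than raw waveform samples, as in SampleRNN) is what allows long-range dependencies to be captured while fine local structure is preserved by the decoder.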