The Continuous Wavelet Transform (CWT) is an effective tool for feature extraction in acoustic recognition using Convolutional Neural Networks (CNNs), particularly when applied to non-stationary audio. However, its high computational cost poses a significant challenge, often leading researchers to prefer alternative methods such as the Short-Time Fourier Transform (STFT). To address this issue, this paper proposes a method to reduce the computational complexity of CWT by optimizing the length of the wavelet kernel and the hop size of the output scalogram. Experimental results demonstrate that the proposed approach significantly reduces computational cost while maintaining the robust performance of the trained model in acoustic recognition tasks.
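To illustrate how the two parameters named above enter the cost, the following is a minimal sketch of a direct time-domain CWT with a truncated kernel and a strided output. It is not the paper's implementation: the Morlet wavelet, the default `w0=6.0`, and the function names `morlet_kernel` and `strided_cwt` are assumptions made for illustration, and the abstract does not say how the kernel length and hop size are chosen. The sketch only shows that a direct evaluation needs roughly `len(scales) * ceil(N / hop) * kernel_len` multiply-adds, so shortening the kernel or enlarging the hop reduces the work proportionally.

```python
import numpy as np


def morlet_kernel(scale, kernel_len, w0=6.0):
    """Complex Morlet wavelet at `scale` (in samples), truncated to kernel_len taps.

    Hypothetical helper: the wavelet family and w0 are assumptions,
    not taken from the paper.
    """
    # Time axis in samples, centred on zero, then dilated by the scale.
    t = (np.arange(kernel_len) - (kernel_len - 1) / 2.0) / scale
    psi = np.pi ** -0.25 * np.exp(1j * w0 * t) * np.exp(-0.5 * t ** 2)
    return psi / np.sqrt(scale)


def strided_cwt(x, scales, kernel_len, hop):
    """Scalogram magnitude evaluated only at every `hop`-th sample.

    Returns an array of shape (len(scales), ceil(len(x) / hop)).
    """
    x = np.asarray(x, dtype=float)
    # Zero-pad so that each output frame is centred on its input sample.
    pad = kernel_len // 2
    xp = np.pad(x, (pad, kernel_len - 1 - pad))
    # One window of kernel_len samples per retained output position.
    frames = np.lib.stride_tricks.sliding_window_view(xp, kernel_len)[::hop]
    kernels = np.stack([morlet_kernel(s, kernel_len) for s in scales])
    # Correlate every frame with every (conjugated) kernel in one matmul.
    return np.abs(frames @ kernels.conj().T).T


if __name__ == "__main__":
    n = np.arange(1000)
    x = np.cos(0.75 * n)  # tone near the centre frequency of scale 8 (w0 / 8)
    scalogram = strided_cwt(x, scales=[4, 8, 16], kernel_len=65, hop=10)
    print(scalogram.shape)
```

With `hop=10` the output holds one tenth of the frames that `hop=1` would produce, and the retained frames are identical to the corresponding columns of the dense result. Whether the coarser scalogram preserves recognition accuracy is the empirical question the paper addresses; this sketch makes no claim about it.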