Probabilistic linear discriminant analysis (PLDA) is commonly used in speaker verification systems to score the similarity of speaker embeddings. Recent studies have improved the performance of PLDA under domain-matched conditions by diagonalizing its covariance. We suspect that such aggressive pruning eliminates the model's capacity to capture correlations between the dimensions of speaker embeddings, leading to inadequate performance under domain adaptation. This paper explores two alternative covariance regularization approaches, namely interpolated PLDA and sparse PLDA, to tackle the problem. Interpolated PLDA incorporates prior knowledge from cosine scoring to interpolate the covariance of PLDA. Sparse PLDA introduces a sparsity penalty to update the covariance. Experimental results demonstrate that both approaches noticeably outperform diagonal regularization under domain adaptation. In addition, the amount of in-domain data required to train sparse PLDA for domain adaptation can be significantly reduced.
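To make the two regularizers concrete, the following is a minimal numerical sketch; it is not the paper's exact formulation. It assumes that the interpolation takes a convex-combination form between the PLDA covariance and an identity matrix (the isotropic covariance implicitly behind cosine scoring), and it approximates a sparsity penalty by soft-thresholding the off-diagonal covariance entries. The function names (`interpolate_covariance`, `sparsify_covariance`) and parameters (`alpha`, `lam`) are illustrative assumptions.

```python
import numpy as np

# Hypothetical sketch of the two covariance regularizers described above.
# The paper's exact formulations may differ from these simplified forms.

def interpolate_covariance(sigma: np.ndarray, alpha: float) -> np.ndarray:
    """Convex combination of the PLDA covariance with an identity prior,
    assumed here to stand in for the cosine-scoring prior knowledge."""
    d = sigma.shape[0]
    return alpha * sigma + (1.0 - alpha) * np.eye(d)

def sparsify_covariance(sigma: np.ndarray, lam: float) -> np.ndarray:
    """Soft-threshold off-diagonal entries (a simple stand-in for an L1
    sparsity penalty), keeping the diagonal intact."""
    off = sigma - np.diag(np.diag(sigma))
    shrunk = np.sign(off) * np.maximum(np.abs(off) - lam, 0.0)
    return shrunk + np.diag(np.diag(sigma))

# Example: regularize a random positive-definite covariance.
rng = np.random.default_rng(0)
a = rng.standard_normal((8, 8))
sigma = a @ a.T / 8
print(interpolate_covariance(sigma, alpha=0.7))
print(sparsify_covariance(sigma, lam=0.05))
```

Note that full diagonalization corresponds to the extreme cases `alpha = 0` (up to scaling) and a very large `lam`; both regularizers above retain some off-diagonal structure, which is the property the abstract argues matters for domain adaptation.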