Elastic-Net Multiple Kernel Learning: Combining Multiple Data Sources for Prediction

Multiple Kernel Learning (MKL) models combine several kernels, in supervised or unsupervised settings, to integrate multiple data representations or sources, each represented by a different kernel. MKL seeks an optimal linear combination of base kernels that maximizes a generalized performance measure under a regularization constraint. Various norms have been used to regularize the kernel weights, including the $\ell_1$-, $\ell_2$- and $\ell_p$-norms, as well as the "elastic-net" penalty, which combines the $\ell_1$- and $\ell_2$-norms to promote both sparsity and the joint selection of correlated kernels. This property makes elastic-net regularized MKL (ENMKL) especially valuable when model interpretability is critical and the kernels capture correlated information, as in neuroimaging. Previous ENMKL methods alternate between two stages: with the kernel weights fixed, train a support vector machine (SVM) on the weighted kernel; then update the weights via gradient descent, cutting-plane methods, or surrogate functions. Here, we introduce an alternative ENMKL formulation that yields a simple analytical update for the kernel weights. We derive explicit algorithms for both SVM and kernel ridge regression (KRR) under this framework, and implement them in the open-source Pattern Recognition for Neuroimaging Toolbox (PRoNTo). We evaluate these ENMKL algorithms against $\ell_1$-norm MKL and against an SVM (or KRR) trained on the unweighted sum of kernels across three neuroimaging applications. Our results show that ENMKL matches or outperforms $\ell_1$-norm MKL in all tasks and underperforms the standard SVM in only one scenario. Crucially, ENMKL produces sparser, more interpretable models by selectively weighting correlated kernels.
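To make the two ingredients named above concrete, the following minimal sketch builds a weighted linear combination of base kernels and evaluates an elastic-net penalty on the kernel weights. It is an illustration under stated assumptions, not the formulation derived in the paper: the function names, the toy data, and the particular parametrization of the penalty (`lam` for overall strength, `mix` for the balance between the $\ell_1$ and $\ell_2$ terms) are all hypothetical, and the analytical weight update itself is not reproduced here.

```python
def linear_kernel(X):
    """Gram matrix K[i][j] = <x_i, x_j> for the rows of X (a list of lists)."""
    return [[sum(a * b for a, b in zip(xi, xj)) for xj in X] for xi in X]


def combine_kernels(kernels, weights):
    """Weighted sum K = sum_m d_m * K_m of equally sized Gram matrices."""
    if len(kernels) != len(weights):
        raise ValueError("need exactly one weight per kernel")
    n = len(kernels[0])
    return [
        [sum(d * K[i][j] for d, K in zip(weights, kernels)) for j in range(n)]
        for i in range(n)
    ]


def elastic_net_penalty(weights, lam=1.0, mix=0.5):
    """Assumed parametrization: lam * (mix * ||d||_1 + (1 - mix) / 2 * ||d||_2^2)."""
    l1 = sum(abs(d) for d in weights)
    l2_squared = sum(d * d for d in weights)
    return lam * (mix * l1 + 0.5 * (1.0 - mix) * l2_squared)


# Two hypothetical data sources describing the same three samples.
source_a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
source_b = [[2.0], [0.0], [1.0]]
K_a = linear_kernel(source_a)
K_b = linear_kernel(source_b)

# Illustrative kernel weights; in MKL these would be learned, not fixed by hand.
d = [0.75, 0.25]
K = combine_kernels([K_a, K_b], d)
penalty = elastic_net_penalty(d, lam=1.0, mix=0.5)
```

In this sketch, setting `mix=1.0` reduces the penalty to a pure $\ell_1$ term (favoring sparse weights), while `mix=0.0` leaves only the squared $\ell_2$ term (favoring weights spread across correlated kernels). The combined matrix `K` is what a kernel method such as an SVM or KRR would then be trained on.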
