The ROC-SVM, originally proposed by Rakotomamonjy, directly maximizes the area under the ROC curve (AUC) and has become an attractive alternative to conventional binary classification in the presence of class imbalance. However, its practical use is limited by high computational cost, as training involves evaluating all $O(n^2)$ sample pairs. To overcome this limitation, we develop a scalable variant of the ROC-SVM that leverages incomplete U-statistics, thereby substantially reducing computational complexity. We further extend the framework to nonlinear classification through a low-rank kernel approximation, enabling efficient training in reproducing kernel Hilbert spaces. Theoretical analysis establishes an error bound that justifies the proposed approximation, and empirical results on both synthetic and real datasets demonstrate that the proposed method achieves AUC performance comparable to the original ROC-SVM with drastically reduced training time.
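To make the role of the incomplete U-statistic concrete, the sketch below trains a linear scorer with a pairwise hinge loss averaged over a fixed budget of randomly sampled positive-negative pairs, rather than all $O(n^2)$ of them. This is a minimal illustration of the idea under our own assumptions, not the paper's implementation: the function names, the plain subgradient solver, and the hyperparameters `n_pairs`, `lam`, and `lr` are all chosen for exposition.

```python
import numpy as np

def incomplete_auc_hinge_subgrad(w, X_pos, X_neg, n_pairs, rng):
    """Subgradient of the pairwise hinge loss estimated with an
    incomplete U-statistic: average over n_pairs randomly drawn
    positive-negative pairs instead of all |pos| * |neg| pairs."""
    i = rng.integers(0, X_pos.shape[0], size=n_pairs)
    j = rng.integers(0, X_neg.shape[0], size=n_pairs)
    diff = X_pos[i] - X_neg[j]          # sampled pairwise differences
    margins = diff @ w                  # f(x+) - f(x-) on sampled pairs
    active = margins < 1.0              # pairs where the hinge is active
    # subgradient of mean(max(0, 1 - w^T diff)) over the sampled pairs
    return -diff[active].sum(axis=0) / n_pairs

def train_roc_svm(X, y, lam=1e-3, n_pairs=2000, lr=0.1, epochs=200, seed=0):
    """Subgradient descent on the L2-regularized incomplete-U objective."""
    rng = np.random.default_rng(seed)
    X_pos, X_neg = X[y == 1], X[y == -1]
    w = np.zeros(X.shape[1])
    for t in range(1, epochs + 1):
        g = lam * w + incomplete_auc_hinge_subgrad(w, X_pos, X_neg, n_pairs, rng)
        w -= (lr / np.sqrt(t)) * g      # decaying step size
    return w
```

With a pair budget of $B$ (`n_pairs`), each iteration costs $O(Bd)$ instead of the $O(n_+ n_- d)$ required to touch every pair, which is the source of the complexity reduction the abstract refers to.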
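For the nonlinear extension, a low-rank kernel approximation yields explicit finite-dimensional features, after which the same linear solver applies in the approximated reproducing kernel Hilbert space. The abstract does not specify which low-rank scheme is used; the Nyström construction, the RBF kernel, the uniform landmark selection, and `gamma` below are illustrative assumptions.

```python
import numpy as np

def nystrom_features(X, landmarks, gamma):
    """Rank-m Nystrom feature map for the RBF kernel: returns Phi
    such that Phi @ Phi.T approximates the full n x n kernel matrix."""
    def rbf(A, B):
        sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * sq)
    K_mm = rbf(landmarks, landmarks)            # m x m landmark kernel
    K_nm = rbf(X, landmarks)                    # n x m cross kernel
    # K ~ K_nm K_mm^{-1} K_nm^T = (K_nm K_mm^{-1/2})(K_nm K_mm^{-1/2})^T
    U, s, _ = np.linalg.svd(K_mm)
    inv_sqrt = U @ np.diag(1.0 / np.sqrt(np.maximum(s, 1e-12))) @ U.T
    return K_nm @ inv_sqrt                      # n x m explicit features

# Hypothetical usage: map the data, then reuse the linear pairwise solver.
# rng = np.random.default_rng(0)
# idx = rng.choice(X.shape[0], size=100, replace=False)   # 100 landmarks
# Phi = nystrom_features(X, X[idx], gamma=0.5)
# w = train_roc_svm(Phi, y)
```

Building the features costs $O(nmd + nm^2)$ for $m$ landmarks, so for $m \ll n$ the kernelized problem reduces to a linear one of modest dimension, consistent with the efficiency claim in the abstract.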