We introduce the Similarity-Distance-Magnitude (SDM) activation function, a more robust and interpretable formulation of the standard softmax activation. It augments the existing output Magnitude (i.e., decision-boundary) awareness with Similarity (i.e., correctly predicted depth-matches into training) awareness and Distance-to-training-distribution awareness, and enables interpretability-by-exemplar via dense matching. We further introduce the SDM estimator, based on a data-driven partitioning of the class-wise empirical CDFs via the SDM activation, to control the class- and prediction-conditional accuracy among selective classifications. When used as the final-layer activation over pre-trained language models for selective classification, the SDM estimator is more robust to covariate shifts and out-of-distribution inputs than existing calibration methods using softmax activations, while remaining informative over in-distribution data.
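To make the three signals concrete, the following is a minimal, hypothetical sketch of an SDM-style activation, not the paper's actual formulation: it takes standard logits (Magnitude), computes a Similarity term as cosine similarity to the nearest training exemplar of the predicted class (a stand-in for the paper's depth-match notion), and a Distance term as the L2 distance to the nearest training embedding, then uses the latter two to temper the logits before normalizing. All function and variable names here are illustrative assumptions.

```python
import numpy as np

def sdm_style_activation(logits, query_emb, train_embs, train_labels, pred_class):
    """Illustrative SDM-style activation (not the paper's exact formula).

    logits      -- raw class scores for the query (Magnitude signal)
    query_emb   -- embedding of the query input
    train_embs  -- embeddings of the training set, shape (n, d)
    train_labels-- class label per training embedding
    pred_class  -- the model's argmax-predicted class for the query
    """
    # Similarity: cosine similarity to the nearest training exemplar of
    # the predicted class.
    sims = train_embs @ query_emb / (
        np.linalg.norm(train_embs, axis=1) * np.linalg.norm(query_emb) + 1e-12
    )
    class_mask = train_labels == pred_class
    similarity = float(np.max(sims[class_mask])) if class_mask.any() else 0.0
    # Distance: L2 distance to the nearest training point of any class.
    distance = float(np.min(np.linalg.norm(train_embs - query_emb, axis=1)))
    # Temper the magnitudes: low similarity or high distance flattens the
    # output toward uniform, lowering confidence off-distribution.
    scale = max(similarity, 0.0) / (1.0 + distance)
    z = logits * scale
    z = z - z.max()  # numerical stability
    p = np.exp(z)
    return p / p.sum()
```

Under this sketch, a query that exactly matches a training exemplar keeps its softmax confidence, while a far-away query of the same logits receives a flatter, lower-confidence distribution — the qualitative behavior the abstract attributes to SDM relative to plain softmax.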