Large-Scale Open-Set Classification Protocols for ImageNet

Open-Set Classification (OSC) aims to adapt closed-set classification models to real-world scenarios, where the classifier must correctly label samples of known classes while rejecting samples of previously unseen, unknown classes. Only recently has research started to investigate algorithms that can handle these unknown samples correctly. Some of these approaches address OSC by including negative samples in the training set that the classifier learns to reject, expecting that these data increase the robustness of the classifier to unknown classes. Most of these approaches are evaluated on small-scale, low-resolution image datasets such as MNIST, SVHN, or CIFAR, which makes it difficult to assess their applicability to the real world and to compare them with each other. We propose three open-set protocols that provide rich datasets of natural images with different levels of similarity between known and unknown classes. The protocols consist of subsets of ImageNet classes selected to provide training and testing data closer to real-world scenarios. Additionally, we propose a new validation metric that can be employed to assess whether the training of deep learning models addresses both the classification of known samples and the rejection of unknown samples. We use the protocols to compare the performance of two baseline open-set algorithms to the standard SoftMax baseline and find that the algorithms work well on negative samples that have been seen during training, and partially on out-of-distribution detection tasks, but that their performance drops in the presence of samples from previously unseen unknown classes.
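
The task described in the abstract, labeling samples of known classes while rejecting unknown ones, can be illustrated with a minimal sketch that thresholds the maximum SoftMax probability. This is an illustration only: it is not the authors' implementation, and it does not reproduce the proposed validation metric, which the abstract does not define. The function names, the `UNKNOWN` marker, the default threshold of 0.5, and the two reported rates (the fraction of known samples labeled correctly and the fraction of unknown samples wrongly accepted) are all assumptions made for this example.

```python
import math

UNKNOWN = -1  # hypothetical marker for "rejected as unknown"


def softmax(logits):
    """Numerically stable SoftMax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_open_set(logits, threshold=0.5):
    """Return the most probable known class, or UNKNOWN if the
    maximum SoftMax probability falls below the threshold."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best if probs[best] >= threshold else UNKNOWN


def open_set_rates(samples, threshold=0.5):
    """samples: list of (logits, label); label is UNKNOWN for unknown samples.

    Returns (known_correct_rate, unknown_accept_rate):
    - fraction of known samples that are accepted AND labeled correctly
    - fraction of unknown samples that are wrongly accepted as some known class
    """
    known_total = known_correct = 0
    unknown_total = unknown_accepted = 0
    for logits, label in samples:
        pred = predict_open_set(logits, threshold)
        if label == UNKNOWN:
            unknown_total += 1
            unknown_accepted += int(pred != UNKNOWN)
        else:
            known_total += 1
            known_correct += int(pred == label)
    known_rate = known_correct / known_total if known_total else float("nan")
    unknown_rate = unknown_accepted / unknown_total if unknown_total else float("nan")
    return known_rate, unknown_rate


if __name__ == "__main__":
    # Toy logits for a 3-class model (made-up values, not real results).
    toy = [
        ([4.0, 0.0, 0.0], 0),        # confident, known class 0
        ([0.0, 3.0, 0.0], 1),        # confident, known class 1
        ([0.1, 0.0, 0.2], UNKNOWN),  # flat scores, should be rejected
        ([5.0, 0.0, 0.0], UNKNOWN),  # confident on an unknown sample
    ]
    print(open_set_rates(toy, threshold=0.5))
```

Raising the threshold rejects more unknown samples but also rejects more known ones, which is why an open-set evaluation has to consider both rates together rather than closed-set accuracy alone.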
