Hierarchical Image Classification with A Literally Toy Dataset

Unsupervised domain adaptation (UDA) in image classification remains a major challenge. In existing UDA image datasets, classes are usually organized in a flat way, so a plain classifier can be trained. Yet in some scenarios, the flat categories originate from base classes. For example, budgies belong to the class bird. We define hierarchical image classification as the classification task in which classes have this property, that is, the flat classes and the base classes are organized hierarchically. Intuitively, leveraging such a hierarchical structure should benefit hierarchical image classification: for example, two easily confused classes may belong to entirely different base classes. In this paper, we improve classification performance by fusing features learned from a hierarchy of labels. Specifically, we train feature extractors that are supervised by hierarchical labels and use UDA techniques, and that output multiple features for an input image. The features are subsequently concatenated to predict the finest-grained class. This study is conducted with a new dataset named Lego-15. Consisting of synthetic and real images of Lego bricks, the Lego-15 dataset contains 15 classes of bricks. Each class originates from a coarse-level label and a middle-level label. For example, class "85080" is associated with bricks (coarse) and bricks round (middle). On this dataset, we demonstrate that our method brings consistent improvement over the baseline for UDA in hierarchical image classification. Extensive ablation and variant studies provide insights into the new dataset and the investigated algorithm.
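
The feature-fusion step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the random linear maps stand in for trained feature extractors, the UDA training itself is omitted, and the names `HIERARCHY`, `make_linear`, `fuse_and_predict`, and all dimensions other than the 15 fine classes are assumptions made for illustration. Only the "85080" hierarchy entry is taken from the text.

```python
import random

# Hypothetical label hierarchy; only the "85080" entry comes from the abstract.
HIERARCHY = {
    "85080": {"coarse": "bricks", "middle": "bricks round"},
}

LEVELS = ("coarse", "middle", "fine")


def make_linear(in_dim, out_dim, rng):
    """Return a random linear map standing in for a trained network (assumption)."""
    weights = [[rng.uniform(-1, 1) for _ in range(in_dim)] for _ in range(out_dim)]

    def apply(x):
        return [sum(w * v for w, v in zip(row, x)) for row in weights]

    return apply


def fuse_and_predict(x, extractors, head):
    """Concatenate the per-level features, then score the finest-grained classes."""
    fused = []
    for level in LEVELS:
        fused.extend(extractors[level](x))
    scores = head(fused)
    predicted = max(range(len(scores)), key=scores.__getitem__)
    return fused, predicted


rng = random.Random(0)
# 15 fine classes as in Lego-15; the other sizes are arbitrary placeholders.
IN_DIM, FEAT_DIM, NUM_FINE = 8, 4, 15

# One extractor per label level (each would be supervised by that level's labels).
extractors = {level: make_linear(IN_DIM, FEAT_DIM, rng) for level in LEVELS}
# Classifier head over the concatenated features.
head = make_linear(len(LEVELS) * FEAT_DIM, NUM_FINE, rng)

image = [rng.uniform(0, 1) for _ in range(IN_DIM)]  # stand-in for an input image
fused, predicted = fuse_and_predict(image, extractors, head)
```

In this sketch the fused vector has one block of features per label level, so the fine-grained classifier sees coarse and middle-level evidence alongside the fine-level features.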
