Data labeling is often the most challenging task when developing computational pathology models. Pathologist participation is necessary to generate accurate labels, and the limits on pathologist time, combined with the demand for large labeled datasets, have led to research in areas including weakly supervised learning using patient-level labels, machine-assisted annotation, and active learning. In this paper, we explore self-supervised learning as a way to reduce labeling burdens in computational pathology. We study this in the context of breast cancer tissue classification using the Barlow Twins approach, and we compare self-supervision with alternatives such as pre-trained networks in low-data scenarios. For the task explored in this paper, we find that ImageNet pre-trained networks largely outperform the self-supervised representations obtained using Barlow Twins.