Dynamic Residual Encoding with Slide-Level Contrastive Learning for End-to-End Whole Slide Image Representation

Whole Slide Image (WSI) representation is critical for cancer subtyping, cancer recognition, and mutation prediction. Training an end-to-end WSI representation model is challenging because a standard gigapixel slide can contain tens of thousands of image tiles, and current GPU limitations make it difficult to compute gradients for all tiles in a single mini-batch. To address this challenge, we propose dynamic residual encoding with slide-level contrastive learning (DRE-SLCL) for end-to-end WSI representation. Our approach uses a memory bank to store the features of tiles across all WSIs in the dataset. During training, a mini-batch usually contains multiple WSIs. For each WSI in the batch, a subset of tiles is randomly sampled and their features are computed with a tile encoder. Additional tile features from the same WSI are then selected from the memory bank. The representation of each WSI is generated by a residual encoding technique that incorporates both the freshly computed features and those retrieved from the memory bank. Finally, the slide-level contrastive loss is computed from the representations and histopathology reports of the WSIs within the mini-batch. Experiments on cancer subtyping, cancer recognition, and mutation prediction tasks demonstrate the effectiveness of the proposed DRE-SLCL method.
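The training step described above can be sketched in a few lines. This is a minimal, forward-only illustration in plain Python, not the authors' implementation. The abstract does not specify the form of the residual encoding or of the contrastive loss, so the sketch assumes a VLAD-style hard-assignment residual encoding and a symmetric InfoNCE loss between slide and report embeddings. All names (`residual_encode`, `slide_representation`, `contrastive_loss`, `encode_tile`, `codebook`, `temperature`) are hypothetical, and writing the freshly computed features back into the memory bank is likewise an assumption.

```python
import math
import random


def l2_normalize(v):
    """Scale a vector to unit length (returns a copy if the norm is zero)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def residual_encode(features, codebook):
    """Assumed VLAD-style encoding: sum each feature's residual to its
    nearest codeword, then flatten and L2-normalize."""
    dim = len(codebook[0])
    acc = [[0.0] * dim for _ in codebook]
    for f in features:
        # Hard-assign the feature to its nearest codeword.
        k = min(range(len(codebook)),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(f, codebook[j])))
        for d in range(dim):
            acc[k][d] += f[d] - codebook[k][d]
    return l2_normalize([x for row in acc for x in row])


def slide_representation(slide_id, tiles, memory_bank, encode_tile,
                         codebook, n_fresh, n_bank, rng):
    """Encode a random subset of tiles, add cached features of other tiles
    from the same slide, and residual-encode the union."""
    fresh_idx = rng.sample(range(len(tiles)), n_fresh)
    fresh = {i: encode_tile(tiles[i]) for i in fresh_idx}
    rest = [i for i in range(len(tiles)) if i not in fresh]
    bank_idx = rng.sample(rest, min(n_bank, len(rest)))
    cached = [memory_bank[slide_id][i] for i in bank_idx]
    # Assumption: the bank is refreshed with the newly computed features.
    for i, f in fresh.items():
        memory_bank[slide_id][i] = f
    return residual_encode(list(fresh.values()) + cached, codebook)


def contrastive_loss(slide_embs, report_embs, temperature=0.07):
    """Assumed symmetric InfoNCE: slide i should match report i."""
    logits = [[sum(a * b for a, b in zip(s, r)) / temperature
               for r in report_embs] for s in slide_embs]

    def cross_entropy(rows):
        total = 0.0
        for i, row in enumerate(rows):
            m = max(row)  # subtract the max for numerical stability
            total += m + math.log(sum(math.exp(x - m) for x in row)) - row[i]
        return total / len(rows)

    columns = [list(c) for c in zip(*logits)]
    return 0.5 * (cross_entropy(logits) + cross_entropy(columns))
```

In a real training loop, only the freshly encoded tiles would carry gradients back to the tile encoder, while the features read from the memory bank would act as constants. That is what keeps the per-batch memory cost bounded by the number of sampled tiles rather than the full slide.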
