Foundation models have shown promise in medical imaging but remain underexplored for three-dimensional imaging modalities. No foundation model currently exists for Digital Breast Tomosynthesis (DBT), despite its use for breast cancer screening. The purpose of this study was to develop and evaluate a foundation model for DBT (DBT-DINO) across multiple clinical tasks and to assess the impact of domain-specific pre-training. Self-supervised pre-training was performed using the DINOv2 methodology on over 25 million 2D slices from 487,975 DBT volumes from 27,990 patients. Three downstream tasks were evaluated: (1) breast density classification using 5,000 screening exams; (2) prediction of the 5-year risk of developing breast cancer using 106,417 screening exams; and (3) lesion detection using 393 annotated volumes. For breast density classification, DBT-DINO achieved an accuracy of 0.79 (95\% CI: 0.76--0.81), outperforming both the MetaAI DINOv2 baseline (0.73, 95\% CI: 0.70--0.76, p<.001) and DenseNet-121 (0.74, 95\% CI: 0.71--0.76, p<.001). For 5-year breast cancer risk prediction, DBT-DINO achieved an AUROC of 0.78 (95\% CI: 0.76--0.80) compared with 0.76 for DINOv2 (95\% CI: 0.74--0.78, p=.57). For lesion detection, DINOv2 achieved a higher average sensitivity of 0.67 (95\% CI: 0.60--0.74) compared with 0.62 for DBT-DINO (95\% CI: 0.53--0.71, p=.60). On cancerous lesions specifically, DBT-DINO performed better, with a detection rate of 78.8\% compared with 77.3\% for DINOv2. Using a dataset of unprecedented size, we developed DBT-DINO, the first foundation model for DBT. DBT-DINO demonstrated strong performance on breast density classification and cancer risk prediction. However, domain-specific pre-training showed variable benefits on the detection task, with the ImageNet baseline outperforming DBT-DINO on general lesion detection, indicating that localized detection tasks require further methodological development.