Diffusion models have achieved significant success in both the natural image and medical image domains, encompassing a wide range of applications. Previous investigations on medical images have often been constrained to specific anatomical regions, particular applications, and limited datasets, resulting in isolated diffusion models. This paper introduces MedDiff-FM, a diffusion-based foundation model designed to address a diverse range of medical image tasks. MedDiff-FM leverages 3D CT images from multiple publicly available datasets, covering anatomical regions from head to abdomen, to pre-train a diffusion foundation model, and explores the capabilities of this foundation model across a variety of application scenarios. The diffusion foundation model handles multi-level image processing at both the image level and the patch level, utilizes position embeddings to establish multi-level spatial relationships, and leverages region classes and anatomical structures to capture specific anatomical regions. MedDiff-FM handles several downstream tasks seamlessly, including image denoising, anomaly detection, and image synthesis. MedDiff-FM is also capable of performing super-resolution, lesion generation, and lesion inpainting by rapidly fine-tuning the diffusion foundation model using ControlNet with task-specific conditions. Experimental results demonstrate the effectiveness of MedDiff-FM in addressing diverse downstream medical image tasks.
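The most concrete mechanism named in the abstract is ControlNet-based fine-tuning of the frozen foundation model with task-specific conditions (e.g., a lesion mask or low-resolution scan). The minimal PyTorch sketch below illustrates that general pattern only: a zero-initialized, trainable control branch injects condition features into a frozen denoiser. The module names (`TinyUNet3D`, `ControlBranch`), shapes, and wiring are illustrative assumptions, not MedDiff-FM's actual architecture.

```python
# Hedged sketch of ControlNet-style conditioning of a frozen 3D diffusion denoiser.
# All names and shapes are illustrative assumptions, not MedDiff-FM's implementation.
import torch
import torch.nn as nn

class TinyUNet3D(nn.Module):
    """Stand-in for the pre-trained diffusion denoiser (kept frozen during fine-tuning)."""
    def __init__(self, ch=16):
        super().__init__()
        self.enc = nn.Conv3d(1, ch, 3, padding=1)
        self.dec = nn.Conv3d(ch, 1, 3, padding=1)

    def forward(self, x_t, control=None):
        h = torch.relu(self.enc(x_t))
        if control is not None:      # control features added as a residual
            h = h + control
        return self.dec(h)           # predicted noise

class ControlBranch(nn.Module):
    """Trainable branch mapping a task-specific condition to control features."""
    def __init__(self, ch=16):
        super().__init__()
        self.proj = nn.Conv3d(1, ch, 3, padding=1)
        self.zero = nn.Conv3d(ch, ch, 1)  # zero-initialized so training starts from the frozen model
        nn.init.zeros_(self.zero.weight)
        nn.init.zeros_(self.zero.bias)

    def forward(self, cond):
        return self.zero(torch.relu(self.proj(cond)))

unet, ctrl = TinyUNet3D(), ControlBranch()
for p in unet.parameters():
    p.requires_grad_(False)          # only the control branch receives gradients

x_t = torch.randn(1, 1, 32, 64, 64)   # noisy CT patch at some diffusion timestep
cond = torch.randn(1, 1, 32, 64, 64)  # task-specific condition, e.g. a lesion mask
noise_pred = unet(x_t, control=ctrl(cond))
print(noise_pred.shape)               # torch.Size([1, 1, 32, 64, 64])
```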