Convolutional Neural Networks (CNNs) have recently been used with success to classify diabetic retinopathy (DR) fundus images. However, the deeper representations in a CNN may capture higher-level semantics at the expense of spatial resolution. To make predictions interpretable for ophthalmologists, we apply a post-hoc attention technique called Gradient-weighted Class Activation Mapping (Grad-CAM) to the penultimate layer of deep learning models to produce coarse localisation maps on DR fundus images. These maps highlight the discriminative regions in an image, providing evidence that ophthalmologists can use to make a diagnosis and potentially save lives through early detection. Specifically, this study uses pre-trained weights from four state-of-the-art deep learning models, VGG16, ResNet50, InceptionV3, and InceptionResNetV2, to produce and compare localisation maps of DR fundus images. We find that InceptionV3 achieves the best performance, with a test classification accuracy of 96.07%, and localises lesions better and faster than the other models.
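To make the Grad-CAM procedure described above concrete, the following is a minimal Keras sketch of the core computation: gradients of the target class score with respect to the last convolutional layer's feature maps are global-average-pooled into channel weights, and the weighted sum of feature maps (rectified and normalised) gives the coarse localisation map. The toy model, layer name `last_conv`, and the 5-class output (a stand-in for DR severity grades) are illustrative assumptions, not the paper's exact setup; in practice the function would be applied to the pretrained VGG16/ResNet50/InceptionV3/InceptionResNetV2 backbones.

```python
import numpy as np
import tensorflow as tf
from tensorflow import keras

def grad_cam(model, image, last_conv_layer_name, class_index=None):
    """Compute a Grad-CAM heatmap for a single image (batch of 1)."""
    # Model that exposes both the chosen conv feature maps and the predictions.
    grad_model = keras.Model(
        model.inputs,
        [model.get_layer(last_conv_layer_name).output, model.output],
    )
    with tf.GradientTape() as tape:
        conv_out, preds = grad_model(image)
        if class_index is None:
            class_index = tf.argmax(preds[0])   # default: predicted class
        class_score = preds[:, class_index]
    grads = tape.gradient(class_score, conv_out)    # d(score) / d(feature maps)
    weights = tf.reduce_mean(grads, axis=(1, 2))    # global average pool over H, W
    cam = tf.reduce_sum(conv_out * weights[:, None, None, :], axis=-1)
    cam = tf.nn.relu(cam)[0]                        # keep only positive influence
    cam = cam / (tf.reduce_max(cam) + 1e-8)         # normalise to [0, 1]
    return cam.numpy()

# Toy stand-in network (hypothetical; the study uses pretrained ImageNet models).
inputs = keras.Input(shape=(64, 64, 3))
x = keras.layers.Conv2D(8, 3, activation="relu")(inputs)
x = keras.layers.Conv2D(8, 3, activation="relu", name="last_conv")(x)
x = keras.layers.GlobalAveragePooling2D()(x)
outputs = keras.layers.Dense(5, activation="softmax")(x)  # e.g. 5 DR grades
model = keras.Model(inputs, outputs)

img = np.random.rand(1, 64, 64, 3).astype("float32")
heatmap = grad_cam(model, img, "last_conv")
print(heatmap.shape)  # matches the spatial size of the last conv feature maps
```

The heatmap has the (coarse) spatial resolution of the chosen convolutional layer, so it is typically upsampled to the input image size and overlaid on the fundus photograph for inspection.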