Background: As available medical image datasets increase in size, it becomes infeasible for clinicians to review content manually for knowledge extraction. The objective of this study was to create an automated clustering resulting in human-interpretable pattern discovery. Methods: Images from the public HAM10000 dataset, including 7 common pigmented skin lesion diagnoses, were tiled into 29420 tiles and clustered via k-means using neural network-extracted image features. The final number of clusters per diagnosis was chosen by either the elbow method or a compactness metric balancing intra-lesion variance and cluster numbers. The amount of resulting non-informative clusters, defined as those containing less than six image tiles, was compared between the two methods. Results: Applying k-means, the optimal elbow cutoff resulted in a mean of 24.7 (95%-CI: 16.4-33) clusters for every included diagnosis, including 14.9% (95% CI: 0.8-29.0) non-informative clusters. The optimal cutoff, as estimated by the compactness metric, resulted in significantly fewer clusters (13.4; 95%-CI 11.8-15.1; p=0.03) and less non-informative ones (7.5%; 95% CI: 0-19.5; p=0.017). The majority of clusters (93.6%) from the compactness metric could be manually mapped to previously described dermatoscopic diagnostic patterns. Conclusions: Automatically constraining unsupervised clustering can produce an automated extraction of diagnostically relevant and human-interpretable clusters of visual patterns from a large image dataset.


翻译:暂无翻译

0
下载
关闭预览

相关内容

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用
专知会员服务
41+阅读 · 2019年10月9日
灾难性遗忘问题新视角:迁移-干扰平衡
CreateAMind
17+阅读 · 2019年7月6日
disentangled-representation-papers
CreateAMind
26+阅读 · 2018年9月12日
论文浅尝 | 利用 RNN 和 CNN 构建基于 FreeBase 的问答系统
开放知识图谱
11+阅读 · 2018年4月25日
国家自然科学基金
2+阅读 · 2015年12月31日
国家自然科学基金
0+阅读 · 2014年12月31日
国家自然科学基金
6+阅读 · 2014年12月31日
VIP会员
相关基金
国家自然科学基金
2+阅读 · 2015年12月31日
国家自然科学基金
0+阅读 · 2014年12月31日
国家自然科学基金
6+阅读 · 2014年12月31日
Top
微信扫码咨询专知VIP会员