Monte Carlo (MC) dropout is a simple and efficient ensembling method that can improve the accuracy and confidence calibration of high-capacity deep neural network models. However, MC dropout is not as effective as more compute-intensive methods such as deep ensembles. This performance gap can be attributed to the relatively poor quality of the individual models in the MC dropout ensemble and to their lack of diversity. These issues can in turn be traced back to the coupled training and substantial parameter sharing of the dropout models. Motivated by this perspective, we propose a strategy to compute an ensemble of subnetworks, each corresponding to a non-overlapping dropout mask computed via a pruning strategy and trained independently. We show that the proposed subnetwork ensembling method can match standard deep ensembles in both accuracy and uncertainty estimates, yet with a computational efficiency similar to that of MC dropout. Finally, on several computer vision datasets (CIFAR10/100, CUB200, and Tiny-Imagenet), we experimentally demonstrate that subnetwork ensembling also consistently outperforms recently proposed approaches for efficiently ensembling neural networks.
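The core idea above can be illustrated with a minimal sketch: partition a layer's units into disjoint (non-overlapping) masks, one per subnetwork, and average the independently produced subnetwork predictions at inference time. This is a simplification with assumed helper names (`disjoint_masks`, `ensemble_predict`); the paper derives its masks via a pruning strategy rather than the random partition used here.

```python
import random


def disjoint_masks(num_units, num_subnets, seed=0):
    """Partition unit indices into non-overlapping binary masks,
    one per subnetwork (random split; the paper uses pruning instead)."""
    rng = random.Random(seed)
    idx = list(range(num_units))
    rng.shuffle(idx)
    masks = [[0] * num_units for _ in range(num_subnets)]
    for i, unit in enumerate(idx):
        masks[i % num_subnets][unit] = 1  # each unit assigned to exactly one subnetwork
    return masks


def ensemble_predict(subnet_outputs):
    """Average the predictions of the independently trained subnetworks,
    in the spirit of a deep ensemble."""
    n = len(subnet_outputs)
    return [sum(vals) / n for vals in zip(*subnet_outputs)]


masks = disjoint_masks(num_units=8, num_subnets=4)
# Non-overlap: every unit belongs to exactly one subnetwork's mask.
assert all(sum(col) == 1 for col in zip(*masks))
```

Because the masks are disjoint, the subnetworks share no parameters, which addresses the coupled-training and parameter-sharing issues the abstract attributes to MC dropout.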