We present Space Explanations, a novel logic-based concept for classification neural networks that gives provable guarantees on the behavior of a network over continuous regions of the input feature space. To generate space explanations automatically, we leverage a range of flexible Craig interpolation algorithms together with unsatisfiable-core generation. In real-life case studies ranging from small to large networks, we demonstrate that the generated explanations are more meaningful than those computed by the state of the art.