The approximation and learning of classifiers of large data sets by neural networks are investigated in terms of high-dimensional geometry and statistical learning theory. The influence of the VC dimension of sets of network input-output functions on approximation capabilities is compared with its influence on consistency in learning from samples of data. It is shown that, whereas finite VC dimension is desirable for uniform convergence of empirical errors, it may not be desirable for the approximation of functions drawn from a probability distribution modeling the likelihood that they occur in a given type of application. Based on the concentration-of-measure properties of high-dimensional geometry, it is proven that both approximation errors and empirical errors behave almost deterministically for networks implementing sets of input-output functions with finite VC dimension when processing large data sets. Practical limitations of the universal approximation property, the trade-offs between the accuracy of approximation and consistency in learning from data, and the influence of the depth of networks with ReLU units on their accuracy and consistency are discussed.
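The concentration phenomenon invoked above can be illustrated numerically. The following minimal Python sketch (not from the paper; it assumes only numpy and an arbitrary fixed classifier and noisy labeling rule chosen for illustration) shows that the empirical 0-1 error of a single fixed classifier on a random sample of size n fluctuates around its expectation with a standard deviation shrinking roughly like 1/sqrt(n); the paper's results concern such near-deterministic behavior holding uniformly over function classes of finite VC dimension.

```python
import numpy as np

rng = np.random.default_rng(0)

# Fixed illustrative "classifier": predicts the sign of the first coordinate.
def classifier(x):
    return np.sign(x[:, 0])

# Hypothetical data model: labels are the sign of a noisy copy of the first
# coordinate, so the fixed classifier has some constant true error probability.
def sample_data(n, d=20, noise=0.3):
    x = rng.standard_normal((n, d))
    y = np.sign(x[:, 0] + noise * rng.standard_normal(n))
    return x, y

# Empirical 0-1 error of the fixed classifier on one random sample of size n.
def empirical_error(n):
    x, y = sample_data(n)
    return np.mean(classifier(x) != y)

# Concentration: as n grows, the empirical error varies less and less across
# independent samples; its std shrinks roughly like 1/sqrt(n).
for n in [100, 1_000, 10_000, 100_000]:
    errs = np.array([empirical_error(n) for _ in range(200)])
    print(f"n={n:>6}: mean={errs.mean():.4f}, std={errs.std():.5f}")
```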