Limitations of neural network training due to numerical instability of backpropagation

We study the training of deep neural networks by gradient descent where floating-point arithmetic is used to compute the gradients. In this framework and under realistic assumptions, we demonstrate that it is highly unlikely to find ReLU neural networks that maintain, in the course of training with gradient descent, superlinearly many affine pieces with respect to their number of layers. In virtually all approximation-theoretic arguments that yield high-order polynomial rates of approximation, sequences of ReLU neural networks with exponentially many affine pieces relative to their number of layers are used. As a consequence, we conclude that the approximating sequences of ReLU neural networks that gradient descent produces in practice differ substantially from theoretically constructed sequences. The assumptions and the theoretical results are compared with a numerical study, which yields concurring results.
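To make "exponentially many affine pieces compared to the number of layers" concrete, the following is a minimal sketch of the well-known sawtooth construction, in which a two-unit ReLU layer computing a hat function is composed with itself. It is an illustration only, not the paper's experiment or method; the function names (`hat`, `sawtooth`, `count_affine_pieces`) and the grid resolution are assumptions made for this example.

```python
import numpy as np


def relu(x):
    return np.maximum(x, 0.0)


def hat(x):
    # One ReLU layer with two units. On [0, 1] this equals
    # 2x for x <= 0.5 and 2 - 2x for x >= 0.5.
    return 2.0 * relu(x) - 4.0 * relu(x - 0.5)


def sawtooth(x, depth):
    # Compose the hat layer `depth` times (a network with `depth` layers).
    for _ in range(depth):
        x = hat(x)
    return x


def count_affine_pieces(depth):
    # Sample on a dyadic grid that contains every breakpoint of the
    # composed function (breakpoints are multiples of 2**-depth).
    n = 2 ** (depth + 3)
    x = np.linspace(0.0, 1.0, n + 1)
    y = sawtooth(x, depth)
    slopes = np.diff(y) / np.diff(x)
    # A new affine piece starts wherever the slope changes.
    changes = np.count_nonzero(~np.isclose(slopes[1:], slopes[:-1]))
    return changes + 1


if __name__ == "__main__":
    for depth in range(1, 7):
        print(depth, count_affine_pieces(depth))
```

With these hand-picked weights the number of pieces on [0, 1] doubles with each added layer. The abstract's claim is that networks obtained by gradient descent with floating-point gradients are highly unlikely to maintain such superlinear growth in the number of layers.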

