Online Learning for the Random Feature Model in the Student-Teacher Framework

Roman Worschech, Bernd Rosenow

Deep neural networks are widely used prediction algorithms whose performance often improves as the number of weights increases, leading to over-parametrization. We consider a two-layer neural network whose first layer is frozen while the last layer is trainable, known as the random feature model. We study over-parametrization in a student-teacher framework by deriving a set of differential equations for the learning dynamics. For any finite ratio of hidden layer size to input dimension, the student cannot generalize perfectly, and we compute the non-zero asymptotic generalization error. An approach to perfect generalization is possible only when the student's hidden layer size is exponentially larger than the input dimension.
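
The setup described in the abstract can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the tanh activation, the single-unit teacher, the Gaussian inputs, the 1/sqrt(N) scaling, the step size, and all function names are assumptions chosen for illustration. The sketch only shows the structure: a frozen random first layer, a trainable readout, and online stochastic gradient descent in which every step uses a fresh example labeled by the teacher.

```python
import math
import random


def hidden_activations(features, x):
    """Frozen first layer: tanh of random projections, scaled by 1/sqrt(N)."""
    scale = 1.0 / math.sqrt(len(x))
    return [math.tanh(scale * sum(f * xi for f, xi in zip(row, x))) for row in features]


def teacher_output(teacher, x):
    """Assumed teacher: a single tanh unit acting on the same inputs."""
    scale = 1.0 / math.sqrt(len(x))
    return math.tanh(scale * sum(w * xi for w, xi in zip(teacher, x)))


def generalization_error(features, readout, teacher, rng, n_test=500):
    """Monte Carlo estimate of 0.5 * E[(student - teacher)^2] on fresh inputs."""
    total = 0.0
    for _ in range(n_test):
        x = [rng.gauss(0.0, 1.0) for _ in range(len(teacher))]
        h = hidden_activations(features, x)
        student = sum(a * hk for a, hk in zip(readout, h))
        total += 0.5 * (student - teacher_output(teacher, x)) ** 2
    return total / n_test


def run_online_learning(n_in=10, n_hidden=50, steps=3000, seed=0):
    """Train only the readout with online SGD; return (initial, final) error."""
    rng = random.Random(seed)
    teacher = [rng.gauss(0.0, 1.0) for _ in range(n_in)]
    # The first layer is drawn once and never updated (random features).
    features = [[rng.gauss(0.0, 1.0) for _ in range(n_in)] for _ in range(n_hidden)]
    readout = [0.0] * n_hidden   # the only trainable weights
    lr = 0.5 / n_hidden          # assumed step size, small enough to stay stable
    initial = generalization_error(features, readout, teacher, rng)
    for _ in range(steps):
        x = [rng.gauss(0.0, 1.0) for _ in range(n_in)]  # fresh example each step
        h = hidden_activations(features, x)
        err = sum(a * hk for a, hk in zip(readout, h)) - teacher_output(teacher, x)
        # Gradient step on 0.5 * err^2 with respect to the readout weights.
        readout = [a - lr * err * hk for a, hk in zip(readout, h)]
    final = generalization_error(features, readout, teacher, rng)
    return initial, final


if __name__ == "__main__":
    initial, final = run_online_learning()
    print(f"generalization error: {initial:.4f} -> {final:.4f}")
```

Varying `n_hidden` at fixed `n_in` in this sketch is one way to explore numerically how the residual error depends on the ratio of hidden layer size to input dimension, although a toy simulation at this scale should not be expected to reproduce the asymptotic results stated in the abstract.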
