Generalization in Representation Models via Random Matrix Theory: Application to Recurrent Networks

We first study the generalization error of models that use a fixed feature representation (frozen intermediate layers) followed by a trainable readout layer. This setting encompasses a range of architectures, from deep random-feature models to echo-state networks (ESNs) with recurrent dynamics. Working in the high-dimensional regime, we apply Random Matrix Theory to derive a closed-form expression for the asymptotic generalization error. We then apply this analysis to recurrent representations and obtain concise formulas that characterize their performance. Surprisingly, we show that a linear ESN is equivalent to ridge regression with an exponentially time-weighted ("memory") input covariance, revealing a clear inductive bias toward recent inputs. Experiments match the predictions: ESNs win in low-sample, short-memory regimes, while ridge regression prevails with more data or long-range dependencies. Our methodology provides a general framework for analyzing overparameterized models and offers insights into the behavior of deep learning networks.

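The bias toward recent inputs mentioned in the abstract can be illustrated with a minimal NumPy sketch. It is not the paper's derivation or code: the state update `x_t = W x_{t-1} + w_in * u_t`, the spectral radius of 0.9, the toy target, and all function and variable names are assumptions chosen for illustration. The sketch only checks the standard fact that a linear reservoir state unrolls into a sum of past inputs weighted by powers of `W`, so that older inputs contribute with geometrically decaying weight, and then fits a ridge readout on the frozen states.

```python
import numpy as np

def run_linear_esn(u, W, w_in):
    """Iterate x_t = W x_{t-1} + w_in * u_t from x = 0; return states of shape (T, n)."""
    x = np.zeros(W.shape[0])
    states = np.empty((len(u), W.shape[0]))
    for t, u_t in enumerate(u):
        x = W @ x + w_in * u_t
        states[t] = x
    return states

def unrolled_state(u, W, w_in, t):
    """Same state written as the sum over lags k of (W^k w_in) * u_{t-k}."""
    x = np.zeros(W.shape[0])
    v = w_in.copy()  # holds W^k w_in for the current lag k
    for k in range(t + 1):
        x += v * u[t - k]
        v = W @ v
    return x

def ridge_readout(states, targets, lam):
    """Train only the readout: solve (S^T S + lam I) w = S^T y."""
    n = states.shape[1]
    return np.linalg.solve(states.T @ states + lam * np.eye(n), states.T @ targets)

rng = np.random.default_rng(0)
n, T = 20, 200  # reservoir size and sequence length (illustrative values)
W = rng.standard_normal((n, n))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))  # rescale to spectral radius 0.9
w_in = rng.standard_normal(n)
u = rng.standard_normal(T)

states = run_linear_esn(u, W, w_in)

# The recursion and the unrolled sum should agree up to rounding error.
max_gap = np.max(np.abs(states[-1] - unrolled_state(u, W, w_in, T - 1)))

# Weight carried by an input that lies k steps in the past: ||W^k w_in||.
lag_norms = [np.linalg.norm(np.linalg.matrix_power(W, k) @ w_in) for k in (0, 10, 50)]

# Toy target (an assumption): reproduce the previous input from the current state.
y = np.roll(u, 1)
y[0] = 0.0
w_out = ridge_readout(states, y, lam=1e-2)

print("max gap between recursion and unrolled sum:", max_gap)
print("lag weights at k = 0, 10, 50:", lag_norms)
```

Printing `lag_norms` shows how quickly the influence of older inputs fades for this choice of `W`; pushing the spectral radius closer to 1 lengthens the effective memory, at the price of a weaker preference for recent inputs.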