April 4, 2023

A simple approach for quantizing neural networks

Johannes Maly, Rayan Saab

In this short note, we propose a new method for quantizing the weights of a fully trained neural network. A simple deterministic pre-processing step allows us to quantize network layers via memoryless scalar quantization while preserving the network performance on given training data. On one hand, the computational complexity of this pre-processing slightly exceeds that of state-of-the-art algorithms in the literature. On the other hand, our approach does not require any hyper-parameter tuning and, in contrast to previous methods, allows a plain analysis. We provide rigorous theoretical guarantees in the case of quantizing single network layers and show that the relative error decays with the number of parameters in the network if the training data behaves well, e.g., if it is sampled from suitable random distributions. The developed method also readily allows the quantization of deep networks by consecutive application to single layers.
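To make the term concrete, the following is a minimal sketch of memoryless scalar quantization (MSQ) applied to a single weight vector, together with the relative error on a set of data points. It is not the authors' method: the deterministic pre-processing step is not described in this abstract and is not reproduced here. The function names `msq` and `relative_error`, the alphabet (a uniform grid with step 0.25 and 4 levels per side), and the Gaussian data are all illustrative assumptions.

```python
import math
import random


def msq(weights, step, levels):
    """Memoryless scalar quantization (illustrative): round each weight
    independently to the nearest point of the finite alphabet
    {-levels*step, ..., -step, 0, step, ..., levels*step}."""
    quantized = []
    for w in weights:
        k = round(w / step)                 # index of the nearest grid point
        k = max(-levels, min(levels, k))    # clip to the finite alphabet
        quantized.append(k * step)
    return quantized


def relative_error(data, weights, quantized):
    """||Xw - Xq||_2 / ||Xw||_2, where the rows of X are the data samples."""
    num = 0.0
    den = 0.0
    for x in data:
        y = sum(xi * wi for xi, wi in zip(x, weights))
        y_q = sum(xi * qi for xi, qi in zip(x, quantized))
        num += (y - y_q) ** 2
        den += y ** 2
    return math.sqrt(num / den)


# Hypothetical toy setup: one neuron with 64 weights and 32 Gaussian samples.
rng = random.Random(0)
n_features, n_samples = 64, 32
w = [rng.uniform(-1.0, 1.0) for _ in range(n_features)]
X = [[rng.gauss(0.0, 1.0) for _ in range(n_features)] for _ in range(n_samples)]

q = msq(w, step=0.25, levels=4)
print("relative error on the data:", relative_error(X, w, q))
```

Each weight is rounded without reference to any other weight or to the data, which is what "memoryless" refers to. Per the abstract, the proposed pre-processing is what allows such a simple quantizer to preserve the network's outputs on the training data.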
