PVeRA: Probabilistic Vector-Based Random Matrix Adaptation

Large foundation models have emerged in recent years and are pushing performance boundaries on a variety of tasks. Training or even finetuning such models demands vast datasets and computational resources, which are often scarce and costly. Adaptation methods address these limitations in a computationally efficient way by allowing such models to be finetuned with small amounts of data and computing power. They do so by appending new trainable modules, containing only a small fraction of the parameters, to a frozen backbone and fitting only these modules on novel tasks. Recently, the VeRA adapter was shown to excel at parameter-efficient adaptation by using a pair of frozen random low-rank matrices shared across all layers. In this paper, we propose PVeRA, a probabilistic version of the VeRA adapter, which modifies the low-rank matrices of VeRA in a probabilistic manner. This modification naturally handles inherent ambiguities in the input and allows different sampling configurations to be used during training and testing. We perform a comprehensive evaluation on the VTAB-1k benchmark covering seven adapters, in which PVeRA outperforms VeRA and the other adapters. Our code for training models with PVeRA and benchmarking all adapters is available at https://github.com/leofillioux/pvera.
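To make the adapter structure concrete, the following is a minimal sketch in plain Python, not the authors' implementation. The deterministic path follows the VeRA formulation, h = W0 x + b * (B (d * (A x))), where A and B are frozen random matrices shared by all layers and d and b are small trainable scaling vectors. The probabilistic path is an assumption made for illustration only: the abstract does not say how PVeRA samples, so the sketch simply adds reparameterized Gaussian noise to the low-rank code. The names `VeRALayer`, `log_sigma`, `matvec`, and `random_matrix` are hypothetical.

```python
import math
import random


def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def random_matrix(rows, cols, rng):
    """Gaussian random matrix; it is never updated after creation."""
    scale = 1.0 / math.sqrt(cols)
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


class VeRALayer:
    """One adapted linear layer: h = W0 x + b * (B (d * (A x)))."""

    def __init__(self, W0, A, B, d_init=0.1):
        # Frozen: the pretrained weight W0 and the shared random pair (A, B).
        self.W0, self.A, self.B = W0, A, B
        # Trainable per-layer scaling vectors. b starts at zero, so the
        # adapted layer initially reproduces the frozen layer exactly.
        self.d = [d_init] * len(A)   # length r (the rank)
        self.b = [0.0] * len(B)      # length of the output
        # ASSUMPTION (not from the abstract): a trainable log standard
        # deviation per low-rank coordinate for the probabilistic variant.
        self.log_sigma = [-2.0] * len(A)

    def forward(self, x, rng=None):
        # Low-rank code of the input, scaled by d.
        mu = [di * zi for di, zi in zip(self.d, matvec(self.A, x))]
        if rng is None:
            # Deterministic configuration: use the mean code.
            z = mu
        else:
            # Sampling configuration (assumed): reparameterized Gaussian
            # draw around the mean code.
            z = [m + math.exp(s) * rng.gauss(0.0, 1.0)
                 for m, s in zip(mu, self.log_sigma)]
        delta = [bi * ui for bi, ui in zip(self.b, matvec(self.B, z))]
        return [h + dl for h, dl in zip(matvec(self.W0, x), delta)]


# Usage: three layers share one (A, B) pair; only d, b (and log_sigma in
# the assumed probabilistic variant) would be trained per layer.
rng = random.Random(0)
dim, rank = 4, 2
A = random_matrix(rank, dim, rng)
B = random_matrix(dim, rank, rng)
layers = [VeRALayer(random_matrix(dim, dim, rng), A, B) for _ in range(3)]
x = [1.0, -2.0, 0.5, 3.0]
train_like = layers[0].forward(x, rng=random.Random(1))  # sampled pass
test_like = layers[0].forward(x)                         # mean pass
```

Passing `rng` or leaving it out is one simple way to express different sampling configurations at training and test time; the released code may expose this differently.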
