Deep neural networks suffer from catastrophic forgetting, where performance on previous tasks degrades after training on a new task. This issue arises from the model's tendency to overwrite previously acquired knowledge with new information. We present a novel approach to address this challenge, focusing on the intersection of memory-based methods and regularization approaches. We formulate a regularization strategy for memory-based continual learning, termed the Information Maximization (IM) regularizer, which relies exclusively on the expected label distribution and is therefore class-agnostic. As a consequence, the IM regularizer can be directly integrated into various rehearsal-based continual learning methods, reducing forgetting and promoting faster convergence. Our empirical validation shows that, across datasets and regardless of the number of tasks, the proposed regularization strategy consistently improves baseline performance with only minimal computational overhead. The lightweight nature of IM keeps it practical and scalable, making it applicable to real-world continual learning scenarios where efficiency is paramount. Finally, we demonstrate the data-agnostic nature of our regularizer by applying it to video data, which presents additional challenges due to its temporal structure and higher memory requirements. Despite the significant domain gap, our experiments show that the IM regularizer also improves the performance of video continual learning methods.
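The abstract does not spell out the exact form of the IM term, so the following is a minimal, hedged sketch of one plausible instantiation: a class-agnostic penalty that maximizes the entropy of the batch-averaged (expected) label distribution over current and replayed samples. All names (`im_regularizer`, `lambda_im`) and the precise formulation are illustrative assumptions, not the paper's confirmed implementation.

```python
# Hedged sketch of an entropy-based "Information Maximization" style term,
# assuming it maximizes the entropy of the batch-averaged predicted label
# distribution. The paper's actual formulation may differ.
import torch
import torch.nn.functional as F

def im_regularizer(logits: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Negative entropy of the mean predicted label distribution.

    logits: (batch, num_classes) raw outputs on current + replayed samples.
    Returns a scalar to be *added* to the task loss, so minimizing it
    maximizes the entropy of the expected label distribution; it never
    references individual class identities, hence class-agnostic.
    """
    probs = F.softmax(logits, dim=1)        # per-sample class probabilities
    mean_probs = probs.mean(dim=0)          # expected label distribution over the batch
    entropy = -(mean_probs * (mean_probs + eps).log()).sum()
    return -entropy                         # minimizing this maximizes the entropy

# Illustrative use inside a generic rehearsal-based training step:
# loss = F.cross_entropy(logits, targets) + lambda_im * im_regularizer(logits)
```

Because the term depends only on the averaged output distribution, it can be bolted onto any rehearsal-based method's loss without per-class bookkeeping, which is consistent with the plug-and-play integration described above.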