In streaming scenarios, models must learn continuously, adapting to concept drift without erasing previously acquired knowledge. However, existing research communities address these challenges in isolation. Continual Learning (CL) focuses on long-term retention and mitigating catastrophic forgetting, often without strict real-time constraints. Stream Learning (SL) emphasizes rapid, efficient adaptation to high-frequency data streams, but typically neglects forgetting. Recent efforts have tried to combine these paradigms, yet no clear algorithmic overlap exists. We argue that large in-context tabular models (LTMs) provide a natural bridge for Streaming Continual Learning (SCL). In our view, unbounded streams should be summarized on-the-fly into compact sketches that can be consumed by LTMs. This recovers the classical SL motivation of compressing massive streams with fixed-size guarantees, while simultaneously aligning with the experience-replay desiderata of CL. To clarify this bridge, we show how the SL and CL communities implicitly adopt a divide-and-conquer strategy to manage the tension between plasticity (performing well on the current distribution) and stability (retaining past knowledge), while also imposing a minimal complexity constraint that motivates diversification (avoiding redundancy in what is stored) and retrieval (re-prioritizing past information when needed). Within this perspective, we propose structuring SCL with LTMs around two core principles of data selection for in-context learning: (1) distribution matching, which balances plasticity and stability, and (2) distribution compression, which controls memory size through diversification and retrieval mechanisms.
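To make the stream-to-sketch-to-LTM pipeline concrete, the snippet below is a minimal illustrative sketch, not the paper's method: it uses plain reservoir sampling as a stand-in for the fixed-size stream summary (a crude form of distribution matching under a memory budget) and assumes any scikit-learn-style in-context learner with `fit`/`predict` (e.g., a TabPFN-style classifier). The names `ReservoirSketch` and `predict_with_ltm` are hypothetical.

```python
import numpy as np


class ReservoirSketch:
    """Fixed-size summary of an unbounded stream via reservoir sampling.

    Every element seen so far is retained with equal probability, so the
    buffer approximates the overall stream distribution within a fixed budget.
    """

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.rng = np.random.default_rng(seed)
        self.X, self.y = [], []
        self.seen = 0

    def update(self, x, label):
        self.seen += 1
        if len(self.X) < self.capacity:
            self.X.append(x)
            self.y.append(label)
        else:
            # Standard Algorithm R: replace a random slot with
            # probability capacity / seen.
            j = self.rng.integers(0, self.seen)
            if j < self.capacity:
                self.X[j] = x
                self.y[j] = label

    def context(self):
        return np.asarray(self.X), np.asarray(self.y)


def predict_with_ltm(model, sketch, x_query):
    """Use the sketch as in-context examples; 'fit' only stores the context."""
    X_ctx, y_ctx = sketch.context()
    model.fit(X_ctx, y_ctx)
    return model.predict(np.asarray(x_query).reshape(1, -1))
```

In this toy version, plasticity/stability and diversification/retrieval all collapse into uniform sampling; the paper's two principles would replace the `update` and `context` policies with distribution-matching and distribution-compression criteria, respectively.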