In streaming scenarios, models must learn continuously, adapting to concept drift without erasing previously acquired knowledge. However, existing research communities address these challenges in isolation. Continual Learning (CL) focuses on long-term retention and mitigating catastrophic forgetting, often without strict real-time constraints. Stream Learning (SL) emphasizes rapid, efficient adaptation to high-frequency data streams, but typically neglects forgetting. Recent efforts have tried to combine these paradigms, yet no clear algorithmic overlap exists. We argue that large in-context tabular models (LTMs) provide a natural bridge for Streaming Continual Learning (SCL). In our view, unbounded streams should be summarized on-the-fly into compact sketches that can be consumed by LTMs. This recovers the classical SL motivation of compressing massive streams with fixed-size guarantees, while simultaneously aligning with the experience-replay desiderata of CL. To clarify this bridge, we show how the SL and CL communities implicitly adopt a divide-and-conquer strategy to manage the tension between plasticity (performing well on the current distribution) and stability (retaining past knowledge), while also imposing a minimal complexity constraint that motivates diversification (avoiding redundancy in what is stored) and retrieval (re-prioritizing past information when needed). Within this perspective, we propose structuring SCL with LTMs around two core principles of data selection for in-context learning: (1) distribution matching, which balances plasticity and stability, and (2) distribution compression, which controls memory size through diversification and retrieval mechanisms.
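To make the stream-to-sketch-to-LTM pipeline concrete, the snippet below is a minimal illustrative sketch, not the paper's method: it uses plain reservoir sampling as a stand-in for the fixed-size stream summary (a crude form of distribution matching under a memory budget) and assumes any scikit-learn-style in-context learner with `fit`/`predict` (e.g., a TabPFN-style classifier). The names `ReservoirSketch` and `predict_with_ltm` are hypothetical.

```python
import numpy as np


class ReservoirSketch:
    """Fixed-size summary of an unbounded stream via reservoir sampling.

    Every element seen so far is retained with equal probability, so the
    buffer approximates the overall stream distribution within a fixed budget.
    """

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.rng = np.random.default_rng(seed)
        self.X, self.y = [], []
        self.seen = 0

    def update(self, x, label):
        self.seen += 1
        if len(self.X) < self.capacity:
            self.X.append(x)
            self.y.append(label)
        else:
            # Standard Algorithm R: replace a random slot with
            # probability capacity / seen.
            j = self.rng.integers(0, self.seen)
            if j < self.capacity:
                self.X[j] = x
                self.y[j] = label

    def context(self):
        return np.asarray(self.X), np.asarray(self.y)


def predict_with_ltm(model, sketch, x_query):
    """Use the sketch as in-context examples; 'fit' only stores the context."""
    X_ctx, y_ctx = sketch.context()
    model.fit(X_ctx, y_ctx)
    return model.predict(np.asarray(x_query).reshape(1, -1))
```

In this toy version, plasticity/stability and diversification/retrieval all collapse into uniform sampling; the paper's two principles would replace the `update` and `context` policies with distribution-matching and distribution-compression criteria, respectively.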