Recent advances in foundation models for tabular data, such as TabPFN, have demonstrated that pretrained Transformer architectures can approximate Bayesian inference with high predictive performance. However, Transformers suffer from quadratic complexity with respect to sequence length, motivating the exploration of more efficient sequence models. In this work, we investigate the potential of Hydra, a bidirectional linear-time structured state space model (SSM), as an alternative to the Transformer in TabPFN. A key challenge lies in the inherent sensitivity of SSMs to the order of input tokens, an undesirable property for tabular datasets in which the row order carries no semantic meaning. We examine to what extent a bidirectional approach can preserve efficiency while enabling symmetric context aggregation. Our experiments show that this approach reduces order dependence and achieves predictive performance competitive with the original TabPFN model.
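To make the order-sensitivity issue concrete, the toy sketch below (a minimal scalar-decay linear recurrence chosen for illustration, not Hydra's actual quasiseparable parameterization or the paper's implementation) contrasts a unidirectional scan, whose summary state changes when the rows are reversed, with a forward-plus-backward combination whose pooled representation is reversal-invariant:

```python
import torch

def causal_scan(x, a=0.9):
    """Unidirectional linear SSM recurrence h_t = a * h_{t-1} + x_t.
    Each token's contribution is weighted by its distance from the end,
    so permuting the rows generically changes every hidden state."""
    h = torch.zeros(x.shape[1])
    states = []
    for x_t in x:                          # x: (seq_len, d)
        h = a * h + x_t
        states.append(h)
    return torch.stack(states)

def bidirectional_scan(x, a=0.9):
    """Toy bidirectional combination: a forward pass plus a backward pass,
    so every position aggregates context from both directions instead of
    privileging one end of the sequence."""
    return causal_scan(x, a) + causal_scan(x.flip(0), a).flip(0)

# Reverse the rows and compare summary representations.
x = torch.randn(8, 4)
uni_gap = (causal_scan(x)[-1] - causal_scan(x.flip(0))[-1]).abs().max().item()
bi_gap = (bidirectional_scan(x).mean(0)
          - bidirectional_scan(x.flip(0)).mean(0)).abs().max().item()
print(f"unidirectional gap under reversal: {uni_gap:.4f}")  # nonzero
print(f"bidirectional  gap under reversal: {bi_gap:.4f}")   # ~0 by symmetry
```

Note that bidirectionality only yields symmetry under sequence reversal, not under arbitrary row permutations, which is consistent with the finding that order dependence is reduced rather than eliminated.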