Recent advancements in foundation models for tabular data, such as TabPFN, demonstrated that pretrained Transformer architectures can approximate Bayesian inference with high predictive performance. However, Transformers suffer from quadratic complexity with respect to sequence length, motivating the exploration of more efficient sequence models. In this work, we investigate the potential of using Hydra, a bidirectional linear-time structured state space model (SSM), as an alternative to Transformers in TabPFN. A key challenge lies in SSM's inherent sensitivity to the order of input tokens - an undesirable property for tabular datasets where the row order is semantically meaningless. We investigate to what extent a bidirectional approach can preserve efficiency and enable symmetric context aggregation. Our experiments show that this approach reduces the order-dependence, achieving predictive performance competitive to the original TabPFN model.


翻译:近期在表格数据基础模型方面的进展,例如TabPFN,表明预训练的Transformer架构能够以较高的预测性能近似贝叶斯推断。然而,Transformer存在随序列长度呈二次方增长的复杂度问题,这促使了对更高效序列模型的探索。在本研究中,我们探讨了使用Hydra(一种双向线性时间结构化状态空间模型,SSM)作为TabPFN中Transformer替代方案的潜力。一个关键挑战在于SSM对输入标记顺序的固有敏感性——这对于行顺序在语义上无意义的表格数据集来说是一个不理想的特性。我们研究了双向方法在多大程度上能够保持效率并实现对称上下文聚合。实验结果表明,该方法减少了顺序依赖性,达到了与原始TabPFN模型相竞争的预测性能。

0
下载
关闭预览

相关内容

ACM/IEEE第23届模型驱动工程语言和系统国际会议,是模型驱动软件和系统工程的首要会议系列,由ACM-SIGSOFT和IEEE-TCSE支持组织。自1998年以来,模型涵盖了建模的各个方面,从语言和方法到工具和应用程序。模特的参加者来自不同的背景,包括研究人员、学者、工程师和工业专业人士。MODELS 2019是一个论坛,参与者可以围绕建模和模型驱动的软件和系统交流前沿研究成果和创新实践经验。今年的版本将为建模社区提供进一步推进建模基础的机会,并在网络物理系统、嵌入式系统、社会技术系统、云计算、大数据、机器学习、安全、开源等新兴领域提出建模的创新应用以及可持续性。 官网链接:http://www.modelsconference.org/
Top
微信扫码咨询专知VIP会员