STORE: Semantic Tokenization, Orthogonal Rotation and Efficient Attention for Scaling Up Ranking Models

Ranking models have become an important part of modern personalized recommendation systems. However, significant challenges persist in handling high-cardinality, heterogeneous, and sparse feature spaces, particularly regarding model scalability and efficiency. We identify two key bottlenecks: (i) Representation Bottleneck: Driven by the high cardinality and dynamic nature of features, model capacity is forced into sparsely activated embedding layers, leading to low-rank representations. This, in turn, triggers phenomena like "One-Epoch" and "Interaction-Collapse," ultimately hindering model scalability. (ii) Computational Bottleneck: Integrating all heterogeneous features into a unified model triggers an explosion in the number of feature tokens, rendering traditional attention mechanisms computationally demanding and susceptible to attention dispersion. To dismantle these barriers, we introduce STORE, a unified and scalable token-based ranking framework built upon three core innovations: (1) Semantic Tokenization tackles feature heterogeneity and sparsity by decomposing high-cardinality sparse features into a compact set of stable semantic tokens; (2) Orthogonal Rotation Transformation rotates the subspace spanned by low-cardinality static features, which facilitates more efficient and effective feature interactions; and (3) Efficient Attention filters low-contributing tokens to improve computational efficiency while preserving model accuracy. Across extensive offline experiments and online A/B tests, our framework consistently improves prediction accuracy (online CTR by 2.71%, AUC by 1.195%) and training efficiency (1.84× throughput).
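The abstract does not say how low-contributing tokens are identified, so the following is only a minimal sketch of one common way to filter tokens in attention, not the STORE implementation. It assumes a single query, scaled dot-product scores, and a fixed budget of `keep` tokens chosen by highest score. The function name `topk_filtered_attention` and all inputs are hypothetical.

```python
import math

def topk_filtered_attention(query, keys, values, keep):
    """Single-query attention restricted to the `keep` highest-scoring tokens.

    Illustrative assumption: "low-contributing" means a low scaled
    dot-product score. The source does not specify the criterion.
    """
    if not 1 <= keep <= len(keys):
        raise ValueError("keep must be between 1 and the number of tokens")
    scale = math.sqrt(len(query))
    # Scaled dot-product score of the query against every token key.
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    # Indices of the `keep` highest-scoring tokens; the rest are dropped.
    kept = sorted(range(len(keys)), key=lambda i: scores[i], reverse=True)[:keep]
    # Softmax over the retained tokens only (max subtracted for stability).
    top = max(scores[i] for i in kept)
    exps = {i: math.exp(scores[i] - top) for i in kept}
    total = sum(exps.values())
    weights = {i: e / total for i, e in exps.items()}
    # Weighted sum of the retained value vectors.
    dim = len(values[0])
    output = [sum(weights[i] * values[i][d] for i in kept) for d in range(dim)]
    return output, sorted(kept)

# Toy example with four feature tokens of dimension 2 (made-up numbers).
query = [1.0, 0.0]
keys = [[2.0, 0.0], [0.0, 1.0], [1.5, 0.5], [-1.0, 0.0]]
values = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [9.0, 9.0]]
output, kept = topk_filtered_attention(query, keys, values, keep=2)
print(kept, output)
```

With `keep=2`, only tokens 0 and 2 take part in the softmax and the weighted sum, so the cost of those two steps scales with the number of retained tokens rather than with the full token count. Scoring every token is still linear in the token count in this sketch.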

