Recommendation is crucial to both user experience and company revenue at Meituan, a leading lifestyle company, and generative recommendation models (GRMs) have recently been shown to produce high-quality recommendations. However, existing systems offer insufficient functionality and inefficient implementations for training GRMs in industrial scenarios. We therefore introduce MTGenRec, an efficient and scalable system for GRM training. Specifically, to handle real-time insertions and deletions of sparse embeddings, MTGenRec replaces static tables with dynamic hash tables. To improve training efficiency, MTGenRec applies dynamic sequence balancing to address computation load imbalance among GPUs, and adopts feature ID deduplication together with automatic table merging to accelerate embedding lookup. Extensive experiments show that MTGenRec improves training throughput by $1.6\times$ to $2.4\times$ while scaling well when running on over 100 GPUs. MTGenRec has been deployed for many applications at Meituan and now handles hundreds of millions of requests daily. On the delivery platform, we observe a 1.22% growth in user order volume and a 1.31% improvement in online PV_CTR.
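To make two of the ideas named in the abstract concrete, the following is a minimal, illustrative sketch in plain Python, not the MTGenRec implementation. It assumes that feature ID deduplication means looking up each distinct ID once and scattering the rows back to their original positions, that a dynamic table creates a row for an ID on first use, and that sequence balancing can be approximated by a greedy longest-first assignment with a quadratic cost per sequence. The function names `dedup_lookup` and `balance_sequences`, the zero initialization, and the cost model are all hypothetical choices made for this example.

```python
import heapq


def dedup_lookup(ids, table, dim):
    """Look up each distinct ID once, then scatter rows back to the input order.

    `table` is a plain dict standing in for a dynamic hash table (assumption):
    an ID that has never been seen gets a fresh zero row on first access.
    """
    unique_ids = list(dict.fromkeys(ids))  # distinct IDs in first-seen order
    rows = {i: table.setdefault(i, [0.0] * dim) for i in unique_ids}
    # One table access per distinct ID; repeated IDs share the same row object.
    return [rows[i] for i in ids], len(unique_ids)


def balance_sequences(lengths, num_gpus, cost=lambda n: n * n):
    """Greedily assign sequences (longest first) to the least-loaded GPU.

    The quadratic default cost is an assumed proxy for attention compute.
    Returns one list of sequence indices per GPU.
    """
    heap = [(0, g) for g in range(num_gpus)]  # (accumulated cost, GPU index)
    heapq.heapify(heap)
    assignment = [[] for _ in range(num_gpus)]
    for idx in sorted(range(len(lengths)), key=lambda i: -lengths[i]):
        load, g = heapq.heappop(heap)
        assignment[g].append(idx)
        heapq.heappush(heap, (load + cost(lengths[idx]), g))
    return assignment


if __name__ == "__main__":
    table = {}
    rows, n_unique = dedup_lookup([7, 3, 7, 7, 3], table, dim=2)
    print(n_unique, len(rows))
    print(balance_sequences([8, 1, 1, 1, 7, 2], num_gpus=2))
```

Balancing by estimated cost rather than by sequence count matters when user histories vary widely in length, since a single long sequence can dominate the step time of the GPU that receives it.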