Gait patterns play a critical role in human identification and healthcare analytics, yet current progress remains constrained by small, narrowly designed models that fail to scale or generalize. Building a unified gait foundation model requires addressing two longstanding barriers: (a) Scalability. Why have gait models historically failed to follow scaling laws? (b) Generalization. Can one model serve the diverse gait tasks that have traditionally been studied in isolation? We introduce FoundationGait, the first scalable, self-supervised pretraining framework for gait understanding. Its largest version has nearly 0.13 billion parameters and is pretrained on 12 public gait datasets comprising over 2 million walking sequences. Extensive experiments demonstrate that FoundationGait, with or without fine-tuning, performs robustly across a wide spectrum of gait datasets, conditions, tasks (e.g., human identification, scoliosis screening, depression prediction, and attribute estimation), and even input modalities. Notably, it achieves 48.0% zero-shot rank-1 accuracy on the challenging in-the-wild Gait3D dataset (1,000 test subjects) and 64.5% on the largest in-the-lab OU-MVLP dataset (5,000+ test subjects), setting a new milestone in robust gait recognition. Code and models will be released at: https://github.com/ShiqiYu/OpenGait.