Silhouette-based Gait Foundation Model

Gait patterns play a critical role in human identification and healthcare analytics, yet current progress remains constrained by small, narrowly designed models that fail to scale or generalize. Building a unified gait foundation model requires addressing two longstanding barriers: (a) Scalability. Why have gait models historically failed to follow scaling laws? (b) Generalization. Can one model serve the diverse gait tasks that have traditionally been studied in isolation? We introduce FoundationGait, the first scalable, self-supervised pretraining framework for gait understanding. Its largest version has nearly 0.13 billion parameters and is pretrained on 12 public gait datasets comprising over 2 million walking sequences. Extensive experiments demonstrate that FoundationGait, with or without fine-tuning, performs robustly across a wide spectrum of gait datasets, conditions, tasks (e.g., human identification, scoliosis screening, depression prediction, and attribute estimation), and even input modalities. Notably, it achieves 48.0% zero-shot rank-1 accuracy on the challenging in-the-wild Gait3D dataset (1,000 test subjects) and 64.5% on the largest in-the-lab OU-MVLP dataset (5,000+ test subjects), setting a new milestone in robust gait recognition. Code and models will be released at https://github.com/ShiqiYu/OpenGait.
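For readers unfamiliar with the rank-1 metric quoted above, the following is a minimal sketch of how rank-1 accuracy is commonly computed from embeddings: each probe sequence is matched to its nearest gallery sequence, and a match counts as a hit when the identities agree. The function names, the toy two-dimensional embeddings, and the choice of cosine similarity are illustrative assumptions, not the FoundationGait evaluation code; the official Gait3D and OU-MVLP protocols define their own probe/gallery splits and distance measures.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank1_accuracy(probes, gallery):
    """Fraction of probes whose most similar gallery entry has the same identity.

    probes, gallery: lists of (subject_id, embedding) pairs (hypothetical format).
    """
    hits = 0
    for probe_id, probe_vec in probes:
        # Pick the gallery entry with the highest similarity to this probe.
        best_id, _ = max(
            gallery, key=lambda item: cosine_similarity(probe_vec, item[1])
        )
        hits += best_id == probe_id
    return hits / len(probes)


# Toy data (made up for illustration): two enrolled subjects, three probes.
gallery = [("A", [1.0, 0.0]), ("B", [0.0, 1.0])]
probes = [("A", [0.9, 0.1]), ("B", [0.2, 0.8]), ("B", [0.7, 0.3])]
accuracy = rank1_accuracy(probes, gallery)
```

In this toy setup the third probe lies closer to subject A than to its true identity B, so two of the three probes are matched correctly. In the zero-shot setting described in the abstract, the embeddings would come from the pretrained model without fine-tuning on the target dataset.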

