Learning from What is Already Out There: Few-shot Sign Language Recognition with Online Dictionaries

Today's sign language recognition models require large training corpora of laboratory-like videos, whose collection demands an extensive workforce and substantial financial resources. As a result, only a handful of such systems are publicly available, and their localization capabilities for less-populated sign languages are limited. Online text-to-video dictionaries inherently hold annotated data covering various attributes and sign languages, so using them to train models in a few-shot fashion offers a promising path toward the democratization of this technology. In this work, we collect and open-source the UWB-SL-Wild few-shot dataset, the first training resource of its kind consisting of dictionary-scraped videos. This dataset represents the actual distribution and characteristics of available online sign language data. We select glosses that directly overlap with the existing datasets WLASL100 and ASLLVD and share their class mappings to allow for transfer learning experiments. Apart from providing baseline results on a pose-based architecture, we introduce a novel approach to training sign language recognition models in a few-shot scenario, resulting in state-of-the-art results on the ASLLVD-Skeleton and ASLLVD-Skeleton-20 datasets with top-1 accuracy of $30.97~\%$ and $95.45~\%$, respectively.


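Related content

Few-shot learning

Few-shot learning (FSL) addresses how to improve a neural network's performance when only a small amount of data is available. A commonly used family of FSL methods is meta-learning. Like ordinary neural network training, meta-learning has a training phase and a testing phase, but these are called meta-training and meta-testing, respectively.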
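The split between meta-training and meta-testing can be illustrated with a small sketch. It assumes the common episodic setup, which the text above does not describe: each phase repeatedly draws N-way K-shot "episodes" (a small support set and a query set), and the two phases draw from disjoint sets of classes. The function names `group_by_class` and `sample_episode`, the toy labels, and the class split are all hypothetical and chosen only for illustration.

```python
import random
from collections import defaultdict


def group_by_class(labels):
    """Map each class label to the list of sample indices carrying it."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    return by_class


def sample_episode(by_class, classes, n_way, k_shot, n_query, rng):
    """Draw one N-way K-shot episode from the given classes.

    Returns (support, query): two lists of sample indices that do not overlap.
    """
    # Only classes with enough samples can supply both support and query items.
    eligible = [c for c in sorted(classes) if len(by_class[c]) >= k_shot + n_query]
    if len(eligible) < n_way:
        raise ValueError("not enough classes with k_shot + n_query samples")
    support, query = [], []
    for c in rng.sample(eligible, n_way):
        picked = rng.sample(by_class[c], k_shot + n_query)
        support.extend(picked[:k_shot])
        query.extend(picked[k_shot:])
    return support, query


# Toy data (assumption): 6 classes with 5 samples each.
labels = [c for c in range(6) for _ in range(5)]
by_class = group_by_class(labels)

# Meta-training and meta-testing use disjoint class sets (assumed split).
meta_train_classes = [0, 1, 2, 3]
meta_test_classes = [4, 5]

rng = random.Random(0)
train_support, train_query = sample_episode(
    by_class, meta_train_classes, n_way=3, k_shot=2, n_query=1, rng=rng
)
test_support, test_query = sample_episode(
    by_class, meta_test_classes, n_way=2, k_shot=2, n_query=1, rng=rng
)
```

In this sketch a model would be fitted on many episodes drawn from `meta_train_classes` and then evaluated on episodes drawn from `meta_test_classes`, whose classes it has never seen during meta-training.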
