On the Importance of Signer Overlap for Sign Language Detection

Abhilash Pal, Stephan Huber, Cyrine Chaabani, Alessandro Manzotti, Oscar Koller

Sign language detection, i.e. identifying whether someone is signing, is becoming crucially important for remote conferencing software and for selecting useful sign data to train models for sign language recognition or translation. We argue that the current benchmark data sets for sign language detection yield overly optimistic results that do not generalize well, because signers overlap between the train and test partitions. We quantify this with a detailed analysis of the effect of signer overlap on current sign detection benchmark data sets. Comparing accuracy with and without overlap on the DGS corpus and Signing in the Wild, we observe relative decreases in accuracy of 4.17% and 6.27%, respectively. Furthermore, we propose new data set partitions that are free of overlap and allow for more realistic performance assessment. We hope this work will contribute to improving the accuracy and generalization of sign language detection systems.
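As an illustration of what an overlap-free partition means in practice, the following is a minimal Python sketch and not the authors' partitioning procedure. It assumes each sample carries a signer identifier; the names `signer_disjoint_split` and `relative_decrease`, the `(sample_id, signer_id)` input format, and the `test_fraction` parameter are all hypothetical. The relative decrease is computed with the usual definition (accuracy drop divided by the accuracy with overlap), which is assumed here rather than quoted from the paper.

```python
import random
from collections import defaultdict


def signer_disjoint_split(samples, test_fraction=0.2, seed=0):
    """Split samples so that no signer appears in both train and test.

    samples: iterable of (sample_id, signer_id) pairs (hypothetical format).
    Returns (train_ids, test_ids, test_signers).
    """
    # Group sample ids by the signer who produced them.
    by_signer = defaultdict(list)
    for sample_id, signer_id in samples:
        by_signer[signer_id].append(sample_id)

    # Shuffle signers (not samples) so that whole signers move together.
    signers = sorted(by_signer)
    random.Random(seed).shuffle(signers)

    # Hold out at least one signer for testing.
    n_test = max(1, round(len(signers) * test_fraction))
    test_signers = set(signers[:n_test])

    train_ids = [s for sg in signers if sg not in test_signers
                 for s in by_signer[sg]]
    test_ids = [s for sg in signers if sg in test_signers
                for s in by_signer[sg]]
    return train_ids, test_ids, test_signers


def relative_decrease(acc_with_overlap, acc_without_overlap):
    """Relative accuracy decrease in percent (assumed definition)."""
    return 100.0 * (acc_with_overlap - acc_without_overlap) / acc_with_overlap
```

Splitting at the signer level rather than the sample level is what removes the overlap: a random per-sample split would almost always place clips of the same signer on both sides. Note that the sketch needs at least two distinct signers to produce a non-empty training set.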
