Text Detection & Recognition in the Wild for Robot Localization

Signage is everywhere, and a robot should be able to take advantage of signs to help it localize (including Visual Place Recognition (VPR)) and map. Robust text detection and recognition in the wild is challenging due to factors such as pose, irregular text, illumination, and occlusion. We propose an end-to-end scene text spotting model that simultaneously outputs the text string and bounding boxes. This model is more suitable for VPR. Our central contribution is the use of an end-to-end scene text spotting framework to adequately capture irregular and occluded text regions in different challenging places. To evaluate the proposed architecture's performance for VPR, we conducted several experiments on the challenging Self-Collected Text Place (SCTP) benchmark dataset. Initial experimental results show that the proposed method outperforms state-of-the-art (SOTA) methods in terms of precision and recall on this benchmark.

