以面具为基础的加强学习重建网络 (Mask-based Latent Reconstruction for Reinforcement Learning) - 专知论文

会员服务 ·

0

学成 · Extensibility · 潜在 · 强化学习 · 掩码 ·

2022 年 5 月 26 日

Mask-based Latent Reconstruction for Reinforcement Learning

翻译：以面具为基础的加强学习重建网络

Tao Yu,Zhizheng Zhang,Cuiling Lan,Yan Lu,Zhibo Chen

from arxiv, A latent-space masked modeling method for improving RL sample efficiency

For deep reinforcement learning (RL) from pixels, learning effective state representations is crucial for achieving high performance. However, in practice, limited experience and high-dimensional input prevent effective representation learning. To address this, motivated by the success of masked modeling in other research fields, we introduce mask-based reconstruction to promote state representation learning in RL. Specifically, we propose a simple yet effective self-supervised method, Mask-based Latent Reconstruction (MLR), to predict the complete state representations in the latent space from the observations with spatially and temporally masked pixels. MLR enables the better use of context information when learning state representations to make them more informative, which facilitates RL agent training. Extensive experiments show that our MLR significantly improves the sample efficiency in RL and outperforms the state-of-the-art sample-efficient RL methods on multiple continuous and discrete control benchmarks. The code will be released soon.

翻译：对于从像素中深入强化学习(RL)而言,学习有效的国家表现对于取得高性能至关重要,但在实践中,经验有限和高维投入阻碍了有效的代表性学习。为了解决这一问题,由于在其他研究领域成功采用蒙面模型,我们引入了以面具为基础的重建,以促进在RL中进行国家代表性学习。具体地说,我们提议了一种简单而有效的自我监督方法,即以面具为基础的远程重建(MLR),用空间和时间蒙面像素来预测从观测的潜伏空间中完全的状态表现。MLR能够让在学习国家表现时更好地利用背景信息,使其更具信息性,从而便利RL代理培训。广泛的实验表明,我们的MLR大大提高了RL的样本效率,并超越了在多连续和离散控制基准上采用的最新样本高效RL方法。该代码将很快发布。

0

相关内容

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

专知会员服务

60+阅读 · 2022年4月22日

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

167+阅读 · 2020年3月18日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

IEEE ICKG 2022: Call for Papers

IEEE ICKG 2022: Call for Papers

机器学习与推荐算法

3+阅读 · 2022年3月30日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Latest News & Announcements of the Workshop

【ICIG2021】Latest News & Announcements of the Workshop

中国图象图形学学会CSIG

0+阅读 · 2021年12月20日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

中国图象图形学学会CSIG

0+阅读 · 2021年12月17日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

中国图象图形学学会CSIG

2+阅读 · 2021年11月12日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

【论文推荐】最新5篇度量学习（Metric Learning）相关论文—人脸验证、BIER、自适应图卷积、注意力机制、单次学习

【论文推荐】最新5篇度量学习（Metric Learning）相关论文—人脸验证、BIER、自适应图卷积、注意力机制、单次学习

专知

17+阅读 · 2018年2月11日

EADIA调节抑癌基因DCC凋亡通路的分子机制研究

国家自然科学基金

0+阅读 · 2016年12月31日

外源有机物在稻田土壤中分解转化与甲烷排放关联性研究

国家自然科学基金

0+阅读 · 2013年12月31日

Partial Spread Bent函数与Bent-Negabent函数的构造及密码学性质研究

国家自然科学基金

0+阅读 · 2013年12月31日

Calreticulin突变在JAK2 V617F阴性的骨髓增殖性肿瘤中的研究

国家自然科学基金

0+阅读 · 2013年12月31日

电针干预慢性应激模型大鼠CREB-BDNF通路表观遗传学表达的机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

TRAIL协同IER3调节NF-κB信号通路介导肝癌细胞凋亡的相关机制研究

国家自然科学基金

1+阅读 · 2012年12月31日

Ghrelin/GHSR通过AMPK信号通路调控动脉粥样硬化斑块稳定性的分子机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

Period2基因调控人胶质瘤细胞凋亡的分子机制研究

国家自然科学基金

0+阅读 · 2011年12月31日

Antrum mucosa protein-18（AMP-18）参与胃粘膜上皮细胞癌变的分子机制

国家自然科学基金

0+阅读 · 2009年12月31日

早幼粒细胞白血病锌指基因（PLZF）变异对小鼠骨骼和软骨发育的影响研究

国家自然科学基金

0+阅读 · 2009年12月31日

Learning Discriminative Representation via Metric Learning for Imbalanced Medical Image Classification

Arxiv

0+阅读 · 2022年7月14日

What Makes for Automatic Reconstruction of Pulmonary Segments

What Makes for Automatic Reconstruction of Pulmonary Segments

Arxiv

0+阅读 · 2022年7月14日

Single-Pixel Image Reconstruction Based on Block Compressive Sensing and Deep Learning

Arxiv

0+阅读 · 2022年7月14日

PAC Reinforcement Learning for Predictive State Representations

PAC Reinforcement Learning for Predictive State Representations

Arxiv

0+阅读 · 2022年7月12日

Label-Efficient Self-Supervised Speaker Verification With Information Maximization and Contrastive Learning

Arxiv

0+阅读 · 2022年7月12日

Accelerated Reinforcement Learning for Temporal Logic Control Objectives

Arxiv

0+阅读 · 2022年7月12日

Reinforcement Learning on Graph: A Survey

Arxiv

67+阅读 · 2022年4月13日

CURL: Contrastive Unsupervised Representations for Reinforcement Learning

Arxiv

17+阅读 · 2020年4月28日

Look-into-Object: Self-supervised Structure Modeling for Object Recognition

Look-into-Object: Self-supervised Structure Modeling for Object Recognition

Arxiv

15+阅读 · 2020年3月31日

A Multi-Objective Deep Reinforcement Learning Framework

A Multi-Objective Deep Reinforcement Learning Framework

Arxiv

16+阅读 · 2018年6月27日

VIP会员

文章信息

相关主题

相关VIP内容

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

专知会员服务

60+阅读 · 2022年4月22日

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

100+篇《自监督学习(Self-Supervised Learning)》论文最新合集

专知会员服务

167+阅读 · 2020年3月18日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

78+阅读 · 2019年10月10日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

【CMU卡内基梅隆大学】深度学习在计算机视觉的应用：方法，解释，因果与公平性

专知会员服务

83+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

《代码、指挥与冲突：描绘军事人工智能的未来》报告

【斯坦福博士论文】面向地理空间数据的多模态与多尺度建模：时空生成式人工智能

美国启动“自有军事人工智能计划”：采用谷歌Gemini以推动全军人工智能应用

《创新与适应性作为军事成功的关键因素：来自俄乌战争的战略洞见》报告

相关资讯

IEEE ICKG 2022: Call for Papers

IEEE ICKG 2022: Call for Papers

机器学习与推荐算法

3+阅读 · 2022年3月30日

ACM MM 2022 Call for Papers

ACM MM 2022 Call for Papers

CCF多媒体专委会

5+阅读 · 2022年3月29日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Latest News & Announcements of the Workshop

【ICIG2021】Latest News & Announcements of the Workshop

中国图象图形学学会CSIG

0+阅读 · 2021年12月20日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

中国图象图形学学会CSIG

0+阅读 · 2021年12月17日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

中国图象图形学学会CSIG

2+阅读 · 2021年11月12日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

【论文推荐】最新5篇度量学习（Metric Learning）相关论文—人脸验证、BIER、自适应图卷积、注意力机制、单次学习

【论文推荐】最新5篇度量学习（Metric Learning）相关论文—人脸验证、BIER、自适应图卷积、注意力机制、单次学习

专知

17+阅读 · 2018年2月11日

相关论文

Learning Discriminative Representation via Metric Learning for Imbalanced Medical Image Classification

Arxiv

0+阅读 · 2022年7月14日

What Makes for Automatic Reconstruction of Pulmonary Segments

What Makes for Automatic Reconstruction of Pulmonary Segments

Arxiv

0+阅读 · 2022年7月14日

Single-Pixel Image Reconstruction Based on Block Compressive Sensing and Deep Learning

Arxiv

0+阅读 · 2022年7月14日

PAC Reinforcement Learning for Predictive State Representations

PAC Reinforcement Learning for Predictive State Representations

Arxiv

0+阅读 · 2022年7月12日

Label-Efficient Self-Supervised Speaker Verification With Information Maximization and Contrastive Learning

Arxiv

0+阅读 · 2022年7月12日

Accelerated Reinforcement Learning for Temporal Logic Control Objectives

Arxiv

0+阅读 · 2022年7月12日

Reinforcement Learning on Graph: A Survey

Arxiv

67+阅读 · 2022年4月13日

CURL: Contrastive Unsupervised Representations for Reinforcement Learning

Arxiv

17+阅读 · 2020年4月28日

Look-into-Object: Self-supervised Structure Modeling for Object Recognition

Look-into-Object: Self-supervised Structure Modeling for Object Recognition

Arxiv

15+阅读 · 2020年3月31日

A Multi-Objective Deep Reinforcement Learning Framework

A Multi-Objective Deep Reinforcement Learning Framework

Arxiv

16+阅读 · 2018年6月27日

相关基金

EADIA调节抑癌基因DCC凋亡通路的分子机制研究

国家自然科学基金

0+阅读 · 2016年12月31日

外源有机物在稻田土壤中分解转化与甲烷排放关联性研究

国家自然科学基金

0+阅读 · 2013年12月31日

Partial Spread Bent函数与Bent-Negabent函数的构造及密码学性质研究

国家自然科学基金

0+阅读 · 2013年12月31日

Calreticulin突变在JAK2 V617F阴性的骨髓增殖性肿瘤中的研究

国家自然科学基金

0+阅读 · 2013年12月31日

电针干预慢性应激模型大鼠CREB-BDNF通路表观遗传学表达的机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

TRAIL协同IER3调节NF-κB信号通路介导肝癌细胞凋亡的相关机制研究

国家自然科学基金

1+阅读 · 2012年12月31日

Ghrelin/GHSR通过AMPK信号通路调控动脉粥样硬化斑块稳定性的分子机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

Period2基因调控人胶质瘤细胞凋亡的分子机制研究

国家自然科学基金

0+阅读 · 2011年12月31日

Antrum mucosa protein-18（AMP-18）参与胃粘膜上皮细胞癌变的分子机制

国家自然科学基金

0+阅读 · 2009年12月31日

早幼粒细胞白血病锌指基因（PLZF）变异对小鼠骨骼和软骨发育的影响研究

国家自然科学基金

0+阅读 · 2009年12月31日

微信扫码咨询专知VIP会员