Offline Reinforcement Learning with Additional Covering Distributions

May 22, 2023

We study learning optimal policies from a logged dataset, i.e., offline RL, with function approximation. Despite considerable effort, existing algorithms with theoretical finite-sample guarantees typically assume exploratory data coverage or strongly realizable function classes, assumptions that are hard to satisfy in practice. Recent works do relax these strong assumptions, but they either require gap assumptions that hold only for a subset of MDPs or rely on behavior regularization, which makes the optimality of the learned policy intractable. To address this challenge, we provide finite-sample guarantees for a simple algorithm based on marginalized importance sampling (MIS), showing that sample-efficient offline RL for general MDPs is possible with only a partial-coverage dataset and weakly realizable function classes, given additional side information in the form of a covering distribution. Furthermore, we demonstrate that the covering distribution trades off prior knowledge of the optimal trajectories against the coverage requirement of the dataset, revealing the effect of this inductive bias on the learning process.
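The abstract names marginalized importance sampling without spelling it out. As a rough illustration only, and not the paper's algorithm, the sketch below shows the identity that MIS rests on: the value of a policy equals the data-weighted average of w(s, a) · r(s, a) divided by (1 − γ), where w = d^π / d^D is the ratio of the policy's discounted occupancy to the data distribution. The toy MDP, the data distribution, and all names are hypothetical, and the true weights are assumed known here, whereas an actual MIS method has to estimate them from data with a function class.

```python
import random

GAMMA = 0.9
STATES, ACTIONS = (0, 1), (0, 1)

# Hypothetical toy MDP: P[s][a] is the next-state distribution, R[s][a] the reward.
P = {0: {0: [0.9, 0.1], 1: [0.2, 0.8]},
     1: {0: [0.7, 0.3], 1: [0.1, 0.9]}}
R = {0: {0: 0.0, 1: 0.1}, 1: {0: 0.5, 1: 1.0}}
MU0 = [1.0, 0.0]  # initial state distribution


def occupancy(policy, horizon=2000):
    """Normalized discounted occupancy d^pi(s, a), truncated at `horizon` steps."""
    d = {(s, a): 0.0 for s in STATES for a in ACTIONS}
    state_dist, discount = list(MU0), 1.0
    for _ in range(horizon):
        nxt = [0.0, 0.0]
        for s in STATES:
            for a in ACTIONS:
                p_sa = state_dist[s] * policy[s][a]
                d[(s, a)] += (1 - GAMMA) * discount * p_sa
                for s2 in STATES:
                    nxt[s2] += p_sa * P[s][a][s2]
        state_dist, discount = nxt, discount * GAMMA
    return d


def policy_value(policy, sweeps=2000):
    """Reference value J(pi) via iterative policy evaluation."""
    v = [0.0, 0.0]
    for _ in range(sweeps):
        v = [sum(policy[s][a] * (R[s][a] + GAMMA * sum(P[s][a][s2] * v[s2]
                                                       for s2 in STATES))
                 for a in ACTIONS) for s in STATES]
    return sum(MU0[s] * v[s] for s in STATES)


# Policy to evaluate (always action 1) and an arbitrary data distribution
# over (s, a) pairs that covers the support of d^pi.
target = {0: [0.0, 1.0], 1: [0.0, 1.0]}
d_pi = occupancy(target)
d_data = {(0, 0): 0.1, (0, 1): 0.4, (1, 0): 0.1, (1, 1): 0.4}

# True marginalized importance weights (assumed known in this sketch).
w = {sa: d_pi[sa] / d_data[sa] for sa in d_data}


def mis_estimate(samples):
    """Sample-average MIS estimate of J(pi) from (s, a) pairs drawn from d_data."""
    total = sum(w[sa] * R[sa[0]][sa[1]] for sa in samples)
    return total / (len(samples) * (1 - GAMMA))


pairs = list(d_data)
# Population version of the estimator: the identity holds exactly here.
exact = sum(d_data[sa] * w[sa] * R[sa[0]][sa[1]] for sa in pairs) / (1 - GAMMA)

rng = random.Random(0)
samples = rng.choices(pairs, weights=[d_data[sa] for sa in pairs], k=50_000)
print(policy_value(target), exact, mis_estimate(samples))
```

In this sketch the data distribution only needs positive mass wherever d^π does, which is the sense in which weighting by occupancy ratios pairs naturally with partial coverage. How the covering distribution enters the paper's analysis is not shown here.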
