Model-based RL with Optimistic Posterior Sampling: Structural Conditions and Sample Complexity - Zhuanzhi Papers


Sample complexity · Samples · Sampling methods · Reducible · state-of-the-art

October 16, 2022

Model-based RL with Optimistic Posterior Sampling: Structural Conditions and Sample Complexity


Alekh Agarwal, Tong Zhang

From arXiv; NeurIPS 2022 camera-ready version

We propose a general framework to design posterior sampling methods for model-based RL. We show that the proposed algorithms can be analyzed by reducing regret to Hellinger distance in conditional probability estimation. We further show that optimistic posterior sampling can control this Hellinger distance, when we measure model error via data likelihood. This technique allows us to design and analyze unified posterior sampling algorithms with state-of-the-art sample complexity guarantees for many model-based RL settings. We illustrate our general result in many special cases, demonstrating the versatility of our framework.
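To make the two ingredients named in the abstract concrete, here is a minimal Python sketch for a finite model class. It is an illustration under stated assumptions, not the paper's algorithm: the function names, the `gamma` and `eta` weights, and the exact form of the posterior (prior times an exponential of an optimism bonus plus a scaled log-likelihood) are all hypothetical choices made for this example, and the paper's definitions may differ. The Hellinger distance uses the convention H^2(p, q) = 1 - sum_i sqrt(p_i q_i); some texts omit the factor of 1/2 this implies and get a value twice as large.

```python
import math


def squared_hellinger(p, q):
    """Squared Hellinger distance between two discrete distributions.

    Convention (an assumption of this sketch): H^2 = 1 - sum_i sqrt(p_i * q_i),
    so the result lies in [0, 1]. The max() guards against tiny negative
    values caused by floating-point rounding.
    """
    affinity = sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))
    return max(0.0, 1.0 - affinity)


def optimistic_posterior(log_prior, values, log_likelihoods, gamma, eta):
    """Posterior weights over a finite list of candidate models.

    Hypothetical form: weight_i is proportional to
        prior_i * exp(gamma * value_i + eta * log_likelihood_i),
    where value_i is the optimal value the i-th model predicts (the optimism
    term) and log_likelihood_i is the log-likelihood of the observed
    transitions under that model (the data-fit term).
    """
    scores = [
        lp + gamma * v + eta * ll
        for lp, v, ll in zip(log_prior, values, log_likelihoods)
    ]
    top = max(scores)  # subtract the max before exponentiating, for stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]
```

With a uniform prior and equal likelihoods, for example, `optimistic_posterior([0.0, 0.0], [1.0, 0.0], [0.0, 0.0], gamma=1.0, eta=1.0)` puts more mass on the first model because it predicts the higher value; increasing `eta` relative to `gamma` shifts the balance toward models that fit the observed data.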

