Approximating Martingale Process for Variance Reduction in Deep Reinforcement Learning with Large State Space - Zhuanzhi (专知) Papers


Variance reduction · State space · Processing (programming language) · CASES · Variance

November 29, 2022

Approximating Martingale Process for Variance Reduction in Deep Reinforcement Learning with Large State Space

Approximating Martingale Process (AMP) has been proven effective for variance reduction in reinforcement learning (RL) in specific cases such as multiclass queueing networks. In those cases, however, the state space is relatively small and all possible state transitions can be enumerated. In this paper, we consider systems in which the state space is large and state transitions are uncertain, thereby extending AMP into a generalized variance-reduction method in RL. Specifically, we investigate the application of AMP to ride-hailing systems such as Uber, where Proximal Policy Optimization (PPO) is used to optimize the policy for matching drivers and customers.
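The abstract does not spell out how AMP is constructed, so the following is only a minimal sketch of the general idea in the small-state-space setting it mentions, where every transition can be enumerated. It assumes a toy three-state Markov chain under a fixed policy and one common form of martingale control variate: given an approximate value function `V_hat`, the increments `V_hat(s') - E[V_hat(s') | s]` have zero conditional mean, so subtracting their discounted sum from the sampled return leaves the expectation unchanged while (when `V_hat` is reasonably accurate) shrinking the variance. The chain, the rewards, the perturbation, and all names (`simulate`, `V_hat`, `PV`) are hypothetical and not taken from the paper.

```python
import numpy as np


def simulate(P, r, V_hat, gamma=0.9, horizon=100, episodes=2000, s0=0, seed=0):
    """Return (plain, corrected) discounted-return samples started from s0."""
    rng = np.random.default_rng(seed)
    n = len(r)
    cum_P = np.cumsum(P, axis=1)
    # E[V_hat(s') | s] for every s, obtained by enumerating all transitions.
    # This is the step that becomes infeasible when the state space is large.
    PV = P @ V_hat
    s = np.full(episodes, s0)
    plain = np.zeros(episodes)
    martingale = np.zeros(episodes)
    for t in range(horizon):
        plain += gamma**t * r[s]
        # Sample the next state of every episode by inverting the row CDF.
        u = rng.random(episodes)
        s_next = np.minimum((u[:, None] > cum_P[s]).sum(axis=1), n - 1)
        # Zero-mean martingale increment built from the approximate values.
        martingale += gamma ** (t + 1) * (V_hat[s_next] - PV[s])
        s = s_next
    return plain, plain - martingale


# Hypothetical toy chain (transition matrix and per-state rewards).
P = np.array([[0.5, 0.3, 0.2],
              [0.2, 0.6, 0.2],
              [0.1, 0.3, 0.6]])
r = np.array([1.0, 0.0, 5.0])
gamma = 0.9

# Exact values, used here only to build a deliberately imperfect V_hat.
V_exact = np.linalg.solve(np.eye(3) - gamma * P, r)
V_hat = V_exact + np.array([0.3, -0.2, 0.1])

plain, corrected = simulate(P, r, V_hat, gamma=gamma)
print("mean     plain / corrected:", plain.mean(), corrected.mean())
print("variance plain / corrected:", plain.var(), corrected.var())
```

Both estimators target the same expected discounted return from state 0; the corrected one should show a visibly smaller sample variance. The `P @ V_hat` line is where the small-state-space assumption enters. With a large state space and uncertain transitions, as in the ride-hailing setting the abstract describes, that conditional expectation would itself have to be approximated.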
