Decentralized Multi-Agent Reinforcement Learning for Continuous-Space Stochastic Games - 专知 Papers

Stochastic games · Continuous spaces · Policy updates · Games · Multi-agent reinforcement learning

March 16, 2023

Awni Altabaa, Bora Yongacoglu, Serdar Yüksel

Stochastic games are a popular framework for studying multi-agent reinforcement learning (MARL). Recent advances in MARL have focused primarily on games with finitely many states. In this work, we study multi-agent learning in stochastic games with general state spaces and an information structure in which agents do not observe each other's actions. In this context, we propose a decentralized MARL algorithm and we prove the near-optimality of its policy updates. Furthermore, we study the global policy-updating dynamics for a general class of best-reply based algorithms and derive a closed-form characterization of convergence probabilities over the joint policy space.
