个人化任务强化学习人员审查政策摘要 (Examining Policy Entropy of Reinforcement Learning Agents for Personalization Tasks) - 专知论文

会员服务 ·

0

Agent · Learning · 强化学习 · Less · 回合 ·

2023 年 1 月 31 日

Examining Policy Entropy of Reinforcement Learning Agents for Personalization Tasks

翻译：个人化任务强化学习人员审查政策摘要

Anton Dereventsov,Andrew Starnes,Clayton G. Webster

This effort is focused on examining the behavior of reinforcement learning systems in personalization environments and detailing the differences in policy entropy associated with the type of learning algorithm utilized. We demonstrate that Policy Optimization agents often possess low-entropy policies during training, which in practice results in agents prioritizing certain actions and avoiding others. Conversely, we also show that Q-Learning agents are far less susceptible to such behavior and generally maintain high-entropy policies throughout training, which is often preferable in real-world applications. We provide a wide range of numerical experiments as well as theoretical justification to show that these differences in entropy are due to the type of learning being employed.

翻译：这项工作的重点是审查个人化环境中强化学习系统的行为,并详细说明与所用学习算法类型相关的政策变异性,我们证明政策优化机构在培训期间往往拥有低耗水性政策,这实际上导致代理人优先采取某些行动,避免采取其他行动。相反,我们还表明,在培训过程中,学习机构对此种行为的影响要小得多,一般保持高耗水性政策,在现实世界应用中,这种政策往往更可取。我们提供了广泛的数字实验和理论理由,以表明这些变温性差异是由于学习类型造成的。

0

相关内容

Agent

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

82+阅读 · 2020年7月26日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

《DeepGCNs: Making GCNs Go as Deep as CNNs》

《DeepGCNs: Making GCNs Go as Deep as CNNs》

专知会员服务

31+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

ABCB6基因在眼组织缺损中的功能研究

国家自然科学基金

0+阅读 · 2014年12月31日

一种脉冲电子束物理气相沉积梯度MCrAlY包覆涂层组织结构演变及性能研究

国家自然科学基金

0+阅读 · 2013年12月31日

Kronheimer-Nakajima quiver 模空间与有理曲面

国家自然科学基金

1+阅读 · 2013年12月31日

血红素加氧酶-1在诱导性多能干细胞促进糖尿病创面愈合中的作用及机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

β淀粉样蛋白对神经突触传递和可塑性的影晌

国家自然科学基金

0+阅读 · 2011年12月31日

Misspecification in Inverse Reinforcement Learning

Arxiv

0+阅读 · 2023年3月24日

Pretraining in Deep Reinforcement Learning: A Survey

Arxiv

21+阅读 · 2022年11月8日

Deep Reinforcement Learning for Multi-Agent Interaction

Arxiv

46+阅读 · 2022年8月2日

A Survey on Reinforcement Learning for Recommender Systems

Arxiv

22+阅读 · 2021年9月22日

Q-value Path Decomposition for Deep Multiagent Reinforcement Learning

Q-value Path Decomposition for Deep Multiagent Reinforcement Learning

Arxiv

26+阅读 · 2020年2月10日

VIP会员

文章信息

相关主题

相关VIP内容

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

82+阅读 · 2020年7月26日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

《DeepGCNs: Making GCNs Go as Deep as CNNs》

《DeepGCNs: Making GCNs Go as Deep as CNNs》

专知会员服务

31+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

前沿人工智能趋势报告（Frontier AI Trends Report）

【AAAI2026】善始则事半功倍：基于前缀优化的大语言模型推理强化学习

Andrej Karpathy：2025 年 LLM 年度回顾（2025 LLM Year in Review）

音退化问题：基于输入操控的鲁棒语音转换综述

相关资讯

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

相关论文

Misspecification in Inverse Reinforcement Learning

Arxiv

0+阅读 · 2023年3月24日

Pretraining in Deep Reinforcement Learning: A Survey

Arxiv

21+阅读 · 2022年11月8日

Deep Reinforcement Learning for Multi-Agent Interaction

Arxiv

46+阅读 · 2022年8月2日

A Survey on Reinforcement Learning for Recommender Systems

Arxiv

22+阅读 · 2021年9月22日

Q-value Path Decomposition for Deep Multiagent Reinforcement Learning

Q-value Path Decomposition for Deep Multiagent Reinforcement Learning

Arxiv

26+阅读 · 2020年2月10日

相关基金

ABCB6基因在眼组织缺损中的功能研究

国家自然科学基金

0+阅读 · 2014年12月31日

一种脉冲电子束物理气相沉积梯度MCrAlY包覆涂层组织结构演变及性能研究

国家自然科学基金

0+阅读 · 2013年12月31日

Kronheimer-Nakajima quiver 模空间与有理曲面

国家自然科学基金

1+阅读 · 2013年12月31日

血红素加氧酶-1在诱导性多能干细胞促进糖尿病创面愈合中的作用及机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

β淀粉样蛋白对神经突触传递和可塑性的影晌

国家自然科学基金

0+阅读 · 2011年12月31日

微信扫码咨询专知VIP会员