Agentic AI aims to create systems that set their own goals, adapt proactively to change, and refine their behavior through continuous experience. Recent advances suggest that, when facing multiple and unforeseen tasks, agents could benefit from sharing machine-learned knowledge and reusing policies that other agents have already fully or partially learned. However, how to query, select, and retrieve policies from a pool of agents, and how to integrate such policies, remain largely unexplored. This study explores how an agent decides what knowledge to select, from whom, and when and how to integrate it into its own policy to accelerate its learning. The proposed algorithm, \emph{Modular Sharing and Composition in Collective Learning} (MOSAIC), improves learning in agentic collectives by combining (1) knowledge selection using performance signals and cosine similarity on Wasserstein task embeddings, (2) modular and transferable neural representations via masks, and (3) policy integration, composition, and fine-tuning. MOSAIC outperforms isolated learners and global sharing approaches in both learning speed and overall performance, and in some cases solves tasks that isolated agents cannot. The results also show that selective, goal-driven reuse reduces susceptibility to task interference. We also observe the emergence of self-organization, in which agents solving simpler tasks accelerate the learning of harder ones through shared knowledge.
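The knowledge-selection step named in the abstract (performance signals combined with cosine similarity on task embeddings) can be illustrated with a minimal sketch. This is not MOSAIC's actual implementation: the names `PeerInfo`, `select_peers`, and `min_similarity`, the threshold value, and the rule "keep peers that are similar enough and currently perform better" are all assumptions made for illustration. The Wasserstein task embeddings are assumed to be precomputed and are represented here as plain vectors.

```python
import math
from typing import List, NamedTuple, Sequence


class PeerInfo(NamedTuple):
    """Hypothetical record a peer might advertise to the collective."""
    name: str
    embedding: Sequence[float]  # assumed precomputed task embedding
    avg_return: float           # assumed recent performance signal


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_peers(
    my_embedding: Sequence[float],
    my_return: float,
    peers: Sequence[PeerInfo],
    min_similarity: float = 0.5,  # hypothetical threshold, not from the paper
) -> List[str]:
    """Return names of peers worth querying, most similar task first.

    A peer is kept only if its task embedding is similar enough to ours
    and its reported performance exceeds our own (assumed selection rule).
    """
    candidates = []
    for peer in peers:
        sim = cosine_similarity(my_embedding, peer.embedding)
        if sim >= min_similarity and peer.avg_return > my_return:
            candidates.append((sim, peer.name))
    candidates.sort(reverse=True)  # highest similarity first
    return [name for _, name in candidates]


peers = [
    PeerInfo("agent_a", [1.0, 0.0], 0.9),  # similar task, better performance
    PeerInfo("agent_b", [0.0, 1.0], 0.95), # dissimilar task
    PeerInfo("agent_c", [0.9, 0.1], 0.2),  # similar task, worse performance
]
selected = select_peers([1.0, 0.1], 0.5, peers)
print(selected)
```

In this toy setup only `agent_a` passes both filters. Filtering on similarity and performance together is one plausible way to obtain the selective reuse the abstract contrasts with global sharing, where every peer's knowledge would be used regardless of task relevance.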