Smarter Together: Creating Agentic Communities of Practice through Shared Experiential Learning

The transition from human-centric to agent-centric software development practices is disrupting existing knowledge-sharing environments for software developers. Traditional peer-to-peer repositories and developer communities for shared technical knowledge and best practice have seen participation drop dramatically within a short period of time. At the same time, agentic functional equivalents have yet to emerge, leaving AI agents, which already generate a significant proportion of all new software code, without access to repositories of valuable shared learning. In this paper, we introduce Spark, a novel shared agentic memory architecture designed to emulate the collective intelligence and know-how of human developer communities. Spark enables AI coding agents to both contribute to and draw from a persistent and continuously evolving experiential memory. Agents operating in the same general problem space use the Spark shared memory as a repository of new knowledge to achieve collective continual learning. We evaluate Spark as a coach for AI coding agents performing software development tasks. We demonstrate that recommendations made by Spark improve the quality of code generated by generic code generation models of varying sizes and capability tiers. Boosted by Spark, a small open-weights model with 30 billion parameters matched the code quality of a much larger state-of-the-art model. Separately, we measure the intrinsic quality of recommendations generated by Spark against a wide range of criteria inspired by software development best practice, and achieve helpfulness levels of up to 98.2% in the top two (out of five) qualitative helpfulness bands.

