Lark: Biologically Inspired Neuroevolution for Multi-Stakeholder LLM Agents

We present Lark, a biologically inspired decision-making framework that couples LLM-driven reasoning with an evolutionary, stakeholder-aware Multi-Agent System (MAS). To address verbosity and stakeholder trade-offs, we integrate four mechanisms: (i) plasticity, which applies concise adjustments to candidate solutions; (ii) duplication and maturation, which copy high-performing candidates and specialize them into new modules; (iii) ranked-choice stakeholder aggregation using influence-weighted Borda scoring; and (iv) compute awareness via token-based penalties that reward brevity. In each iteration, the system proposes diverse strategies, applies plasticity tweaks, simulates stakeholder evaluations, aggregates preferences, selects the top candidates, and performs duplication/maturation, factoring compute cost into the final scores. In a controlled evaluation over 30 rounds comparing 14 systems, Lark Full achieves a mean rank of 2.55 (95% CI [2.17, 2.93]) and a mean composite score of 29.4/50 (95% CI [26.34, 32.46]), finishing in the top three in 80% of rounds while remaining cost-competitive with leading commercial models ($0.016 per task). Paired Wilcoxon tests confirm that all four mechanisms contribute significantly: ablating duplication/maturation yields the largest deficit (ΔScore = 3.5, Cohen's d_z = 2.53, p < 0.001), followed by plasticity (ΔScore = 3.4, d_z = 1.86), ranked-choice voting (ΔScore = 2.4, d_z = 1.20), and token penalties (ΔScore = 2.2, d_z = 1.63). Rather than a formal Markov Decision Process with constrained optimization, Lark is a practical, compute-aware neuroevolutionary loop that scales stakeholder-aligned strategy generation and makes trade-offs transparent through per-step metrics. Our work presents proof-of-concept findings and invites community feedback as we expand toward real-world validation studies.
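To make mechanisms (iii) and (iv) concrete, the following is a minimal Python sketch of influence-weighted Borda aggregation followed by a token-based penalty. It is an illustration under stated assumptions, not the implementation used in Lark: the abstract does not specify the scoring formula, so the classic Borda point scheme, the linear penalty, the `penalty_per_token` coefficient, the stakeholder names, the influence weights, and the token counts are all hypothetical.

```python
from collections import defaultdict


def weighted_borda(rankings, influence):
    """Aggregate ranked ballots into one score per candidate.

    rankings:  stakeholder -> list of candidate ids, best first.
    influence: stakeholder -> non-negative weight (assumed; the abstract
               does not say how influence is set).
    """
    scores = defaultdict(float)
    for stakeholder, order in rankings.items():
        n = len(order)
        for position, candidate in enumerate(order):
            # Classic Borda points: n - 1 for first place, 0 for last,
            # scaled by the stakeholder's influence.
            scores[candidate] += influence[stakeholder] * (n - 1 - position)
    return dict(scores)


def apply_token_penalty(scores, token_counts, penalty_per_token=0.001):
    """Subtract a linear cost per token (assumed penalty form)."""
    return {c: s - penalty_per_token * token_counts[c] for c, s in scores.items()}


def select_top(scores, k=2):
    """Return the k best candidates; ties are broken by candidate id."""
    return sorted(scores, key=lambda c: (-scores[c], c))[:k]


# Hypothetical example: three stakeholders rank three candidate strategies.
rankings = {
    "customers": ["A", "B", "C"],
    "regulators": ["B", "A", "C"],
    "investors": ["B", "C", "A"],
}
influence = {"customers": 0.5, "regulators": 0.3, "investors": 0.2}
token_counts = {"A": 200, "B": 600, "C": 100}

borda = weighted_borda(rankings, influence)
final = apply_token_penalty(borda, token_counts)
print(select_top(borda), select_top(final))
```

With these made-up numbers, candidate B leads on Borda score alone, but the shorter candidate A overtakes it once the token penalty is applied, which is the kind of brevity trade-off the compute-aware scoring is meant to surface.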
