In this work, we report what happens when two large language models respond to each other over many turns without any outside input in a multi-agent setup. The setup begins with a short seed sentence; thereafter, each model reads the other's output and generates a response, continuing for a fixed number of steps. We used Mistral Nemo Base 2407 and Llama 2 13B hf. We observed that most conversations start coherently but later fall into repetition: in many runs, a short phrase appears and then recurs across turns. Once repetition begins, both models tend to produce similar output rather than steering the conversation in a new direction, leading to a loop in which the same or nearly identical text is generated repeatedly. We describe this behavior as a form of convergence. It occurs even though the models are large, were trained separately, and are given no prompt instructions. To study this behavior, we apply lexical and embedding-based metrics that measure how far the conversation drifts from the initial seed and how similar the outputs of the two models become as the conversation progresses.
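For concreteness, the following is a minimal sketch of the conversation loop and metrics described above, not the authors' actual harness. The agent_a, agent_b, and embed callables are assumptions standing in for real model calls (e.g., generation with Mistral Nemo Base 2407 and Llama 2 13B hf, and a sentence-embedding encoder); the metric shown is plain cosine similarity, one plausible instance of the embedding-based measures mentioned in the text.

```python
from typing import Callable, List
import numpy as np

def run_conversation(seed: str,
                     agent_a: Callable[[str], str],
                     agent_b: Callable[[str], str],
                     n_turns: int) -> List[str]:
    """Alternate turns: each model reads the other's latest output
    and generates a response, for a fixed number of steps."""
    turns = [seed]
    for t in range(n_turns):
        speaker = agent_a if t % 2 == 0 else agent_b
        turns.append(speaker(turns[-1]))
    return turns

def cosine(u: np.ndarray, v: np.ndarray) -> float:
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def drift_from_seed(turns: List[str],
                    embed: Callable[[str], np.ndarray]) -> List[float]:
    """Similarity of each turn to the initial seed;
    decreasing values indicate drift away from it."""
    seed_vec = embed(turns[0])
    return [cosine(seed_vec, embed(t)) for t in turns[1:]]

def cross_model_similarity(turns: List[str],
                           embed: Callable[[str], np.ndarray]) -> List[float]:
    """Similarity between consecutive turns (one model's output vs. the
    other's reply); rising values suggest the convergence described above."""
    vecs = [embed(t) for t in turns]
    return [cosine(vecs[i], vecs[i + 1]) for i in range(1, len(vecs) - 1)]
```

Under these assumptions, convergence would show up as drift_from_seed decaying over turns while cross_model_similarity climbs toward a plateau once the repeated phrase takes hold.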