As modern artificial intelligence (AI) systems become more advanced and capable, they can leverage a wide range of tools and models to perform complex tasks. Today, the task of orchestrating these models is often performed by Large Language Models (LLMs) that rely on qualitative descriptions of models for decision-making. However, the descriptions provided to these LLM-based orchestrators often fail to reflect true model capabilities and performance characteristics, leading to suboptimal model selection, reduced accuracy, and increased energy costs. In this paper, we conduct an empirical analysis of the limitations of LLM-based orchestration and propose GUIDE, a new energy-aware model selection framework that accounts for performance-energy trade-offs by incorporating quantitative model performance characteristics into decision-making. Experimental results demonstrate that GUIDE increases accuracy by 0.90%-11.92% across the evaluated tasks and improves energy efficiency by up to 54%, while reducing orchestrator model selection latency from 4.51 s to 7.2 ms.
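To make the abstract's core idea concrete, the following is a minimal sketch (not the paper's actual algorithm) of quantitative, energy-aware model selection: each candidate model carries measured accuracy and energy-per-query figures, and selection is a fast scoring pass over these profiles rather than an LLM inference, which is why millisecond-scale selection latency is plausible. The `ModelProfile` type, the `alpha` weight, and all numbers are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class ModelProfile:
    name: str
    accuracy: float   # measured task accuracy in [0, 1] (hypothetical field)
    energy_j: float   # measured energy per query, in joules (hypothetical field)

def select_model(profiles: list[ModelProfile], alpha: float = 0.5) -> ModelProfile:
    """Pick the model maximizing a weighted accuracy/energy score.

    alpha in [0, 1] trades accuracy (alpha -> 1) against energy
    savings (alpha -> 0). Energy is normalized by the most
    expensive candidate so both terms lie in [0, 1].
    """
    max_energy = max(p.energy_j for p in profiles)
    def score(p: ModelProfile) -> float:
        return alpha * p.accuracy + (1 - alpha) * (1 - p.energy_j / max_energy)
    return max(profiles, key=score)

candidates = [
    ModelProfile("small-model", accuracy=0.80, energy_j=5.0),
    ModelProfile("large-model", accuracy=0.85, energy_j=50.0),
]
# Energy-leaning weighting favors the cheaper model; an
# accuracy-leaning weighting favors the stronger one.
energy_pick = select_model(candidates, alpha=0.3).name    # "small-model"
accuracy_pick = select_model(candidates, alpha=0.95).name # "large-model"
```

Because selection reduces to an arithmetic scan over precomputed profiles, its cost is independent of model size, in contrast to asking an LLM orchestrator to reason over prose descriptions at query time.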