Recent advances in large language models (LLMs) have dramatically improved performance on a wide range of tasks, driving rapid enterprise adoption. Yet the cost of adopting these AI services is understudied. Unlike traditional software licensing, in which costs are predictable before usage, commercial LLM services charge for both input tokens and generated output tokens. Crucially, while firms can control the input, they have limited control over output tokens, whose count is effectively set by generation dynamics outside business control. This research shows that subtle shifts in linguistic style can systematically alter the number of output tokens without affecting response quality. Using an experiment with OpenAI's API, this study reveals that non-polite prompts significantly increase output tokens, leading to higher enterprise costs and additional revenue for OpenAI. Politeness is merely one instance of a broader phenomenon in which linguistic structure drives unpredictable cost variation. For enterprises integrating LLMs into applications, this unpredictability complicates budgeting and undermines transparency in business-to-business contexts. By demonstrating how end-user behavior links to enterprise costs through output token counts, this work highlights the opacity of current pricing models and calls for new approaches to ensure predictable and transparent adoption of LLM services.
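The token-based pricing model described above can be sketched as a simple cost function; the per-token rates below are hypothetical placeholders, not OpenAI's actual prices:

```python
# Sketch of per-request cost under token-based LLM pricing.
# Rates are hypothetical placeholders, not actual provider prices.
INPUT_PRICE_PER_1K = 0.01   # hypothetical $ per 1,000 input tokens
OUTPUT_PRICE_PER_1K = 0.03  # hypothetical $ per 1,000 output tokens

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost = input tokens * input rate + output tokens * output rate."""
    return (input_tokens / 1000) * INPUT_PRICE_PER_1K \
         + (output_tokens / 1000) * OUTPUT_PRICE_PER_1K

# The firm fixes the 50-token prompt, but the model's generation
# dynamics determine output length, so cost varies per request.
cost_short = request_cost(50, 200)   # shorter completion
cost_long = request_cost(50, 320)    # longer completion, same prompt
```

The point of the sketch is that the second term is outside the firm's control: identical inputs can produce different output lengths, so per-request cost is not predictable in advance.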