迈向自动形式化的通用框架 (Towards a Common Framework for Autoformalization)

from arxiv, Presented at NeLaMKRR@KR, 2025 (arXiv:2511.09575). A shorter version of this work will appear in the Proceedings of the AAAI Conference on Artificial Intelligence (AAAI 2026)

Autoformalization has emerged as a term referring to the automation of formalization - specifically, the formalization of mathematics using interactive theorem provers (proof assistants). Its rapid development has been driven by progress in deep learning, especially large language models (LLMs). More recently, the term has expanded beyond mathematics to describe the broader task of translating informal input into formal logical representations. At the same time, a growing body of research explores using LLMs to translate informal language into formal representations for reasoning, planning, and knowledge representation - often without explicitly referring to this process as autoformalization. As a result, despite addressing similar tasks, the largely independent development of these research areas has limited opportunities for shared methodologies, benchmarks, and theoretical frameworks that could accelerate progress. The goal of this paper is to review - explicit or implicit - instances of what can be considered autoformalization and to propose a unified framework, encouraging cross-pollination between different fields to advance the development of next generation AI systems.

翻译：自动形式化作为一个术语，指的是形式化过程的自动化——具体而言，即使用交互式定理证明器（证明助手）对数学进行形式化。其快速发展得益于深度学习的进步，特别是大语言模型（LLMs）的推动。近年来，该术语已超越数学领域，用于描述将非正式输入转换为形式化逻辑表示的更广泛任务。与此同时，越来越多的研究探索利用LLMs将非正式语言转换为形式化表示，以用于推理、规划和知识表示——这些研究通常并未明确将此过程称为自动形式化。因此，尽管这些研究领域处理相似的任务，但其相对独立的发展限制了共享方法、基准和理论框架的机会，而这些本可加速整体进展。本文旨在回顾可被视为自动形式化的显性或隐性实例，并提出一个统一框架，以促进不同领域之间的交叉融合，从而推动下一代人工智能系统的发展。