With the recent success of generative models for images and text, the question of how to evaluate them has gained considerable attention. While most state-of-the-art methods rely on scalar metrics, the introduction of Precision and Recall (PR) for generative models has opened up a new avenue of research. The associated PR curve allows for a richer analysis, but its estimation poses several challenges. In this paper, we present a new framework for estimating entire PR curves from a binary classification standpoint. We conduct a thorough statistical analysis of the proposed estimates. As a byproduct, we obtain a minimax upper bound on the PR estimation risk. We also show that our framework extends several landmark PR metrics from the literature, which by design are restricted to the extreme values of the curve. Finally, we study experimentally how the resulting curves behave in various settings.