MeanFlow (MF) has recently been established as a framework for one-step generative modeling. However, its ``fastforward'' nature introduces key challenges in both the training objective and the guidance mechanism. First, the original MF's training target depends not only on the underlying ground-truth fields but also on the network itself. To address this issue, we recast the objective as a loss on the instantaneous velocity $v$, re-parameterized by a network that predicts the average velocity $u$. Our reformulation yields a more standard regression problem and improves training stability. Second, the original MF fixes the classifier-free guidance scale during training, which sacrifices flexibility. We tackle this issue by formulating guidance as explicit conditioning variables, thereby retaining flexibility at test time. These diverse conditions are processed through in-context conditioning, which reduces model size and benefits performance. Overall, our \textbf{improved MeanFlow} (\textbf{iMF}) method, trained entirely from scratch, achieves \textbf{1.72} FID with a single function evaluation (1-NFE) on ImageNet 256$\times$256. iMF substantially outperforms prior methods of this kind and closes the gap with multi-step methods while using no distillation. We hope our work will further advance fastforward generative modeling as a stand-alone paradigm.
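To make the first point concrete, the reformulated objective can be sketched in a few lines. This is a minimal illustration, not the paper's exact recipe: it assumes the standard linear-interpolation flow-matching path $z_t = (1-t)x + t\epsilon$ with ground-truth instantaneous velocity $v = \epsilon - x$, rearranges the MeanFlow identity as $v(z_t, t) = u(z_t, r, t) + (t-r)\,\tfrac{d}{dt}u(z_t, r, t)$, and omits details such as loss weighting, time-pair sampling, and stop-gradient choices; `u_net` is a hypothetical average-velocity network.

```python
import torch
from torch.func import jvp

def imf_style_loss(u_net, x, eps, r, t):
    """Sketch of the re-parameterized objective: the instantaneous velocity v
    implied by an average-velocity network u_net is regressed onto its
    ground-truth value. Shapes: x, eps are (B, C, H, W); r and t broadcast
    against them, e.g. (B, 1, 1, 1)."""
    # Linear-interpolation path: z_t = (1 - t) x + t eps, so v(z_t, t) = eps - x.
    z_t = (1 - t) * x + t * eps
    v_gt = eps - x

    # Total derivative du/dt = v . grad_z u + d_t u, with r held fixed,
    # obtained in a single forward pass as a JVP with tangent (v, 0, 1).
    u, du_dt = jvp(
        u_net,
        (z_t, r, t),
        (v_gt, torch.zeros_like(r), torch.ones_like(t)),
    )

    # MeanFlow identity rearranged for v: v = u + (t - r) du/dt.
    v_pred = u + (t - r) * du_dt

    # A standard regression: the target v_gt does not depend on the network.
    return ((v_pred - v_gt) ** 2).mean()
```

In contrast to the original MF loss, whose regression target contains the network's own derivative, here the target is the fixed ground-truth field, which is what makes the problem a standard regression.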
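The second point, guidance as an explicit conditioning variable, admits a similar sketch. Everything below is an illustrative assumption rather than the paper's architecture: the module name `InContextCond`, the shared scalar MLP, and the token layout are hypothetical. The only idea taken from the abstract is that the class label, the times $(r, t)$, and the guidance scale $\omega$ enter the network as plain inputs handled by in-context conditioning, i.e. as extra tokens in the sequence, so that $\omega$ remains a free knob at test time.

```python
import torch
import torch.nn as nn

class InContextCond(nn.Module):
    """Sketch: the class label, the two times (r, t), and the guidance scale
    omega are each embedded and prepended as tokens to the image-token
    sequence, so a plain transformer attends over all of them."""

    def __init__(self, dim, num_classes):
        super().__init__()
        # Extra "null" class index supports unconditional training samples.
        self.class_emb = nn.Embedding(num_classes + 1, dim)
        # One shared MLP for all scalar conditions (an illustrative simplification).
        self.scalar_mlp = nn.Sequential(nn.Linear(1, dim), nn.SiLU(), nn.Linear(dim, dim))

    def forward(self, img_tokens, y, r, t, omega):
        # img_tokens: (B, N, dim); y: (B,) int64; r, t, omega: (B,) floats.
        scalars = torch.stack([r, t, omega], dim=1).unsqueeze(-1)   # (B, 3, 1)
        cond = torch.cat(
            [self.class_emb(y).unsqueeze(1),   # (B, 1, dim)
             self.scalar_mlp(scalars)],        # (B, 3, dim)
            dim=1,
        )                                      # (B, 4, dim)
        return torch.cat([cond, img_tokens], dim=1)  # (B, 4 + N, dim)
```

In this sketch, nothing about a particular guidance strength is baked into the weights: during training omega would be sampled over a range, and at sampling time any desired scale can simply be passed in, which is the test-time flexibility the abstract refers to.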