Realistic and diverse multi-agent driving scenes are crucial for evaluating autonomous vehicles, but the safety-critical events essential to this evaluation are rare and underrepresented in driving datasets. Data-driven scene generation offers a low-cost alternative by synthesizing complex traffic behaviors from existing driving logs. However, existing models often lack controllability or yield samples that violate physical or social constraints, which limits their usability. We present OMEGA, an optimization-guided, training-free framework that enforces structural consistency and interaction awareness during diffusion-based sampling from a scene generation model. OMEGA re-anchors each reverse diffusion step via constrained optimization, steering generation toward physically plausible and behaviorally coherent trajectories. Building on this framework, we formulate ego-attacker interactions as a game-theoretic optimization in distribution space and approximate Nash equilibria to generate realistic, safety-critical adversarial scenarios. Experiments on nuPlan and Waymo show that OMEGA improves generation realism, consistency, and controllability, increasing the ratio of physically and behaviorally valid scenes from 32.35% to 72.27% for free exploration and from 11% to 80% for controllability-focused generation. Our approach also generates $5\times$ more near-collision frames (time-to-collision under three seconds) while maintaining overall scene realism.
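To make the near-collision criterion concrete, the following is a minimal sketch of how frames with a time-to-collision (TTC) under three seconds might be counted. The abstract does not specify how TTC is computed, so this sketch assumes constant-velocity motion and disc-shaped agent footprints; the function names, the `radius_sum` parameter, and the frame layout are hypothetical and are not taken from OMEGA.

```python
import math


def time_to_collision(p_a, v_a, p_b, v_b, radius_sum=2.0):
    """Constant-velocity TTC between two disc-shaped agents (assumed model).

    p_*, v_* are (x, y) positions in metres and velocities in m/s.
    radius_sum is the sum of the two disc radii (hypothetical default).
    Returns the earliest t >= 0 at which the centre distance equals
    radius_sum, or math.inf if the agents never come that close.
    """
    # Relative position and velocity of agent b with respect to agent a.
    rx, ry = p_b[0] - p_a[0], p_b[1] - p_a[1]
    vx, vy = v_b[0] - v_a[0], v_b[1] - v_a[1]

    # Solve |r + v t|^2 = radius_sum^2, i.e. a t^2 + b t + c = 0.
    a = vx * vx + vy * vy
    b = 2.0 * (rx * vx + ry * vy)
    c = rx * rx + ry * ry - radius_sum ** 2

    if c <= 0.0:
        return 0.0  # footprints already overlap
    if a == 0.0 or b >= 0.0:
        return math.inf  # no relative motion, or agents are moving apart
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        return math.inf  # closest approach stays outside radius_sum
    return (-b - math.sqrt(disc)) / (2.0 * a)


def count_near_collision_frames(frames, threshold_s=3.0, radius_sum=2.0):
    """Count frames whose ego-attacker TTC is below threshold_s.

    Each frame is a tuple (p_ego, v_ego, p_attacker, v_attacker).
    """
    return sum(
        1
        for p_a, v_a, p_b, v_b in frames
        if time_to_collision(p_a, v_a, p_b, v_b, radius_sum) < threshold_s
    )
```

For example, two vehicles approaching head-on at 10 m/s each, with centres 42 m apart and `radius_sum=2.0`, close a 40 m gap at 20 m/s, so the frame counts as near-collision. A real evaluation would likely use oriented bounding boxes and the dataset's own kinematics rather than discs.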