Learning predictive models from high-dimensional sensory observations is fundamental for cyber-physical systems, yet the latent representations learned by standard world models lack physical interpretability. This limits their reliability, generalizability, and applicability to safety-critical tasks. We introduce Physically Interpretable World Models (PIWM), a framework that aligns latent representations with real-world physical quantities and constrains their evolution through partially known physical dynamics. Physical interpretability in PIWM is defined by two complementary properties: (i) the learned latent state corresponds to meaningful physical variables, and (ii) its temporal evolution follows physically consistent dynamics. To achieve this without requiring ground-truth physical annotations, PIWM employs weak distribution-based supervision that captures state uncertainty naturally arising from real-world sensing pipelines. The architecture integrates a VQ-based visual encoder, a transformer-based physical encoder, and a learnable dynamics model grounded in known physical equations. Across three case studies (Cart Pole, Lunar Lander, and Donkey Car), PIWM achieves accurate long-horizon prediction, recovers true system parameters, and significantly improves physical grounding over purely data-driven models. These results demonstrate the feasibility and advantages of learning physically interpretable world models directly from images under weak supervision.
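To make the idea of a dynamics model "grounded in known physical equations" concrete, the following is a minimal sketch and not the PIWM implementation. It assumes a frictionless pendulum as a stand-in system (not one of the three case studies), treats the form of the equation of motion as known and the pendulum length as the unknown physical parameter, and recovers that parameter from noisy state estimates. The names `step`, `rollout`, and `fit_length` are hypothetical, and a closed-form least-squares fit stands in for whatever gradient-based training the actual model uses; the visual and physical encoders are omitted entirely.

```python
import numpy as np

G = 9.81  # gravitational acceleration in m/s^2, assumed known


def step(state, length, dt=0.02):
    """Advance a frictionless pendulum by one semi-implicit Euler step.

    The structure of the update is the known physics; `length` is the
    unknown physical parameter to be identified from data.
    """
    theta, omega = state
    omega_next = omega - (G / length) * np.sin(theta) * dt
    theta_next = theta + omega_next * dt
    return np.array([theta_next, omega_next])


def rollout(state0, length, n_steps, dt=0.02):
    """Roll the dynamics forward and return an (n_steps + 1, 2) array."""
    states = [np.asarray(state0, dtype=float)]
    for _ in range(n_steps):
        states.append(step(states[-1], length, dt))
    return np.stack(states)


def fit_length(trajectory, dt=0.02):
    """Estimate the pendulum length from observed (theta, omega) pairs.

    The one-step velocity change is linear in k = G / length:
        omega[t+1] - omega[t] = k * (-sin(theta[t]) * dt)
    so k has a closed-form least-squares solution.
    """
    theta = trajectory[:-1, 0]
    d_omega = trajectory[1:, 1] - trajectory[:-1, 1]
    x = -np.sin(theta) * dt
    k = np.dot(x, d_omega) / np.dot(x, x)
    return G / k


# Illustrative data: a clean rollout plus small Gaussian noise, standing in
# for imperfect state estimates produced by a sensing pipeline.
rng = np.random.default_rng(0)
true_length = 1.5
clean = rollout([1.0, 0.0], true_length, n_steps=500)
noisy = clean + rng.normal(scale=1e-3, size=clean.shape)

estimated_length = fit_length(noisy)
print(f"true length: {true_length}, estimated: {estimated_length:.3f}")
```

Because the latent state here is explicitly (angle, angular velocity) and the only free quantity is a physical parameter, the fitted model can be inspected directly, which is the kind of interpretability the abstract describes. A purely data-driven transition model would offer no such parameter to compare against the true system.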