Bayesian inference is optimal when the statistical model is well-specified, but outside this setting it can fail catastrophically; accordingly, a wealth of post-Bayesian methodologies have been proposed. Predictively oriented (PrO) approaches lift the statistical model $P_\theta$ to an (infinite) mixture model $\int P_\theta \, \mathrm{d}Q(\theta)$ and fit this predictive distribution by minimising an entropy-regularised objective functional. In the well-specified setting, one expects the mixing distribution $Q$ to concentrate around the true data-generating parameter in the large-data limit, whereas such singular concentration will typically not occur if the model is misspecified. Our contribution is to demonstrate that model misspecification can be detected empirically by comparing the standard Bayesian posterior to the PrO `posterior' $Q$. To operationalise this, we present an efficient numerical algorithm based on variational gradient descent. A simulation study, and a more detailed case study involving a Bayesian inverse problem in seismology, confirm that model misspecification can be detected automatically within this framework.
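As a minimal illustration of the mixture construction above (not the paper's algorithm), the predictive distribution $\int P_\theta \, \mathrm{d}Q(\theta)$ can be approximated by representing $Q$ with equally weighted particles, so the predictive density is simply the average of the component densities. The Gaussian location model and all names below are hypothetical choices for the sketch:

```python
import numpy as np

def gauss_pdf(x, mu, sigma=1.0):
    """Density of N(mu, sigma^2) evaluated at x."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

def predictive_density(x, particles):
    """Particle approximation of the PrO predictive mixture
    p_Q(x) = \int N(x | theta, 1) dQ(theta), with Q represented by
    equally weighted particles theta_1, ..., theta_M."""
    return np.mean([gauss_pdf(x, theta) for theta in particles], axis=0)

rng = np.random.default_rng(0)
particles = rng.normal(0.0, 0.5, size=100)   # particles standing in for Q
x_grid = np.linspace(-6.0, 6.0, 601)
p = predictive_density(x_grid, particles)

# The mixture of densities is itself a density: its mass on a wide grid is ~1.
dx = x_grid[1] - x_grid[0]
print(np.sum(p) * dx)
```

In a fitting algorithm, such particles would be moved by gradient steps on the entropy-regularised objective; here they are fixed, purely to show that the lifted model is an ordinary (finite-sample) mixture.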