Bayesian inference has many advantages for complex models, but standard Monte Carlo methods for summarizing the posterior can be computationally demanding, and it is attractive to consider optimization-based variational methods. Our work considers Gaussian approximations with sparse precision matrices, which are tractable to optimize in high dimensions. The optimal Gaussian approximation is usually defined as the one closest to the posterior in Kullback-Leibler divergence, but when the Gaussian assumption is crude it is useful to consider other divergences that capture the posterior features important for a given application. Our work studies the weighted Fisher divergence, which focuses on gradient differences between the target posterior and its approximation, and includes the Fisher and score-based divergences as special cases. We make three main contributions. First, under mean-field assumptions for Gaussian and non-Gaussian targets, we compare approximations obtained with weighted Fisher divergences against Kullback-Leibler approximations. Second, we go beyond mean-field and consider approximations with sparse precision matrices reflecting posterior conditional independence structure for hierarchical models. Using stochastic gradient descent to exploit sparsity, we develop two approaches to minimizing the Fisher and score-based divergences, based on the reparametrization trick and a batch approximation of the objective. Finally, we study the performance of our methods using logistic regression, generalized linear mixed models and stochastic volatility models.
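To make the objective concrete, the sketch below gives a Monte Carlo estimate of the (unweighted) Fisher divergence E_q||∇log q(θ) − ∇log p(θ)||² for a Gaussian approximation q = N(μ, (LLᵀ)⁻¹) parameterized by its precision Cholesky factor L, sampled via the reparametrization θ = μ + L⁻ᵀz. This is a minimal illustration, not the paper's implementation: the target here is a hypothetical 2-d Gaussian with illustrative values, and no sparsity structure or stochastic gradient optimization is shown.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 2-d Gaussian target p(theta) = N(mu_p, Sigma_p); values are illustrative.
mu_p = np.array([1.0, -0.5])
Sigma_p = np.array([[2.0, 0.6], [0.6, 1.0]])
Prec_p = np.linalg.inv(Sigma_p)

def grad_log_p(theta):
    # Score of the Gaussian target: -Sigma_p^{-1} (theta - mu_p).
    return -Prec_p @ (theta - mu_p)

def fisher_divergence(mu, L, n_samples=5000):
    """Monte Carlo estimate of E_q ||grad log q - grad log p||^2 where
    q = N(mu, (L L^T)^{-1}), sampled via theta = mu + L^{-T} z, z ~ N(0, I)."""
    Lam = L @ L.T                                  # precision matrix of q
    z = rng.standard_normal((n_samples, 2))
    thetas = mu + np.linalg.solve(L.T, z.T).T      # reparametrization trick
    diffs = [(-Lam @ (t - mu)) - grad_log_p(t) for t in thetas]
    return np.mean([d @ d for d in diffs])

# When q matches p exactly, the estimated divergence is zero (up to round-off).
L_star = np.linalg.cholesky(Prec_p)
exact = fisher_divergence(mu_p, L_star)

# A mismatched mean and precision give a strictly positive value.
off = fisher_divergence(mu_p + 0.5, np.eye(2))
print(exact, off)
```

Parameterizing q by the Cholesky factor of its precision (rather than its covariance) is what makes posterior conditional independence exploitable: zeros in the precision matrix correspond to conditional independencies, so a sparse L keeps both storage and the solve in the reparametrization cheap in high dimensions.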