We study an information-theoretic minimax problem for finite multivariate Markov chains on $d$-dimensional product state spaces. Given a family $\mathcal B=\{P_1,\ldots,P_n\}$ of $π$-stationary transition matrices and a class $\mathcal F = \mathcal{F}(\mathbf{S})$ of factorizable models induced by a partition $\mathbf S$ of the coordinate set $[d]$, we seek to minimize the worst-case information loss by analyzing $$\min_{Q\in\mathcal F}\max_{P\in\mathcal B} D_{\mathrm{KL}}^π(P\|Q),$$ where $D_{\mathrm{KL}}^π(P\|Q)$ is the $π$-weighted KL divergence from $Q$ to $P$. We recast the above minimax problem into concave maximization over the $n$-probability-simplex via strong duality and Pythagorean identities that we derive. This leads us to formulate an information-theoretic game and show that a mixed strategy Nash equilibrium always exists; and propose a projected subgradient algorithm to approximately solve the minimax problem with provable guarantee. By transforming the minimax problem into an orthant submodular function in $\mathbf{S}$, this motivates us to consider a max-min-max submodular optimization problem and investigate a two-layer subgradient-greedy procedure to approximately solve this generalization. Numerical experiments for Markov chains on the Curie-Weiss and Bernoulli-Laplace models illustrate the practicality of these proposed algorithms and reveals sparse optimal structures in these examples.
翻译:暂无翻译