Designing efficient and rigorous numerical methods for sequential decision-making under uncertainty is a difficult problem that arises in many application frameworks. In this paper, we focus on the numerical solution of a subclass of impulse control problems for piecewise deterministic Markov processes (PDMPs) when the jump times are hidden. We first state the problem as a partially observed Markov decision process (POMDP) on a continuous state space, with controlled transition kernels corresponding to some specific skeleton chains of the PDMP. We then build a numerically tractable approximation of the POMDP by tailor-made discretizations of the state spaces. The main difficulty in evaluating the discretization error comes from the possible random jumps of the PDMP between consecutive epochs of the POMDP, which require special care. Finally, we discuss the practical construction of discretization grids and illustrate our method on simulations.