Online linear programming (OLP) has found broad applications in revenue management and resource allocation. State-of-the-art OLP algorithms achieve low regret by repeatedly solving linear programming (LP) subproblems that incorporate updated resource information. However, LP-based methods are computationally expensive and often inefficient for large-scale applications. In contrast, recent first-order OLP algorithms are more computationally efficient but typically suffer from worse regret guarantees. To address these shortcomings, we propose a new algorithm that combines the strengths of LP-based and first-order OLP methods. The algorithm re-solves the LP subproblems periodically at a predefined frequency $f$ and uses the latest dual prices to guide online decision-making. In addition, a first-order method runs in parallel during each interval between LP re-solves, smoothing resource consumption. Our algorithm achieves $\mathscr{O}(\log (T/f) + \sqrt{f})$ regret, delivering a "wait-less" online decision-making process that balances the computational efficiency of first-order methods and the superior regret guarantee of LP-based methods.
翻译:在线线性规划(OLP)在收益管理与资源分配领域具有广泛应用。当前最先进的OLP算法通过反复求解包含更新后资源信息的线性规划(LP)子问题来实现低遗憾度。然而,基于LP的方法计算成本高昂,在大规模应用中往往效率低下。相比之下,近期的一阶OLP算法计算效率更高,但通常遗憾度保证较差。为克服这些缺陷,我们提出一种新算法,融合了基于LP方法与一阶OLP方法的优势。该算法以预设频率$f$周期性地重解LP子问题,并利用最新的对偶价格指导在线决策。此外,在每次LP重解间隔期间,并行运行的一阶方法可平滑资源消耗。我们的算法实现了$\mathscr{O}(\log (T/f) + \sqrt{f})$的遗憾度,提供了一种“无等待”的在线决策流程,平衡了一阶方法的计算效率与基于LP方法的优越遗憾度保证。