Finding an $\epsilon$-stationary point of a nonconvex function with a Lipschitz continuous Hessian is a central problem in optimization. Regularized Newton methods are a classical tool and have been studied extensively, yet they still face a trade-off between global and local convergence. Whether a parameter-free algorithm of this type can simultaneously achieve optimal global complexity and quadratic local convergence remains an open question. To bridge this long-standing gap, we propose a new class of regularizers constructed from the current and previous gradients, and leverage the conjugate gradient approach with a negative curvature monitor to solve the regularized Newton equation. The proposed algorithm is adaptive, requiring no prior knowledge of the Hessian Lipschitz constant, and achieves a global complexity of $O(\epsilon^{-3/2})$ in terms of second-order oracle calls and $\tilde{O}(\epsilon^{-7/4})$ in terms of Hessian-vector products. When the iterates converge to a point where the Hessian is positive definite, the method exhibits quadratic local convergence. Preliminary numerical results, including training physics-informed neural networks, illustrate the competitiveness of our algorithm.
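The abstract does not spell out the regularizer or the inexact solver, so the following Python sketch is only an illustration of the general shape of one such step under stated assumptions: the function names (`cg_with_negative_curvature`, `regularized_newton_step`), the constant `c`, and the gradient-difference regularizer are all hypothetical stand-ins, not the paper's construction. It shows a regularized Newton equation $(\nabla^2 f(x_k) + \sigma_k I)\,d_k = -\nabla f(x_k)$ solved by conjugate gradient, with a fallback to a negative-curvature direction when the monitor detects one.

```python
# Illustrative sketch only: the regularizer sigma_k below and all helper names
# are assumptions; the paper's exact "current and previous gradients"
# construction may differ.
import numpy as np

def cg_with_negative_curvature(H_mv, g, sigma, tol=1e-8, max_iter=100):
    """Solve (H + sigma*I) d = -g by conjugate gradient, monitoring the
    curvature p^T (H + sigma*I) p of each search direction p."""
    d = np.zeros_like(g)
    r = -g                      # residual of (H + sigma I) d = -g at d = 0
    p = r.copy()
    rs = r @ r
    for _ in range(max_iter):
        Hp = H_mv(p) + sigma * p
        curv = p @ Hp
        if curv <= 0.0:         # negative curvature detected: report direction
            return p, True
        alpha = rs / curv
        d += alpha * p
        r -= alpha * Hp
        rs_new = r @ r
        if np.sqrt(rs_new) <= tol:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return d, False

def regularized_newton_step(grad, hess_mv, x, x_prev, g_prev, c=1.0):
    """One adaptive regularized Newton step. The regularizer combines the
    current gradient norm with a difference quotient built from the previous
    gradient; this specific formula is a hypothetical example."""
    g = grad(x)
    # Hypothetical regularizer from current and previous gradients.
    L_est = np.linalg.norm(g - g_prev) / max(np.linalg.norm(x - x_prev), 1e-16)
    sigma = c * np.sqrt(max(L_est, 1e-16) * np.linalg.norm(g))
    d, neg_curv = cg_with_negative_curvature(lambda v: hess_mv(x, v), g, sigma)
    if neg_curv:
        # Follow the negative-curvature direction, with the sign chosen so
        # that the step is a descent direction for the gradient.
        s = -1.0 if d @ g > 0.0 else 1.0
        d = s * d / max(np.linalg.norm(d), 1e-16)
    return x + d, g
```

In this sketch the solver needs only Hessian-vector products `hess_mv(x, v)`, never an explicit Hessian, which is what makes the $\tilde{O}(\epsilon^{-7/4})$ Hessian-vector-product accounting in the abstract meaningful; whether the monitor exits with a CG solution or a negative-curvature direction, the step requires no knowledge of the Hessian Lipschitz constant.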