We extend the convergence analysis of AdaSLS and AdaSPS in [Jiang and Stich, 2024] to the nonconvex setting, presenting a unified analysis of stochastic gradient descent with adaptive Armijo line-search (AdaSLS) and adaptive Polyak stepsize (AdaSPS). Our contributions are: (1) an $\mathcal{O}(1/\sqrt{T})$ convergence rate for general smooth nonconvex functions, (2) an $\mathcal{O}(1/T)$ rate under quasar-convexity and interpolation, and (3) an $\mathcal{O}(1/T)$ rate for general nonconvex functions under the strong growth condition.
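For context, the two stepsize rules build on the following standard ingredients (a minimal sketch under the usual conventions; the sample index $i_t$, Armijo constant $c \in (0,1)$, scaling constant $c_p > 0$, and per-sample lower bound $\ell_{i_t}^* \le \min_x f_{i_t}(x)$ are notation introduced here for illustration, and the Ada-variants analyzed in [Jiang and Stich, 2024] further rescale these quantities adaptively): the stochastic Armijo condition accepted by the line search, and the stochastic Polyak stepsize,
\[
f_{i_t}\!\big(x_t - \gamma_t \nabla f_{i_t}(x_t)\big) \le f_{i_t}(x_t) - c\,\gamma_t\,\|\nabla f_{i_t}(x_t)\|^2,
\qquad
\gamma_t^{\mathrm{SPS}} = \frac{f_{i_t}(x_t) - \ell_{i_t}^*}{c_p\,\|\nabla f_{i_t}(x_t)\|^2}.
\]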