Generalisation under gradient descent via deterministic PAC-Bayes - Zhuanzhi (专知) Papers

Generalisation · PAC learning theory · Gradient · Optimisation algorithms · Iterative optimisation

4 April 2023

Generalisation under gradient descent via deterministic PAC-Bayes

Eugenio Clerico, Tyler Farghly, George Deligiannidis, Benjamin Guedj, Arnaud Doucet

We establish disintegrated PAC-Bayesian generalisation bounds for models trained with gradient descent methods or continuous gradient flows. Contrary to standard practice in the PAC-Bayesian setting, our result applies to optimisation algorithms that are deterministic, without requiring any de-randomisation step. Our bounds are fully computable, depending on the density of the initial distribution and the Hessian of the training objective over the trajectory. We show that our framework can be applied to a variety of iterative optimisation algorithms, including stochastic gradient descent (SGD), momentum-based schemes, and damped Hamiltonian dynamics.
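To make concrete why a quantity built from "the density of the initial distribution and the Hessian of the training objective over the trajectory" can be computed, here is a minimal sketch. It is not the bound from the paper, whose exact form is not given in this abstract. It only applies the standard change-of-variables formula, under the assumption that each gradient descent step x -> x - step * grad(x) is an invertible map, so the log-density of the current iterate changes by -log|det(I - step * Hessian)| at every step. The function names, the Gaussian initial distribution, and the quadratic toy objective are all illustrative assumptions.

```python
import numpy as np


def gaussian_log_density(x, sigma=1.0):
    """Log-density of an isotropic Gaussian N(0, sigma^2 I) at x (assumed initial distribution)."""
    d = x.size
    return -0.5 * np.dot(x, x) / sigma**2 - 0.5 * d * np.log(2 * np.pi * sigma**2)


def gd_with_log_density(x0, grad, hess, step, n_steps, log_density0):
    """Run deterministic gradient descent and track the log-density of the iterate.

    Change of variables: if x_next = T(x) with T(x) = x - step * grad(x), then
    log rho_next(x_next) = log rho(x) - log|det(I - step * hess(x))|.
    The formula assumes T is invertible; here we only check that its Jacobian
    is nonsingular at the current point.
    """
    x = np.asarray(x0, dtype=float)
    log_rho = log_density0(x)
    eye = np.eye(x.size)
    for _ in range(n_steps):
        jacobian = eye - step * hess(x)
        sign, logabsdet = np.linalg.slogdet(jacobian)
        if sign == 0:
            raise ValueError("singular Jacobian: the update map is not locally invertible")
        log_rho -= logabsdet
        x = x - step * grad(x)
    return x, log_rho


# Toy objective (hypothetical): L(x) = 0.5 * x^T A x, so grad = A x and Hessian = A.
A = np.diag([1.0, 3.0])
x_init = np.array([0.5, -1.0])
x_final, log_rho_final = gd_with_log_density(
    x_init,
    grad=lambda x: A @ x,
    hess=lambda x: A,
    step=0.1,
    n_steps=5,
    log_density0=gaussian_log_density,
)
print("final iterate:", x_final)
print("log-density of final iterate:", log_rho_final)
```

For this quadratic toy problem the final iterate is a linear image of a Gaussian, so the tracked value can be checked against the closed-form Gaussian log-density. For a general objective only the initial density and the Hessians along the trajectory are needed, which matches the dependence described in the abstract.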
