A Latent Space Theory for Emergent Abilities in Large Language Models - 专知论文

会员服务 ·

0

联合分布 · 边缘分布 · 语言模型化 · 边缘化 · 稀疏 ·

2023 年 4 月 24 日

A Latent Space Theory for Emergent Abilities in Large Language Models

翻译：暂无翻译

from arxiv, 17 pages, 3 figures

Languages are not created randomly but rather to communicate information. There is a strong association between languages and their underlying meanings, resulting in a sparse joint distribution that is heavily peaked according to their correlations. Moreover, these peak values happen to match with the marginal distribution of languages due to the sparsity. With the advent of LLMs trained on big data and large models, we can now precisely assess the marginal distribution of languages, providing a convenient means of exploring the sparse structures in the joint distribution for effective inferences. In this paper, we categorize languages as either unambiguous or {\epsilon}-ambiguous and present quantitative results to demonstrate that the emergent abilities of LLMs, such as language understanding, in-context learning, chain-of-thought prompting, and effective instruction fine-tuning, can all be attributed to Bayesian inference on the sparse joint distribution of languages.

翻译：暂无翻译

1

相关内容

联合分布

【干货书】机器学习设计模式，408页pdf，Machine Learning Design Patterns

【干货书】机器学习设计模式，408页pdf，Machine Learning Design Patterns

专知会员服务

138+阅读 · 2022年2月6日

INRIA 最新《机器学习理论》课程笔记，176页pdf

专知会员服务

51+阅读 · 2020年12月14日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

复杂工程产品基于多可信度近似的设计优化研究

国家自然科学基金

0+阅读 · 2015年12月31日

Anderson型多酸的不对称修饰及可控组装研究

国家自然科学基金

1+阅读 · 2014年12月31日

云环境下考虑用户偏好的会计信息系统可信性动态评估模型研究

国家自然科学基金

0+阅读 · 2012年12月31日

可调型异相与固载催化剂的前驱体：系列含有机硅钌化合物的制备与应用

国家自然科学基金

0+阅读 · 2012年12月31日

调控家蚕发育非编码RNA（non-coding RNA, ncRNA）的功能解析

国家自然科学基金

0+阅读 · 2011年12月31日

Language Models can Solve Computer Tasks

Arxiv

0+阅读 · 2023年6月7日

Iterative Translation Refinement with Large Language Models

Arxiv

0+阅读 · 2023年6月6日

The Emergence of Essential Sparsity in Large Pre-trained Models: The Weights that Matter

Arxiv

0+阅读 · 2023年6月6日

CUE: An Uncertainty Interpretation Framework for Text Classifiers Built on Pre-Trained Language Models

Arxiv

0+阅读 · 2023年6月6日

SelfEvolve: A Code Evolution Framework via Large Language Models

Arxiv

0+阅读 · 2023年6月5日

VIP会员

文章信息

相关主题

语言模型化

相关VIP内容

【干货书】机器学习设计模式，408页pdf，Machine Learning Design Patterns

【干货书】机器学习设计模式，408页pdf，Machine Learning Design Patterns

专知会员服务

138+阅读 · 2022年2月6日

INRIA 最新《机器学习理论》课程笔记，176页pdf

专知会员服务

51+阅读 · 2020年12月14日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

美海军作战管理系统：变革战场空间的二十年

《任务与武器驱动美海军舰队设计》报告

俄罗斯“沙希德”/“天竺葵”攻击无人机

《利用动态图对网络攻击进行建模与仿真：在云安全评估中的应用》90页

相关资讯

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

相关论文

Language Models can Solve Computer Tasks

Arxiv

0+阅读 · 2023年6月7日

Iterative Translation Refinement with Large Language Models

Arxiv

0+阅读 · 2023年6月6日

The Emergence of Essential Sparsity in Large Pre-trained Models: The Weights that Matter

Arxiv

0+阅读 · 2023年6月6日

CUE: An Uncertainty Interpretation Framework for Text Classifiers Built on Pre-Trained Language Models

Arxiv

0+阅读 · 2023年6月6日

SelfEvolve: A Code Evolution Framework via Large Language Models

Arxiv

0+阅读 · 2023年6月5日

相关基金

复杂工程产品基于多可信度近似的设计优化研究

国家自然科学基金

0+阅读 · 2015年12月31日

Anderson型多酸的不对称修饰及可控组装研究

国家自然科学基金

1+阅读 · 2014年12月31日

云环境下考虑用户偏好的会计信息系统可信性动态评估模型研究

国家自然科学基金

0+阅读 · 2012年12月31日

可调型异相与固载催化剂的前驱体：系列含有机硅钌化合物的制备与应用

国家自然科学基金

0+阅读 · 2012年12月31日

调控家蚕发育非编码RNA（non-coding RNA, ncRNA）的功能解析

国家自然科学基金

0+阅读 · 2011年12月31日

微信扫码咨询专知VIP会员