Recent advances in artificial neural networks for machine learning, and language modeling in particular, have established a family of recurrent neural network (RNN) architectures that, unlike conventional RNNs with vector-valued hidden states, use two-dimensional (2D) matrix-valued hidden states. Such a 2D-state RNN, known as a Fast Weight Programmer (FWP), can be interpreted as a neural network whose synaptic weights (called fast weights) change dynamically over time as a function of the input observations and serve as short-term memory storage; the corresponding weight modifications are controlled, or programmed, by another network (the programmer) whose parameters are trained (e.g., by gradient descent). In this Primer, we review the technical foundations of FWPs, their computational characteristics, and their connections to transformers and state space models. We also discuss connections between FWPs and models of synaptic plasticity in the brain, suggesting a convergence of natural and artificial intelligence.
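To make the mechanism concrete, the sketch below illustrates the classic additive outer-product fast-weight update closely related to linear attention: a slow "programmer" network maps each input to a key, value and query, the key-value outer product writes to a matrix-valued fast-weight state, and the query reads from it. This is a minimal illustrative example, not an implementation from the Primer; the dimensions and names (`W_k`, `W_v`, `W_q`, `elu_plus_one`, `fwp_forward`) are assumptions.

```python
# Minimal sketch (assumed, not from the Primer) of an additive outer-product
# Fast Weight Programmer. The fast-weight matrix W_fast is the 2D hidden state.
import numpy as np

rng = np.random.default_rng(0)
d_in, d_key, d_val = 8, 16, 16      # illustrative sizes

# Slow (trainable) parameters of the programmer network.
W_k = rng.standard_normal((d_key, d_in)) / np.sqrt(d_in)
W_v = rng.standard_normal((d_val, d_in)) / np.sqrt(d_in)
W_q = rng.standard_normal((d_key, d_in)) / np.sqrt(d_in)

def elu_plus_one(x):
    # A common positive feature map used in linear-attention-style FWPs.
    return np.where(x > 0, x + 1.0, np.exp(x))

def fwp_forward(xs):
    """Run the fast-weight recurrence over a sequence xs of shape (T, d_in)."""
    W_fast = np.zeros((d_val, d_key))      # matrix-form short-term memory
    ys = []
    for x in xs:
        k = elu_plus_one(W_k @ x)
        v = W_v @ x
        q = elu_plus_one(W_q @ x)
        W_fast = W_fast + np.outer(v, k)   # Hebbian outer-product "write"
        ys.append(W_fast @ q)              # "read" with the query
    return np.stack(ys)

ys = fwp_forward(rng.standard_normal((5, d_in)))
print(ys.shape)  # (5, 16)
```

In this additive form the fast weights only accumulate; the variants reviewed in the Primer (for example, delta-rule or gated updates) replace the simple `W_fast + np.outer(v, k)` step with learned, input-dependent modification rules.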