Super Vision Transformer

May 10, 2023

Mingbao Lin, Mengzhao Chen, Yuxin Zhang, Chunhua Shen, Rongrong Ji, Liujuan Cao

We aim to reduce the computational cost of vision transformers (ViTs), which grows quadratically with the number of tokens. We present a novel training paradigm that trains only one ViT model at a time, yet delivers improved image recognition performance at various computational costs. The trained ViT model, termed super vision transformer (SuperViT), can process incoming patches of multiple sizes and preserve informative tokens at multiple keeping rates (the ratio of tokens kept), which yields good hardware efficiency at inference time, given that available hardware resources often change over time. Experimental results on ImageNet demonstrate that SuperViT considerably reduces the computational cost of ViT models while even improving performance. For example, we reduce the FLOPs of DeiT-S by 2x while increasing Top-1 accuracy by 0.2%, and by 0.7% for a 1.5x reduction. SuperViT also significantly outperforms existing studies on efficient vision transformers. For example, at the same FLOPs, SuperViT surpasses the recent state-of-the-art (SOTA) EViT by 1.1% when both use DeiT-S as the backbone. The project is publicly available at https://github.com/lmbxmu/SuperViT.
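To make the quadratic dependence on token count concrete, the following is a rough back-of-envelope sketch, not the paper's own cost accounting. It assumes a plain ViT encoder with DeiT-S-like dimensions (embedding width 384, depth 12, MLP ratio 4), counts only multiply-accumulates in the linear layers and attention, and ignores patch embedding, normalization, and the classifier head. The helper names are hypothetical, and applying the keeping rate once at the input is a simplification for illustration, not a description of how SuperViT selects tokens.

```python
import math


def num_patch_tokens(image_size: int, patch_size: int) -> int:
    """Number of patch tokens for a square image split into square patches."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    side = image_size // patch_size
    return side * side


def kept_tokens(num_tokens: int, keep_rate: float) -> int:
    """Tokens remaining after keeping a fraction keep_rate (assumed: applied once)."""
    if not 0.0 < keep_rate <= 1.0:
        raise ValueError("keep_rate must be in (0, 1]")
    return max(1, math.ceil(num_tokens * keep_rate))


def encoder_macs(num_tokens: int, dim: int = 384, depth: int = 12,
                 mlp_ratio: int = 4) -> int:
    """Approximate multiply-accumulates of a ViT encoder.

    Per block: QKV and output projections cost 4*N*D^2, the MLP costs
    2*mlp_ratio*N*D^2, and the attention matrix products cost 2*N^2*D.
    """
    linear = (4 + 2 * mlp_ratio) * num_tokens * dim * dim  # linear in N
    attention = 2 * num_tokens * num_tokens * dim          # quadratic in N
    return depth * (linear + attention)


if __name__ == "__main__":
    # Larger patches mean fewer tokens; a lower keeping rate drops more tokens.
    for patch_size in (16, 32):
        for keep_rate in (1.0, 0.7, 0.5):
            n = kept_tokens(num_patch_tokens(224, patch_size), keep_rate) + 1  # +1 class token
            print(f"patch={patch_size} keep={keep_rate} tokens={n} "
                  f"GMACs={encoder_macs(n) / 1e9:.2f}")
```

Under these assumptions, both a larger patch size and a lower keeping rate shrink the token count, and the attention term shrinks with the square of that count, which is why a single model that accepts several patch sizes and keeping rates can cover a range of compute budgets.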
