SCOTT: Self-Consistent Chain-of-Thought Distillation


CoT · Distillation · MoDELS · Performance · Learning

May 3, 2023


Peifeng Wang, Zhengyang Wang, Zheng Li, Yifan Gao, Bing Yin, Xiang Ren

From arXiv; 11 pages, 8 figures. Accepted to ACL 2023.

Large language models (LMs) beyond a certain scale demonstrate the emergent capability of generating free-text rationales for their predictions via chain-of-thought (CoT) prompting. While CoT can yield dramatically improved performance, such gains are only observed for sufficiently large LMs. Even more concerning, there is little guarantee that the generated rationales are consistent with the LM's predictions or faithfully justify its decisions. In this work, we propose a faithful knowledge distillation method to learn a small, self-consistent CoT model from a teacher model that is orders of magnitude larger. To form better supervision, we elicit rationales supporting the gold answers from a large LM (teacher) by contrastive decoding, which encourages the teacher to generate tokens that become more plausible only when the answer is considered. To ensure faithful distillation, we use the teacher-generated rationales to train a student LM with a counterfactual reasoning objective, which prevents the student from ignoring the rationales and making inconsistent predictions. Experiments show that, while yielding comparable end-task performance, our method generates CoT rationales that are more faithful than those of the baselines. Further analysis suggests that such a model respects its rationales more when making decisions; thus, we can improve its performance further by refining its rationales.
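
The contrastive decoding idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy vocabulary, and the probabilities are all hypothetical, and the score used here (the log-probability of a token with the gold answer in the context minus its log-probability without it) is only one plausible reading of "tokens that become more plausible only when the answer is considered". A real decoder would obtain both distributions from the teacher LM and would likely combine this score with the token's ordinary likelihood.

```python
import math


def contrastive_scores(logp_with_answer, logp_without_answer):
    """Score each candidate token by how much its log-probability grows
    once the gold answer is added to the teacher's context.

    Both arguments map token -> log-probability (hypothetical toy inputs;
    in practice they would come from two forward passes of the teacher).
    """
    return {
        token: logp_with_answer[token] - logp_without_answer[token]
        for token in logp_with_answer
    }


def pick_token(logp_with_answer, logp_without_answer):
    """Greedy step: choose the token whose plausibility grows the most."""
    scores = contrastive_scores(logp_with_answer, logp_without_answer)
    return max(scores, key=scores.get)


# Toy next-token distributions for one decoding step (made-up numbers).
with_answer = {
    "because": math.log(0.5),
    "the": math.log(0.3),
    "wings": math.log(0.2),
}
without_answer = {
    "because": math.log(0.5),
    "the": math.log(0.4),
    "wings": math.log(0.1),
}

chosen = pick_token(with_answer, without_answer)
print(chosen)  # wings
```

Plain greedy decoding on the answer-conditioned distribution would pick "because", which is equally likely with or without the answer. The contrastive score instead favors "wings", the only token whose probability rises when the answer is present.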

