Exact Paired-Permutation Testing for Structured Test Statistics - 专知论文

May 3, 2022

Exact Paired-Permutation Testing for Structured Test Statistics

Ran Zmigrod, Tim Vieira, Ryan Cotterell

Significance testing -- especially the paired-permutation test -- has played a vital role in developing NLP systems to provide confidence that the difference in performance between two systems (i.e., the test statistic) is not due to luck. However, practitioners rely on Monte Carlo approximation to perform this test due to a lack of a suitable exact algorithm. In this paper, we provide an efficient exact algorithm for the paired-permutation test for a family of structured test statistics. Our algorithm runs in $\mathcal{O}(GN (\log GN )(\log N ))$ time where $N$ is the dataset size and $G$ is the range of the test statistic. We found that our exact algorithm was $10$x faster than the Monte Carlo approximation with $20000$ samples on a common dataset.
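To make the contrast between an exact test and its Monte Carlo approximation concrete, here is a minimal Python sketch. It is not the algorithm from the paper (which achieves the stated $\mathcal{O}(GN (\log GN )(\log N ))$ bound); it is a plain dynamic program that only covers the simplest case, assuming the test statistic is the absolute value of a sum of integer per-example score differences (for instance, the difference in the number of correct predictions). The function names `exact_paired_permutation_pvalue` and `monte_carlo_paired_permutation_pvalue` are hypothetical and chosen for illustration.

```python
import random
from collections import defaultdict
from fractions import Fraction


def exact_paired_permutation_pvalue(diffs):
    """Exact two-sided p-value for the statistic |sum(diffs)|.

    Assumption: diffs[i] = score_A[i] - score_B[i] is an integer.
    Under the null hypothesis each pair is swapped independently with
    probability 1/2, which flips the sign of diffs[i].
    """
    observed = abs(sum(diffs))
    # counts[s] = number of sign assignments seen so far whose sum is s
    counts = {0: 1}
    for d in diffs:
        nxt = defaultdict(int)
        for s, c in counts.items():
            nxt[s + d] += c  # pair kept in its original order
            nxt[s - d] += c  # pair swapped
        counts = nxt
    extreme = sum(c for s, c in counts.items() if abs(s) >= observed)
    return Fraction(extreme, 2 ** len(diffs))


def monte_carlo_paired_permutation_pvalue(diffs, num_samples=20000, seed=0):
    """Monte Carlo estimate of the same p-value from random sign flips."""
    rng = random.Random(seed)
    observed = abs(sum(diffs))
    hits = 0
    for _ in range(num_samples):
        total = sum(d if rng.random() < 0.5 else -d for d in diffs)
        if abs(total) >= observed:
            hits += 1
    return hits / num_samples


if __name__ == "__main__":
    diffs = [1, 1, 1, 0, -1]  # toy per-example differences, made up for the demo
    print(exact_paired_permutation_pvalue(diffs))
    print(monte_carlo_paired_permutation_pvalue(diffs))
```

The exact routine enumerates the distribution of the permuted sum rather than sampling from it, so its answer carries no sampling error; the table of reachable sums stays small when the differences are bounded integers, which is the same intuition behind the dependence on the range $G$ in the paper's bound.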
