Ties Matter: Modifying Kendall's Tau for Modern Metric Meta-Evaluation - 专知论文

会员服务 ·

0

机器翻译 · 得分 · motivation · 成对型 · Machine Translation ·

2023 年 5 月 23 日

Ties Matter: Modifying Kendall's Tau for Modern Metric Meta-Evaluation

翻译：暂无翻译

Daniel Deutsch,George Foster,Markus Freitag

Kendall's tau is frequently used to meta-evaluate how well machine translation (MT) evaluation metrics score individual translations. Its focus on pairwise score comparisons is intuitive but raises the question of how ties should be handled, a gray area that has motivated different variants in the literature. We demonstrate that, in settings like modern MT meta-evaluation, existing variants have weaknesses arising from their handling of ties, and in some situations can even be gamed. We propose a novel variant that gives metrics credit for correctly predicting ties, as well as an optimization procedure that automatically introduces ties into metric scores, enabling fair comparison between metrics that do and do not predict ties. We argue and provide experimental evidence that these modifications lead to fairer Kendall-based assessments of metric performance.

翻译：暂无翻译

0

相关内容

机器翻译

机器翻译，又称为自动翻译，是利用计算机将一种自然语言(源语言)转换为另一种自然语言(目标语言)的过程。它是计算语言学的一个分支，是人工智能的终极目标之一，具有重要的科学研究价值。

知识荟萃

精品入门和进阶教程、论文和代码整理等

更多

查看相关VIP内容、论文、资讯等

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

96+阅读 · 2020年3月12日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

【代码资源】GAN | 七份最热GAN文章及代码分享（Github 1000+Stars）

【代码资源】GAN | 七份最热GAN文章及代码分享（Github 1000+Stars）

专知

13+阅读 · 2018年6月24日

在骨髓间充质干细胞过表达CX3CL1 与Wnt3a基因调节小胶质细胞活性促进神经元再生

国家自然科学基金

0+阅读 · 2014年12月31日

早期阿尔茨海默病中β淀粉样蛋白影响在体海马长时程抑制的机制

国家自然科学基金

0+阅读 · 2014年12月31日

PKCε通路双重调控SDF-1/CXCR4轴介导间充质干细胞归巢与旁分泌治疗DHCA诱导肺损伤研究

国家自然科学基金

0+阅读 · 2013年12月31日

microRNA调控HIF-1α介导颞叶癫痫耐药的机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

脂肪因子Chemerin在骨骼肌胰岛素抵抗发生中的作用及其机制

国家自然科学基金

0+阅读 · 2008年12月31日

A Survey on Evaluation of Large Language Models

Arxiv

0+阅读 · 2023年7月6日

Volumetric Occupancy Detection: A Comparative Analysis of Mapping Algorithms

Arxiv

0+阅读 · 2023年7月6日

Evaluating the Evaluators: Are Current Few-Shot Learning Benchmarks Fit for Purpose?

Arxiv

0+阅读 · 2023年7月6日

An Overview on Machine Translation Evaluation

An Overview on Machine Translation Evaluation

Arxiv

14+阅读 · 2022年2月22日

GeomCA: Geometric Evaluation of Data Representations

GeomCA: Geometric Evaluation of Data Representations

Arxiv

11+阅读 · 2021年5月26日

VIP会员

文章信息

相关主题

Machine Translation

相关VIP内容

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

高效可扩展图神经网络的研究进展，Recent Advances in Efficient and Scalable Graph Neural Networks

专知会员服务

78+阅读 · 2022年3月15日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

96+阅读 · 2020年3月12日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

前沿人工智能趋势报告（Frontier AI Trends Report）

【AAAI2026】善始则事半功倍：基于前缀优化的大语言模型推理强化学习

Andrej Karpathy：2025 年 LLM 年度回顾（2025 LLM Year in Review）

音退化问题：基于输入操控的鲁棒语音转换综述

相关资讯

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

【代码资源】GAN | 七份最热GAN文章及代码分享（Github 1000+Stars）

【代码资源】GAN | 七份最热GAN文章及代码分享（Github 1000+Stars）

专知

13+阅读 · 2018年6月24日

相关论文

A Survey on Evaluation of Large Language Models

Arxiv

0+阅读 · 2023年7月6日

Volumetric Occupancy Detection: A Comparative Analysis of Mapping Algorithms

Arxiv

0+阅读 · 2023年7月6日

Evaluating the Evaluators: Are Current Few-Shot Learning Benchmarks Fit for Purpose?

Arxiv

0+阅读 · 2023年7月6日

An Overview on Machine Translation Evaluation

An Overview on Machine Translation Evaluation

Arxiv

14+阅读 · 2022年2月22日

GeomCA: Geometric Evaluation of Data Representations

GeomCA: Geometric Evaluation of Data Representations

Arxiv

11+阅读 · 2021年5月26日

相关基金

在骨髓间充质干细胞过表达CX3CL1 与Wnt3a基因调节小胶质细胞活性促进神经元再生

国家自然科学基金

0+阅读 · 2014年12月31日

早期阿尔茨海默病中β淀粉样蛋白影响在体海马长时程抑制的机制

国家自然科学基金

0+阅读 · 2014年12月31日

PKCε通路双重调控SDF-1/CXCR4轴介导间充质干细胞归巢与旁分泌治疗DHCA诱导肺损伤研究

国家自然科学基金

0+阅读 · 2013年12月31日

microRNA调控HIF-1α介导颞叶癫痫耐药的机制研究

国家自然科学基金

0+阅读 · 2012年12月31日

脂肪因子Chemerin在骨骼肌胰岛素抵抗发生中的作用及其机制

国家自然科学基金

0+阅读 · 2008年12月31日

微信扫码咨询专知VIP会员