Mathematical Reasoning Papers - Zhuanzhi

Mathematical Reasoning

FATE: A Formal Benchmark Series for Frontier Algebra of Multiple Difficulty Levels

Arxiv

0+ reads · November 4

FATE: A Formal Benchmark Series for Frontier Algebra of Multiple Difficulty Levels

Arxiv

0+ reads · November 6

Catch Me If You Can: How Smaller Reasoning Models Pretend to Reason with Mathematical Fidelity

Arxiv

0+ reads · November 29

DeepSeekMath-V2: Towards Self-Verifiable Mathematical Reasoning

Arxiv

0+ reads · November 27

An Investigation of Robustness of LLMs in Mathematical Reasoning: Benchmarking with Mathematically-Equivalent Transformation of Advanced Mathematical Problems

Arxiv

0+ reads · December 4

Rectify Evaluation Preference: Improving LLMs' Critique on Math Reasoning via Perplexity-aware Reinforcement Learning

Arxiv

0+ reads · November 13

IndiMathBench: Autoformalizing Mathematical Reasoning Problems with a Human Touch

Arxiv

0+ reads · November 30

SPARK: Stepwise Process-Aware Rewards for Reference-Free Reinforcement Learning

Arxiv

0+ reads · December 2

LLM-Adapters: An Adapter Family for Parameter-Efficient Fine-Tuning of Large Language Models

Arxiv

0+ reads · April 4, 2023

Self-Refine: Iterative Refinement with Self-Feedback

Arxiv

0+ reads · March 30, 2023

Natural Language Reasoning, A Survey

Arxiv

81+ reads · March 26, 2023
