Spark Papers - Zhuanzhi

Spark

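Apache Spark is a fast, general-purpose compute engine designed for large-scale data processing. It is a general-purpose parallel framework in the style of Hadoop MapReduce, open-sourced by the UC Berkeley AMP Lab. Spark keeps the advantages of Hadoop MapReduce, but unlike MapReduce it can hold a job's intermediate output in memory, so those results no longer have to be written to and read back from HDFS. This makes Spark a better fit for algorithms that iterate, such as those used in data mining and machine learning.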
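
The difference between the two styles can be sketched in plain Python. This is a toy illustration, not Spark or Hadoop code: a temporary JSON file stands in for HDFS, and the names `step`, `run_via_storage`, `run_in_memory`, and `compare` are hypothetical helpers invented for this example. It only counts how many storage operations each style performs for the same iterative job.

```python
import json
import os
import tempfile


def step(values):
    """One iteration of a toy iterative job (each value moves toward 2.0)."""
    return [0.5 * v + 1.0 for v in values]


def load(path, io_log):
    """Read the dataset from storage and record the operation."""
    io_log.append("read")
    with open(path) as f:
        return json.load(f)


def save(path, values, io_log):
    """Write the dataset to storage and record the operation."""
    io_log.append("write")
    with open(path, "w") as f:
        json.dump(values, f)


def run_via_storage(path, steps, io_log):
    """Chained-job style: every iteration reads its input and writes its output."""
    for _ in range(steps):
        save(path, step(load(path, io_log)), io_log)
    return load(path, io_log)


def run_in_memory(path, steps, io_log):
    """In-memory style: read once, then reuse the working set across iterations."""
    values = load(path, io_log)
    for _ in range(steps):
        values = step(values)
    return values


def compare(steps=5):
    """Run both styles on the same input; return results and storage-op counts."""
    initial = [0.0, 4.0, 10.0]
    with tempfile.TemporaryDirectory() as tmp:
        results = []
        for runner in (run_via_storage, run_in_memory):
            path = os.path.join(tmp, runner.__name__ + ".json")
            save(path, initial, [])  # setup write, not counted
            io_log = []
            results.append((runner(path, steps, io_log), len(io_log)))
    (disk_values, disk_ops), (mem_values, mem_ops) = results
    return disk_values, mem_values, disk_ops, mem_ops
```

Both runs produce the same values, but the storage-backed loop performs a read and a write on every iteration, while the in-memory loop touches storage only once. That gap grows with the number of iterations, which is why iterative workloads benefit from keeping intermediate results in memory.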

High-Dimensional Data Processing: Benchmarking Machine Learning and Deep Learning Architectures in Local and Distributed Environments

Arxiv

0+ reads · December 11

True Random Number Generators on IQM Spark

Arxiv

0+ reads · December 10

Declarative Data Pipeline for Large Scale ML Services

Arxiv

0+ reads · November 3

Declarative Data Pipeline for Large Scale ML Services

Arxiv

0+ reads · November 5

Spark-Prover-X1: Formal Theorem Proving Through Diverse Data Training

Arxiv

0+ reads · December 1

Spark-Prover-X1: Formal Theorem Proving Through Diverse Data Training

Arxiv

0+ reads · November 18

Spark-Prover-X1: Formal Theorem Proving Through Diverse Data Training

Arxiv

0+ reads · November 17

Performance and Stability of Barrier Mode Parallel Systems with Heterogeneous and Redundant Jobs

Arxiv

0+ reads · December 16

Smarter Together: Creating Agentic Communities of Practice through Shared Experiential Learning

Arxiv

0+ reads · November 11

SPARK: Stepwise Process-Aware Rewards for Reference-Free Reinforcement Learning

Arxiv

0+ reads · December 2

Riemannian-Geometric Fingerprints of Generative Models

Arxiv

0+ reads · October 28

When Intelligence Fails: An Empirical Study on Why LLMs Struggle with Password Cracking

Arxiv

0+ reads · October 26

Video-Thinker: Sparking "Thinking with Videos" via Reinforcement Learning

Arxiv

0+ reads · October 27

AQORA: A Fast Learned Adaptive Query Optimizer with Stage-Level Feedback for Spark SQL

Arxiv

0+ reads · October 27

Aircraft Collision Avoidance Systems: Technological Challenges and Solutions on the Path to Regulatory Acceptance

Arxiv

0+ reads · October 23
