This report presents Pelican-VL 1.0, a new family of open-source embodied brain models spanning 7 billion to 72 billion parameters. Our mission is explicit: to embed powerful intelligence into diverse embodiments. Pelican-VL 1.0 is currently the largest-scale open-source embodied multimodal brain model. Its core advantage lies in the deep integration of data power with an intelligent, adaptive learning mechanism: a metaloop distills a high-quality dataset from a raw corpus of over 4 billion tokens. Pelican-VL 1.0 is trained on a large-scale cluster of more than 1,000 A800 GPUs, consuming over 50,000 A800 GPU-hours per checkpoint. This yields a 20.3% performance uplift over its base model and a 10.6% margin over 100B-scale open-source counterparts, placing it on par with leading proprietary systems on well-known embodied benchmarks. To train Pelican-VL 1.0, we establish DPPO (Deliberate Practice Policy Optimization), a novel framework inspired by human metacognition. We operationalize it as a metaloop, an RL-Refine-Diagnose-SFT cycle that teaches the model to practice deliberately.
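
To make the training procedure concrete, below is a minimal, hypothetical Python sketch of the RL-Refine-Diagnose-SFT metaloop, under one plausible reading of the abstract. All names here (rl_phase, refine_phase, diagnose_phase, sft_phase, dppo_metaloop) are illustrative placeholders, not the authors' actual implementation or API.

```python
# Hypothetical sketch of the DPPO metaloop described above.
# The phase functions are illustrative stubs: a real system would plug in
# an RL trainer, a data-refinement pipeline, a weakness diagnoser, and an
# SFT trainer. Nothing below is the paper's actual code.

def rl_phase(model, dataset):
    """RL exploration: optimize the policy and collect rollouts."""
    rollouts = [f"rollout on {task}" for task in dataset]  # stub
    return model, rollouts

def refine_phase(rollouts):
    """Refine: filter raw rollouts into high-quality candidate data."""
    return [r for r in rollouts if r]  # stub quality filter

def diagnose_phase(model, refined):
    """Diagnose: mine the refined data for systematic failure modes
    (the metacognitive self-assessment step)."""
    return ["example weak skill"]  # stub diagnosis

def sft_phase(model, refined, weaknesses):
    """SFT: supervised fine-tuning targeted at diagnosed weaknesses."""
    return model  # stub update

def dppo_metaloop(model, raw_dataset, num_cycles=3):
    """Alternate the four phases for a fixed number of cycles."""
    for _ in range(num_cycles):
        model, rollouts = rl_phase(model, raw_dataset)
        refined = refine_phase(rollouts)
        weaknesses = diagnose_phase(model, refined)
        model = sft_phase(model, refined, weaknesses)
    return model

if __name__ == "__main__":
    dppo_metaloop(model=object(), raw_dataset=["task-1", "task-2"])
```

The design intuition, as the abstract frames it, is deliberate practice: RL exposes what the model cannot yet do, refinement and diagnosis turn those failures into targeted training data, and SFT consolidates the correction before the next round of exploration.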



