Machine learning (ML)-based steering can improve the performance of ensemble-based simulations by enabling online selection of more scientifically meaningful computations. We present DeepDriveMD, a framework for ML-driven steering of scientific simulations, which we have used to achieve orders-of-magnitude improvements in molecular dynamics (MD) performance by effectively coupling ML and HPC on large parallel computers. We discuss the design of DeepDriveMD and characterize its performance. We demonstrate that DeepDriveMD can accelerate protein folding simulations by 100-1000x relative to other methods, as measured by the amount of simulated time performed, while covering the same conformational landscape, as quantified by the states sampled during a simulation. Experiments are performed on leadership-class platforms using up to 1020 nodes. The results establish DeepDriveMD as a high-performance framework for ML-driven HPC simulation scenarios that supports diverse MD simulation and ML back-ends and enables new scientific insights by extending the length and time scales accessible with current computing capacity.