Graph neural networks (GNNs) process large-scale graphs consisting of hundreds of billions of edges. In contrast to traditional deep learning, emerging GNNs operate on large sets of graph and embedding data that reside on storage, which requires complex and irregular preprocessing. We propose a novel deep learning framework on large graphs, HolisticGNN, that provides an easy-to-use, near-storage inference infrastructure for fast, energy-efficient GNN processing. To achieve the best end-to-end latency and high energy efficiency, HolisticGNN allows users to implement various GNN algorithms and directly executes them where the actual data exist, in a holistic manner. It also enables RPC over PCIe so that users can simply program GNNs through a graph semantic library without any knowledge of the underlying hardware or storage configurations. We fabricate HolisticGNN's hardware RTL and implement its software on an FPGA-based computational SSD (CSSD). Our empirical evaluations show that the inference time of HolisticGNN outperforms GNN inference services using high-performance modern GPUs by 7.1x while reducing energy consumption by 33.2x, on average.
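To make the "program GNNs through a graph semantic library over RPC" idea concrete, below is a minimal, hypothetical host-side sketch. It is not the paper's actual API: the endpoint address, method names (add_edges, put_embeddings, register_gnn, infer), and the use of Python's stdlib XML-RPC client as the transport are assumptions for illustration only; in HolisticGNN the RPC traffic is tunneled over PCIe, so the user never handles hardware or storage details directly.

```python
# Hypothetical sketch of the host-side flow described in the abstract:
# the user interacts with the computational SSD purely through RPC calls
# of a graph semantic library. All names below are illustrative assumptions.
import xmlrpc.client

# Stand-in endpoint; in the real system the driver exposes the device as an
# RPC peer reachable over PCIe rather than a TCP address.
cssd = xmlrpc.client.ServerProxy("http://localhost:9090")

# 1. Ship (or update) graph and embedding data that live on the device.
cssd.add_edges([(0, 1), (1, 2), (2, 0)])                      # hypothetical call
cssd.put_embeddings({0: [0.1, 0.2], 1: [0.3, 0.4], 2: [0.5, 0.6]})

# 2. Register a GNN program written against the graph semantic library;
#    the device executes it near the stored data.
gnn_id = cssd.register_gnn(name="two_layer_gcn", layers=2, hidden_dim=16)

# 3. Request inference on target vertices; preprocessing (subgraph lookup,
#    embedding fetch) and computation both happen inside the CSSD.
result = cssd.infer(gnn_id, [0])
print(result)
```

The point of the sketch is the division of labor: the host issues semantic, graph-level requests, while data movement and preprocessing stay on the device where the graph and embeddings are stored.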