This paper examines how well Graph Neural Networks (GNNs) learn the different forms of information used for link prediction, and briefly reviews existing link prediction methods. Our analysis reveals that GNNs cannot effectively learn structural information related to the number of common neighbors between two nodes, primarily because the neighborhood aggregation scheme relies on set-based pooling. Our extensive experiments also indicate that trainable node embeddings can improve the performance of GNN-based link prediction models. Importantly, we observe that the denser the graph, the greater the improvement. We attribute this to a characteristic of node embeddings: the link state of each link sample can be encoded into the embeddings of the nodes involved in the neighborhood aggregation of the two nodes in that sample. In denser graphs, each node has more opportunities to participate in the neighborhood aggregation of other nodes and to encode the states of more link samples into its embedding, and thus learns better node embeddings for link prediction. Lastly, we demonstrate that these insights help identify the limitations of existing link prediction methods, which could guide the future development of more robust algorithms.
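The common-neighbor limitation can be illustrated with a toy sketch. It is not the paper's experimental setup, and it makes several assumptions: a 6-cycle graph, a hand-written mean-pooling layer with no learned weights (`mean_aggregate`), and one-hot vectors as an untrained stand-in for trainable node embeddings. All names are hypothetical. With identical input features, node-wise mean pooling gives every node the same representation, so the pairs (0, 2) and (0, 3) look identical even though they have 1 and 0 common neighbors respectively. With per-node embeddings the two pairs become distinguishable.

```python
def common_neighbors(adj, u, v):
    """Number of neighbors shared by u and v."""
    return len(adj[u] & adj[v])


def mean_aggregate(adj, feats, rounds=2):
    """Toy message passing: h_v <- (h_v + mean of neighbor h_u) / 2.

    Assumption: no learned weights; only the pooling step is modeled.
    """
    h = dict(feats)
    for _ in range(rounds):
        new_h = {}
        for v, nbrs in adj.items():
            dim = len(h[v])
            pooled = [sum(h[u][i] for u in nbrs) / len(nbrs) for i in range(dim)]
            new_h[v] = tuple((h[v][i] + pooled[i]) / 2 for i in range(dim))
        h = new_h
    return h


def pair_repr(h, u, v):
    """Pair representation built only from the two node representations."""
    return (h[u], h[v])


# 6-cycle: every node has exactly two neighbors.
n = 6
cycle = {v: {(v - 1) % n, (v + 1) % n} for v in range(n)}

# Case 1: identical input features (no node embeddings).
const_feats = {v: (1.0,) for v in range(n)}
h_const = mean_aggregate(cycle, const_feats)

# Case 2: one-hot features, an untrained stand-in for trainable embeddings.
onehot_feats = {v: tuple(1.0 if i == v else 0.0 for i in range(n)) for v in range(n)}
h_onehot = mean_aggregate(cycle, onehot_feats)

for u, v in [(0, 2), (0, 3)]:
    print(
        (u, v),
        "common neighbors:", common_neighbors(cycle, u, v),
        "| same as (0, 2) without embeddings:",
        pair_repr(h_const, u, v) == pair_repr(h_const, 0, 2),
        "| same as (0, 2) with embeddings:",
        pair_repr(h_onehot, u, v) == pair_repr(h_onehot, 0, 2),
    )
```

Distinguishing the two pairs is weaker than counting their common neighbors, so the sketch only shows that pooled node representations alone can hide this structural signal, and that per-node embeddings add information a link predictor could exploit.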