Transformer-based models, especially large language models (LLMs), dominate the field of NLP, with mass adoption in tasks such as text generation, summarization, and fake news detection. These models offer ease of deployment and reliability for most applications; however, they require significant computational power for both training and inference. This poses challenges to their adoption in resource-constrained settings, especially in the open-source community, where compute availability is usually scarce. This work proposes a graph-based approach to Environmental Claim Detection, exploring Graph Neural Networks (GNNs) and Hyperbolic Graph Neural Networks (HGNNs) as lightweight yet effective alternatives to transformer-based models. Re-framing the task as a graph classification problem, we transform claim sentences into dependency parsing graphs, using a combination of word2vec \& learnable part-of-speech (POS) tag embeddings for the node features and encoding syntactic dependencies in the edge relations. Our results show that our graph-based models, particularly HGNNs in the Poincaré space (P-HGNNs), achieve performance superior to the state of the art on environmental claim detection while using up to \textbf{30x fewer parameters}. We also demonstrate that HGNNs benefit vastly from explicitly modeling data in hierarchical (tree-like) structures, enabling them to significantly improve over their Euclidean counterparts.
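The graph construction described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the hand-written parse, the toy 3-dimensional "word2vec" table, and the small POS vocabulary are all placeholders; a real pipeline would obtain the parse from a dependency parser (e.g. spaCy or Stanza) and the vectors from pretrained embeddings.

```python
# Illustrative sketch: each token becomes a node, each dependency arc an
# edge labelled with its relation. Node features pair a (pretrained) word
# vector with an index into what would be a learnable POS-embedding table
# inside the GNN. The parse below is hand-written, not a gold annotation.

POS_VOCAB = {"PRON": 0, "NOUN": 1, "AUX": 2, "ADJ": 3}

# (text, POS tag, head token index, dependency label); head == -1 marks the root
parse = [
    ("Our",      "PRON", 1, "poss"),
    ("products", "NOUN", 3, "nsubj"),
    ("are",      "AUX",  3, "cop"),
    ("neutral",  "ADJ", -1, "root"),
]

def to_graph(parse, word_vectors):
    """Return (node_features, edges) for one parsed claim sentence.

    node_features: list of (word_vector, pos_index) pairs; pos_index would
    look up a learnable POS embedding in the model.
    edges: list of (head, dependent, relation) triples encoding the
    syntactic dependencies.
    """
    nodes = [(word_vectors.get(tok.lower(), [0.0] * 3), POS_VOCAB[pos])
             for tok, pos, _, _ in parse]
    edges = [(head, i, rel)
             for i, (_, _, head, rel) in enumerate(parse) if head >= 0]
    return nodes, edges

# Toy embedding table purely for illustration; unknown words fall back to zeros.
w2v = {"our": [0.1, 0.0, 0.2], "products": [0.4, 0.3, 0.1]}
nodes, edges = to_graph(parse, w2v)
```

The resulting node features and edge triples are what a GNN (or, in the paper's strongest variant, a Poincaré-space HGNN) would consume for graph classification.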