Mining information from graph databases is becoming increasingly important. Current methods approach this problem by identifying subgraphs with specific topologies; to date, no work has been dedicated to jointly expressing the syntax and semantics of mining operations over rich property graphs. We define MINE GRAPH RULE, a new operator for mining association rules from property graph databases, following a research trend that has already been pursued for relational and XML databases. We describe the syntax and semantics of the operator, which allows measuring the support and confidence of each rule, and then present examples of increasing complexity, thereby providing a gentle introduction to the expressive power of the language, which is designed to be easy to use for GQL experts. Although the emphasis of this paper is on the syntax and semantics of the MINE GRAPH RULE operator, illustrated by several examples of use, we also developed an implementation of the operator on top of Neo4j, the most widely adopted graph database system to date; the implementation is available as a portable Neo4j plugin, which we use to showcase real-world applications. Finally, we report execution performance in a variety of synthetically generated settings, varying the text of operators, the size of the graph, the ratio between node types, the method for creating relationships, and the maximum support and confidence; we also show the operator at work on two real-life graphs, describing music playlists and archived literature respectively, and provide interesting examples of extracted association rules.