From c2a91d1e9df16fdd0b1456a97a00312f514ee63b Mon Sep 17 00:00:00 2001 From: DavIvek Date: Mon, 14 Oct 2024 15:35:21 +0200 Subject: [PATCH] add leiden docs --- .../available-algorithms.md | 1 + .../leiden_community_detection.mdx | 160 ++++++++++++++++++ 2 files changed, 161 insertions(+) create mode 100644 pages/advanced-algorithms/available-algorithms/leiden_community_detection.mdx diff --git a/pages/advanced-algorithms/available-algorithms.md b/pages/advanced-algorithms/available-algorithms.md index dc76509e..43c7db99 100644 --- a/pages/advanced-algorithms/available-algorithms.md +++ b/pages/advanced-algorithms/available-algorithms.md @@ -30,6 +30,7 @@ library](/advanced-algorithms/install-mage). | [bipartite_matching](/advanced-algorithms/available-algorithms/bipartite_matching) | C++ | Algorithm for calculating maximum bipartite matching, where matching is a set of nodes chosen in such a way that no two edges share an endpoint. | | [bridges](/advanced-algorithms/available-algorithms/bridges) | C++ | A bridge is an edge, which when deleted, increases the number of connected components. The goal of this algorithm is to detect edges that are bridges in a graph. | | [community_detection](/advanced-algorithms/available-algorithms/community_detection) | C++ | The Louvain method for community detection is a greedy method for finding communities with maximum modularity in a graph. Runs in _O_(*n*log*n*) time. | +| [leiden_community_detection](/advanced-algorithms/available-algorithms/leiden_community_detection) | C++ | The Leiden method for community detection is an improvement on the Louvain method, designed to find communities with maximum modularity in a graph while addressing the issue of disconnected communities. Runs in _O_(*L* *m*) time, where *L* is the number of iterations of the algorithm and *m* is the number of edges. | | [cycles](/advanced-algorithms/available-algorithms/cycles) | C++ | Algorithm for detecting cycles on graphs. 
| | [degree_centrality](/advanced-algorithms/available-algorithms/degree_centrality) | C++ | The basic measurement of centrality that refers to the number of edges adjacent to a node. | | [distance_calculator](/advanced-algorithms/available-algorithms/distance_calculator) | C++ | Module for finding the geographical distance between two points defined with 'lng' and 'lat' coordinates. | diff --git a/pages/advanced-algorithms/available-algorithms/leiden_community_detection.mdx b/pages/advanced-algorithms/available-algorithms/leiden_community_detection.mdx new file mode 100644 index 00000000..180baebf --- /dev/null +++ b/pages/advanced-algorithms/available-algorithms/leiden_community_detection.mdx @@ -0,0 +1,160 @@ +--- +title: leiden_community_detection +description: Explore Memgraph's Leiden community detection capabilities and learn how to analyze the structure of complex networks. Access tutorials and comprehensive documentation to enhance your understanding of the Leiden community detection algorithm. +--- + +import { Steps } from 'nextra/components' +import { Callout } from 'nextra/components' +import { Card, Cards } from 'nextra/components' +import GitHub from '/components/icons/GitHub' + +# leiden_community_detection + +Communities in graphs mirror real-world communities, like social circles. In a +graph, communities are sets of nodes. M. Girvan and M. E. J. Newman note that +nodes in a community connect more intensely with each other than with outside +nodes. + +This module employs the [Leiden +algorithm](https://en.wikipedia.org/wiki/Leiden_algorithm) for community detection +based on the paper [*From Louvain to Leiden: guaranteeing well-connected communities*](https://arxiv.org/abs/1810.08473). +The Leiden algorithm is a hierarchical clustering algorithm that greedily optimizes modularity: it merges communities into single nodes and then repeats the process on the condensed graph. 
+It enhances the Louvain algorithm by addressing its limitations, particularly in situations where some identified communities are not well-connected. +This improvement is made by periodically subdividing communities into smaller, well-connected groups. +With an $\mathcal{O}(Lm)$ runtime for $m$ edges and $L$ iterations, it suits large graphs. + +<Cards> + <Card + icon={<GitHub />} + title="Source code" + href="https://github.com/memgraph/mage/blob/main/cpp/leiden_community_detection_module/leiden_community_detection_module.cpp" + /> +</Cards> + +| Trait | Value | +| ------------------------ | --------------------- | +| **Module type** | algorithm | +| **Implementation** | C++ | +| **Graph direction** | undirected | +| **Relationship weights** | weighted / unweighted | +| **Parallelism** | parallel | + +## Procedures + + +You can execute this algorithm on [graph projections, subgraphs or portions of the graph](/advanced-algorithms/run-algorithms#run-procedures-on-subgraph). + + +### `get()` + +Computes graph communities using the Leiden algorithm. + +{

Input:

} + +- `subgraph: Graph` (**OPTIONAL**) ➡ A specific subgraph, which is an [object of type Graph](/advanced-algorithms/run-algorithms#run-procedures-on-subgraph) returned by the `project()` function, on which the algorithm is run. +- `weight: string (default=null)` ➡ Specifies the default relationship weight. If not set, + the algorithm uses the `weight` relationship attribute when present and otherwise + treats the graph as unweighted. +- `gamma: double (default=1.0)` ➡ Resolution parameter used when computing the modularity. Internally the value is divided by the number of relationships for an unweighted graph, or the sum of weights of all relationships otherwise. +- `theta: double (default=0.01)` ➡ Controls the randomness while breaking a community into smaller ones. +- `resolution_parameter: double (default=0.01)` ➡ Minimum change in modularity that must be achieved when merging nodes within the same community. +- `max_iterations: int (default=inf)` ➡ Maximum number of iterations the algorithm will perform. If set to infinity, the algorithm will run until convergence is reached. + +{

Output:

} + +- `node: Vertex` ➡ Graph node. +- `community_id: integer` ➡ Community ID. Defaults to $-1$ if the node does not belong to any community. +- `communities: list` ➡ List of intermediate communities that a node has been part of across iterations. + +{

Usage:

} + +Use the following query to detect communities: + +```cypher +CALL leiden_community_detection.get() +YIELD node, community_id, communities; +``` + +### `get_subgraph()` + +Computes graph communities over a subgraph using the Leiden algorithm. + +{

Input:

} + +- `subgraph: Graph` (**OPTIONAL**) ➡ A specific subgraph, which is an [object of type Graph](/advanced-algorithms/run-algorithms#run-procedures-on-subgraph) returned by the `project()` function, on which the algorithm is run. +- `subgraph_nodes: List[Node]` ➡ List of nodes in the subgraph. +- `subgraph_relationships: List[Relationship]` ➡ List of relationships in the subgraph. +- `weight: string (default=null)` ➡ Specifies the default relationship weight. If not set, + the algorithm uses the `weight` relationship attribute when present and otherwise + treats the graph as unweighted. +- `gamma: double (default=1.0)` ➡ Resolution parameter used when computing the modularity. Internally the value is divided by the number of relationships for an unweighted graph, or the sum of weights of all relationships otherwise. +- `theta: double (default=0.01)` ➡ Controls the randomness while breaking a community into smaller ones. +- `resolution_parameter: double (default=0.01)` ➡ Minimum change in modularity that must be achieved when merging nodes within the same community. +- `max_iterations: int (default=inf)` ➡ Maximum number of iterations the algorithm will perform. If set to infinity, the algorithm will run until convergence is reached. + +{

Output:

} + +- `node: Vertex` ➡ Graph node. +- `community_id: int` ➡ Community ID. Defaults to $-1$ if the node does not belong to any community. +- `communities: list` ➡ List of intermediate communities that a node has been part of across iterations. + +{

Usage:

} + +Use the following query to compute communities in a subgraph: + +```cypher +MATCH (a)-[e]-(b) +WITH COLLECT(a) AS nodes, COLLECT(e) AS relationships +CALL leiden_community_detection.get_subgraph(nodes, relationships) +YIELD node, community_id, communities; +``` + +## Example + + + +{

Database state

} + +The database contains the following data: + +![](/pages/advanced-algorithms/available-algorithms/community_detection/community-detection-1.png) + +Created with the following Cypher queries: + +```cypher +MERGE (a: Node {id: 0}) MERGE (b: Node {id: 1}) CREATE (a)-[r: Relation]->(b); +MERGE (a: Node {id: 0}) MERGE (b: Node {id: 2}) CREATE (a)-[r: Relation]->(b); +MERGE (a: Node {id: 1}) MERGE (b: Node {id: 2}) CREATE (a)-[r: Relation]->(b); +MERGE (a: Node {id: 2}) MERGE (b: Node {id: 3}) CREATE (a)-[r: Relation]->(b); +MERGE (a: Node {id: 3}) MERGE (b: Node {id: 4}) CREATE (a)-[r: Relation]->(b); +MERGE (a: Node {id: 3}) MERGE (b: Node {id: 5}) CREATE (a)-[r: Relation]->(b); +MERGE (a: Node {id: 4}) MERGE (b: Node {id: 5}) CREATE (a)-[r: Relation]->(b); +``` + +{

Detect communities

} + +Get communities using the following query: + +```cypher +CALL leiden_community_detection.get() +YIELD node, community_id, communities +RETURN node.id AS node_id, community_id, communities +ORDER BY node_id; +``` + +Results show which nodes belong to community 0, and which to community 1: + +```plaintext ++--------------+--------------+--------------+ +| node_id | community_id | communities | ++--------------+--------------+--------------+ +| 0 | 0 | [0] | +| 1 | 0 | [0] | +| 2 | 0 | [0] | +| 3 | 1 | [1] | +| 4 | 1 | [1] | +| 5 | 1 | [1] | ++--------------+--------------+--------------+ +``` + +
\ No newline at end of file
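
As a sanity check on the example in the patch above, the following Python sketch scores the reported partition by modularity. It is independent of the module and is not its implementation: it assumes the standard Newman modularity with a resolution parameter `gamma`, $Q = \sum_c \left[ e_c / m - \gamma \, (d_c / 2m)^2 \right]$, treats the seven relationships as undirected and unweighted, and uses the illustrative names `EDGES`, `COMMUNITY`, and `modularity`, none of which come from MAGE.

```python
from collections import defaultdict

# Undirected view of the seven relationships created in the example
# (relationship direction is ignored, all weights are assumed to be 1).
EDGES = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]

# community_id per node, copied from the example result table.
COMMUNITY = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}


def modularity(edges, community, gamma=1.0):
    """Standard modularity of a partition of an unweighted, undirected graph.

    Assumption: this is the textbook definition with a resolution parameter,
    not necessarily the exact quality function the module optimizes.
    """
    m = len(edges)
    internal = defaultdict(int)    # e_c: edges with both endpoints in community c
    degree_sum = defaultdict(int)  # d_c: sum of node degrees in community c
    for u, v in edges:
        degree_sum[community[u]] += 1
        degree_sum[community[v]] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    return sum(
        internal[c] / m - gamma * (degree_sum[c] / (2 * m)) ** 2
        for c in degree_sum
    )


if __name__ == "__main__":
    # Partition from the example: {0, 1, 2} and {3, 4, 5}.
    print(round(modularity(EDGES, COMMUNITY), 4))
    # Baseline: every node in a single community.
    print(round(modularity(EDGES, {n: 0 for n in range(6)}), 4))
```

Under these assumptions the two-community partition scores 5/14 (about 0.3571), while placing all six nodes in one community scores 0, which is consistent with the split shown in the result table.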