Add Leiden community detection algorithm docs #1014
base: memgraph-2-21
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,160 @@ | ||
--- | ||
title: leiden_community_detection | ||
description: Explore Memgraph's Leiden community detection capabilities and learn how to analyze the structure of complex networks. Access tutorials and comprehensive documentation to enhance your understanding of the Leiden community detection algorithm. | ||
--- | ||
|
||
import { Steps } from 'nextra/components' | ||
import { Callout } from 'nextra/components' | ||
import { Card, Cards } from 'nextra/components' | ||
import GitHub from '/components/icons/GitHub' | ||
|
||
# leiden_community_detection | ||
|
||
Communities in graphs mirror real-world communities, like social circles. In a | ||
graph, communities are sets of nodes. M. Girvan and M. E. J. Newman note that | ||
nodes in a community connect more intensely with each other than with outside | ||
nodes. | ||
|
||
This module employs the [Leiden | ||
algorithm](https://en.wikipedia.org/wiki/Leiden_algorithm) for community detection | ||
based on the paper [*From Louvain to Leiden: guaranteeing well-connected communities*](https://arxiv.org/abs/1810.08473). | ||
The Leiden algorithm is a hierarchical clustering algorithm that greedily optimizes modularity, merges each community into a single node, and then repeats the process on the condensed graph. | ||
It improves on the Louvain algorithm, which can identify communities that are not well-connected, | ||
by periodically subdividing communities into smaller, well-connected groups. | ||
With an $\mathcal{O}(Lm)$ runtime for $m$ edges and $L$ iterations, it suits large graphs. | ||
|
||
<Cards> | ||
<Card | ||
icon={<GitHub />} | ||
title="Source code" | ||
href="https://github.com/memgraph/mage/blob/main/cpp/leiden_community_detection_module/leiden_community_detection_module.cpp" | ||
/> | ||
</Cards> | ||
|
||
| Trait | Value | | ||
| ------------------------ | --------------------- | | ||
| **Module type** | algorithm | | ||
| **Implementation** | C++ | | ||
| **Graph direction** | undirected | | ||
| **Relationship weights** | weighted / unweighted | | ||
| **Parallelism** | parallel | | ||
|
||
## Procedures | ||
|
||
<Callout type="info"> | ||
You can execute this algorithm on [graph projections, subgraphs or portions of the graph](/advanced-algorithms/run-algorithms#run-procedures-on-subgraph). | ||
</Callout> | ||
|
||
### `get()` | ||
|
||
Computes graph communities using the Leiden algorithm. | ||
|
||
{<h4> Input: </h4>} | ||
|
||
- `subgraph: Graph` (**OPTIONAL**) ➡ A specific subgraph, which is an [object of type Graph](/advanced-algorithms/run-algorithms#run-procedures-on-subgraph) returned by the `project()` function, on which the algorithm is run. | ||
Review comment: How does the algorithm behave when subgraph is NOT provided? |
||
- `weight: string (default=null)` ➡ Specifies the default relationship weight. If not set, | ||
Review comment: Why is this a string and not a float? Does it refer to a property name? |
||
the algorithm uses the `weight` relationship attribute when present and otherwise | ||
Review comment: Relationship implies "edge". Is that what you meant? |
||
treats the graph as unweighted. | ||
- `gamma: double (default=1.0)` ➡ Resolution parameter used when computing the modularity. Internally the value is divided by the number of relationships for an unweighted graph, or the sum of weights of all relationships otherwise. | ||
Review comment: Why is gamma divided by some kind of weight? Please explain. |
||
- `theta: double (default=0.01)` ➡ Controls the randomness while breaking a community into smaller ones. | ||
- `resolution_parameter: double (default=0.01)` ➡ Minimum change in modularity that must be achieved when merging nodes within the same community. | ||
- `max_iterations: int (default=inf)` ➡ Maximum number of iterations the algorithm will perform. If set to infinity, the algorithm will run until convergence is reached. | ||
|
||
{<h4> Output: </h4>} | ||
|
||
- `node: Vertex` ➡ Graph node. | ||
Review comment: Can we give this a more descriptive label? What is special about this single node that is returned? Is it some kind of centroid for the community? |
||
- `community_id: int` ➡ Community ID. Defaults to $-1$ if the node does not belong to any community. | ||
- `communities: list` ➡ List of intermediate communities that a node has been part of across iterations. | ||
Review comment: Does this indicate the community hierarchy? |
||
|
||
{<h4> Usage: </h4>} | ||
|
||
Use the following query to detect communities: | ||
|
||
```cypher | ||
CALL leiden_community_detection.get() | ||
YIELD node, community_id, communities; | ||
``` | ||
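Once communities are computed, a common next step is to summarize them. The query below is a minimal sketch that groups the yielded rows by `community_id`. It assumes nodes carry an `id` property, as in the example dataset on this page, and the aliases `community_size` and `members` are illustrative names only.

```cypher
// Group nodes by detected community and report each community's size.
CALL leiden_community_detection.get()
YIELD node, community_id
RETURN community_id,
       count(node) AS community_size,
       collect(node.id) AS members
ORDER BY community_id;
```

The order of the IDs inside `members` is not guaranteed.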
|
||
### `get_subgraph()` | ||
|
||
Computes graph communities over a subgraph using the Leiden algorithm. | ||
Review comment: Why is Louvain on the Leiden page? I'm confused. |
||
|
||
{<h4> Input: </h4>} | ||
|
||
- `subgraph: Graph` (**OPTIONAL**) ➡ A specific subgraph, which is an [object of type Graph](/advanced-algorithms/run-algorithms#run-procedures-on-subgraph) returned by the `project()` function, on which the algorithm is run. | ||
- `subgraph_nodes: List[Node]` ➡ List of nodes in the subgraph. | ||
- `subgraph_relationships: List[Relationship]` ➡ List of relationships in the subgraph. | ||
- `weight: string (default=null)` ➡ Specifies the default relationship weight. If not set, | ||
Review comment: Does leiden require the --storage-properties-on-edges=true configuration? |
||
the algorithm uses the `weight` relationship attribute when present and otherwise | ||
treats the graph as unweighted. | ||
- `gamma: double (default=1.0)` ➡ Resolution parameter used when computing the modularity. Internally the value is divided by the number of relationships for an unweighted graph, or the sum of weights of all relationships otherwise. | ||
- `theta: double (default=0.01)` ➡ Controls the randomness while breaking a community into smaller ones. | ||
- `resolution_parameter: double (default=0.01)` ➡ Minimum change in modularity that must be achieved when merging nodes within the same community. | ||
- `max_iterations: int (default=inf)` ➡ Maximum number of iterations the algorithm will perform. If set to infinity, the algorithm will run until convergence is reached. | ||
|
||
{<h4> Output: </h4>} | ||
|
||
- `node: Vertex` ➡ Graph node. | ||
- `community_id: int` ➡ Community ID. Defaults to $-1$ if the node does not belong to any community. | ||
- `communities: list` ➡ List of intermediate communities that a node has been part of across iterations. | ||
|
||
{<h4> Usage: </h4>} | ||
|
||
Use the following query to compute communities in a subgraph: | ||
|
||
```cypher | ||
MATCH (a)-[e]-(b) | ||
WITH COLLECT(a) AS nodes, COLLECT(e) AS relationships | ||
CALL leiden_community_detection.get_subgraph(nodes, relationships) | ||
YIELD node, community_id, communities; | ||
``` | ||
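The same pattern can restrict detection to part of the graph. The sketch below assumes the `Node` label and `Relation` relationship type from the example dataset on this page. `DISTINCT` is added so that each node and relationship is collected once, since an undirected `MATCH` returns every relationship in both directions; whether the procedure tolerates duplicates is not stated here, so treat this as a precaution rather than a requirement.

```cypher
// Collect only :Node nodes connected by :Relation relationships,
// then run community detection on that subgraph alone.
MATCH (a:Node)-[e:Relation]-(b:Node)
WITH COLLECT(DISTINCT a) AS nodes, COLLECT(DISTINCT e) AS relationships
CALL leiden_community_detection.get_subgraph(nodes, relationships)
YIELD node, community_id
RETURN node.id AS node_id, community_id
ORDER BY node_id;
```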
|
||
## Example | ||
|
||
<Steps> | ||
|
||
{<h3> Database state </h3>} | ||
|
||
The database contains the following data: | ||
|
||
![](/pages/advanced-algorithms/available-algorithms/community_detection/community-detection-1.png) | ||
|
||
Created with the following Cypher queries: | ||
|
||
```cypher | ||
MERGE (a: Node {id: 0}) MERGE (b: Node {id: 1}) CREATE (a)-[r: Relation]->(b); | ||
MERGE (a: Node {id: 0}) MERGE (b: Node {id: 2}) CREATE (a)-[r: Relation]->(b); | ||
MERGE (a: Node {id: 1}) MERGE (b: Node {id: 2}) CREATE (a)-[r: Relation]->(b); | ||
MERGE (a: Node {id: 2}) MERGE (b: Node {id: 3}) CREATE (a)-[r: Relation]->(b); | ||
MERGE (a: Node {id: 3}) MERGE (b: Node {id: 4}) CREATE (a)-[r: Relation]->(b); | ||
MERGE (a: Node {id: 3}) MERGE (b: Node {id: 5}) CREATE (a)-[r: Relation]->(b); | ||
MERGE (a: Node {id: 4}) MERGE (b: Node {id: 5}) CREATE (a)-[r: Relation]->(b); | ||
``` | ||
|
||
{<h3> Detect communities </h3>} | ||
|
||
Get communities using the following query: | ||
|
||
```cypher | ||
CALL leiden_community_detection.get() | ||
YIELD node, community_id, communities | ||
RETURN node.id AS node_id, community_id, communities | ||
// Review comment: It would be great to see an example where a node is a member of more than one hierarchical community |
||
ORDER BY node_id; | ||
``` | ||
|
||
Results show which nodes belong to community 0 and which to community 1: | ||
|
||
```plaintext | ||
+--------------+--------------+--------------+ | ||
| node_id | community_id | communities | | ||
+--------------+--------------+--------------+ | ||
| 0 | 0 | [0] | | ||
| 1 | 0 | [0] | | ||
| 2 | 0 | [0] | | ||
| 3 | 1 | [1] | | ||
| 4 | 1 | [1] | | ||
| 5 | 1 | [1] | | ||
+--------------+--------------+--------------+ | ||
``` | ||
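As a sanity check on this partition, its modularity can be worked out by hand. The formula below is the standard modularity with a resolution parameter $\gamma$. It is given for illustration only and is not necessarily the exact normalization the module uses internally. Here $m$ is the number of relationships, $e_c$ is the number of relationships inside community $c$, and $K_c$ is the sum of the degrees of its nodes.

$$
Q = \sum_{c} \left[ \frac{e_c}{m} - \gamma \left( \frac{K_c}{2m} \right)^2 \right]
$$

For the graph above, $m = 7$, and each of the two communities has $e_c = 3$ and $K_c = 7$, so with $\gamma = 1$:

$$
Q = 2 \left[ \frac{3}{7} - \left( \frac{7}{14} \right)^2 \right] = \frac{5}{14} \approx 0.357
$$

Placing all six nodes in a single community would instead give $Q = 7/7 - (14/14)^2 = 0$, so the two-community split scores higher.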
|
||
</Steps> |
Review comment: Not in alphabetical order