first commit, isolates and betweenness #2

Merged 27 commits on Sep 11, 2023
fb31b99
first commit, isolates and betweeness
20kavishs Jun 27, 2023
40f407f
Merge branch 'main' into isolates_and_betweenness
dschult Jun 28, 2023
e9868e7
Works with PR #6688, more graph types, parallel implementations of vi…
20kavishs Jul 10, 2023
d8e1860
Fixed betweeness tests + made betweenness_centrality pass all tests
20kavishs Jul 15, 2023
007c73e
Changed betweenness
20kavishs Jul 31, 2023
8af5d8a
Parallelized efficiency_measures
20kavishs Aug 6, 2023
1da2221
added originalGraph to parallel classes, added heatmaps + their code
20kavishs Aug 27, 2023
7a4f227
.py add
20kavishs Aug 27, 2023
144caf0
fix test build pyproject.toml
dschult Aug 27, 2023
6454951
add init files for import -- might be revised
dschult Aug 27, 2023
366f441
try changing dir
dschult Aug 27, 2023
4aab588
try changing dir correctly
dschult Aug 27, 2023
35015a2
undo dir munging tries
dschult Aug 27, 2023
d49ef2c
try again
dschult Aug 27, 2023
7696c88
try tests
dschult Aug 27, 2023
c6e7974
debug widnows ci
dschult Aug 27, 2023
08b1968
debug widnows ci
dschult Aug 27, 2023
158d9e3
now get nx_parallel tests working
dschult Aug 27, 2023
b2c5fb3
show environment pre-testing
dschult Aug 27, 2023
7f2001e
try pyargs with nx_parallel
dschult Aug 27, 2023
fea4bbe
import debug
dschult Aug 27, 2023
f1d10f1
print more
dschult Aug 27, 2023
47dccd3
use python -m pytest instead of pytest
dschult Aug 27, 2023
e792122
cleanup and check all
dschult Aug 27, 2023
95277fa
Quick timing documentation update
20kavishs Aug 27, 2023
22017fe
style with black and ruff
dschult Sep 11, 2023
82f5683
set up pre-commit config to match NetworkX
dschult Sep 11, 2023
5 changes: 3 additions & 2 deletions .github/workflows/test.yml
@@ -32,9 +32,10 @@ jobs:
   run: |
     conda install -c conda-forge joblib scipy pandas pytest-cov pytest-randomly
     # matplotlib lxml pygraphviz pydot sympy # Extra networkx deps we don't need yet
-    pip install git+https://github.com/networkx/networkx.git@main --no-deps
-    pip install -e . --no-deps
+    python -m pip install git+https://github.com/networkx/networkx.git@main
+    python -m pip install .
     echo "Done with installing"
 - name: PyTest
   run: |
     NETWORKX_GRAPH_CONVERT=parallel pytest --pyargs networkx
+    python -m pytest --pyargs nx_parallel
1 change: 0 additions & 1 deletion .gitignore
@@ -127,4 +127,3 @@ dmypy.json
 
 # Pyre type checker
 .pyre/
-
31 changes: 31 additions & 0 deletions .pre-commit-config.yaml
@@ -0,0 +1,31 @@
# Install pre-commit hooks via
# pre-commit install

repos:
  - repo: https://github.com/psf/black
    rev: 23.3.0
    hooks:
      - id: black
  - repo: https://github.com/adamchainz/blacken-docs
    rev: 1.13.0
    hooks:
      - id: blacken-docs
  - repo: https://github.com/pre-commit/mirrors-prettier
    rev: v2.7.1
    hooks:
      - id: prettier
        files: \.(html|md|toml|yml|yaml)
        args: [--prose-wrap=preserve]
  - repo: https://github.com/charliermarsh/ruff-pre-commit
    rev: v0.0.258
    hooks:
      - id: ruff
        args:
          - --fix
  - repo: local
    hooks:
      - id: pyproject.toml
        name: pyproject.toml
        language: system
        entry: python tools/generate_pyproject.toml.py
        files: "pyproject.toml|requirements/.*\\.txt|tools/.*pyproject.*"
21 changes: 18 additions & 3 deletions README.md
@@ -1,7 +1,7 @@
-NX-Parallel
+nx_parallel
 -----------
 
-A NetworkX backend plugin which uses dask for parallelization.
+A NetworkX backend plugin which uses joblib and multiprocessing for parallelization.
 
 ``` python
 In [1]: import networkx as nx; import nx_parallel
@@ -23,4 +23,19 @@ Out[4]:
 8: 0.0,
 9: 0.0}
 
-```
+```
+
+Currently the following functions have parallelized implementations:
+- centrality
+- betweenness_centrality
+- tournament
+- is_reachable
+- closeness_vitality
+- efficiency_measures
+- local_efficiency
+
+![alt text](timing/heatmap_all_functions.png)
+
+See the ```/timing``` folder for more heatmaps and code for heatmap generation!
+
+
4 changes: 2 additions & 2 deletions nx_parallel/__init__.py
@@ -1,3 +1,3 @@
-from .centrality import *
-from .graph import *
+from .algorithms import *
+from .classes import *
 from .interface import *
8 changes: 8 additions & 0 deletions nx_parallel/algorithms/__init__.py
@@ -0,0 +1,8 @@
# subpackages
from .centrality import *
from .utils import *

# modules
from .efficiency_measures import *
from .isolate import *
from .tournament import *
1 change: 1 addition & 0 deletions nx_parallel/algorithms/centrality/__init__.py
@@ -0,0 +1 @@
from .betweenness import *
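The `__init__.py` files in this PR rely on star imports, so what each subpackage exposes is governed by the `__all__` list of the module being imported. The sketch below illustrates that standard Python behaviour with a throwaway in-memory module; the module name `demo_betweenness` and the function bodies are hypothetical and only mirror the names used in `betweenness.py`.

```python
import sys
import types

# Source for a throwaway module; "demo_betweenness" is a hypothetical name.
source = '''
__all__ = ["betweenness_centrality"]

def betweenness_centrality():
    return "public"

def betweenness_centrality_node_subset():
    return "helper"
'''

# Create the module in memory and register it so it can be imported.
module = types.ModuleType("demo_betweenness")
exec(source, module.__dict__)
sys.modules["demo_betweenness"] = module

# A star import copies only the names listed in __all__.
namespace = {}
exec("from demo_betweenness import *", namespace)
exported = sorted(name for name in namespace if not name.startswith("__"))
print(exported)
```

Under this rule, a helper that is left out of `__all__` stays reachable through its full module path but is not re-exported by the package-level star imports.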
121 changes: 121 additions & 0 deletions nx_parallel/algorithms/centrality/betweenness.py
@@ -0,0 +1,121 @@
from joblib import Parallel, delayed, cpu_count
from nx_parallel.algorithms.utils.chunk import chunks
from networkx.utils import py_random_state
from networkx.algorithms.centrality.betweenness import (
    _rescale,
    _single_source_shortest_path_basic,
    _single_source_dijkstra_path_basic,
    _accumulate_endpoints,
    _accumulate_basic,
)

__all__ = ["betweenness_centrality"]


@py_random_state(5)
def betweenness_centrality(
    G, k=None, normalized=True, weight=None, endpoints=False, seed=None
):
    r"""Compute shortest-path betweenness centrality for nodes in parallel.

    Betweenness centrality of a node $v$ is the sum of the
    fraction of all-pairs shortest paths that pass through $v$

    .. math::

       c_B(v) =\sum_{s,t \in V} \frac{\sigma(s, t|v)}{\sigma(s, t)}

    where $V$ is the set of nodes, $\sigma(s, t)$ is the number of
    shortest $(s, t)$-paths, and $\sigma(s, t|v)$ is the number of
    those paths passing through some node $v$ other than $s, t$.
    If $s = t$, $\sigma(s, t) = 1$, and if $v \in \{s, t\}$,
    $\sigma(s, t|v) = 0$ [2]_.

    Parameters
    ----------
    G : graph
      A NetworkX graph.

    k : int, optional (default=None)
      If k is not None, use k node samples to estimate betweenness.
      The value of k must satisfy k <= n, where n is the number of nodes in the graph.
      Higher values give a better approximation.

    normalized : bool, optional
      If True, the betweenness values are normalized by `2/((n-1)(n-2))`
      for graphs, and `1/((n-1)(n-2))` for directed graphs, where `n`
      is the number of nodes in G.

    weight : None or string, optional (default=None)
      If None, all edge weights are considered equal.
      Otherwise holds the name of the edge attribute used as weight.
      Weights are used to calculate weighted shortest paths, so they are
      interpreted as distances.

    endpoints : bool, optional
      If True, include the endpoints in the shortest path counts.

    seed : integer, random_state, or None (default)
        Indicator of random number generation state.
        See :ref:`Randomness<randomness>`.
        Note that this is only used if k is not None.

    Returns
    -------
    nodes : dictionary
       Dictionary of nodes with betweenness centrality as the value.

    Notes
    -----
    This algorithm is a parallelized version of betweenness centrality in NetworkX.
    Nodes are divided into chunks based on the number of available processors,
    and otherwise all calculations are similar.
    """
    if k is None:
        nodes = G.nodes
    else:
        nodes = seed.sample(list(G.nodes), k)
    total_cores = cpu_count()
    num_chunks = max(len(nodes) // total_cores, 1)
    node_chunks = list(chunks(nodes, num_chunks))
    bt_cs = Parallel(n_jobs=total_cores)(
        delayed(betweenness_centrality_node_subset)(
            G,
            chunk,
            weight,
            endpoints,
        )
        for chunk in node_chunks
    )

    # Reducing partial solution
    bt_c = bt_cs[0]
    for bt in bt_cs[1:]:
        for n in bt:
            bt_c[n] += bt[n]

    betweenness = _rescale(
        bt_c,
        len(G),
        normalized=normalized,
        directed=G.is_directed(),
        k=k,
        endpoints=endpoints,
    )
    return betweenness


def betweenness_centrality_node_subset(G, nodes, weight=None, endpoints=False):
    betweenness = dict.fromkeys(G, 0.0)
    for s in nodes:
        # single source shortest paths
        if weight is None:  # use BFS
            S, P, sigma, _ = _single_source_shortest_path_basic(G, s)
        else:  # use Dijkstra's algorithm
            S, P, sigma, _ = _single_source_dijkstra_path_basic(G, s, weight)
        # accumulation
        if endpoints:
            betweenness, delta = _accumulate_endpoints(betweenness, S, P, sigma, s)
        else:
            betweenness, delta = _accumulate_basic(betweenness, S, P, sigma, s)
    return betweenness
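The split, accumulate, reduce, and rescale steps in `betweenness.py` can be followed on a tiny graph with the standard library alone. The sketch below is not the PR's code: it runs sequentially instead of through joblib, handles only unweighted undirected graphs without endpoints, and uses hypothetical names (`partial_betweenness`, a stand-in `chunks` helper assumed to yield pieces of at most `n` items, and a fixed `total_cores = 2`). It reproduces the documented normalization on the path graph 0 - 1 - 2 - 3.

```python
from collections import deque
from itertools import islice


def chunks(iterable, n):
    """Yield tuples of at most n items (hypothetical stand-in for the helper)."""
    it = iter(iterable)
    while True:
        piece = tuple(islice(it, n))
        if not piece:
            return
        yield piece


def partial_betweenness(adj, sources):
    """Accumulate unnormalized betweenness contributions from the given sources."""
    score = dict.fromkeys(adj, 0.0)
    for s in sources:
        # Breadth-first search that counts shortest paths (sigma) from s.
        order, preds = [], {v: [] for v in adj}
        sigma = dict.fromkeys(adj, 0.0)
        sigma[s] = 1.0
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Back-propagate dependencies from the farthest nodes toward s.
        delta = dict.fromkeys(adj, 0.0)
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                score[w] += delta[w]
    return score


# Undirected path graph 0 - 1 - 2 - 3 as an adjacency mapping.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
total_cores = 2  # assumed for the illustration
chunk_size = max(len(adj) // total_cores, 1)
partials = [partial_betweenness(adj, c) for c in chunks(adj, chunk_size)]

# Reduce: sum the per-chunk dictionaries node by node.
merged = dict.fromkeys(adj, 0.0)
for part in partials:
    for node, value in part.items():
        merged[node] += value

# Each unordered pair is counted once from each endpoint, so the documented
# factor 2 / ((n - 1)(n - 2)) on pair counts becomes 1 / ((n - 1)(n - 2)) here.
n = len(adj)
scale = 1.0 / ((n - 1) * (n - 2))
betweenness = {node: value * scale for node, value in merged.items()}
print(betweenness)
```

Because every source contributes independently to the running totals, the per-chunk dictionaries can be computed in any order and summed afterwards, which is what makes the work straightforward to hand to separate processes.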