
ENH : Adding backend_info entry point #27

Merged 26 commits on Jan 28, 2024
Commits
7561cf9
added n_jobs kwarg and removed unnecessary docs
Schefflera-Arboricola Jan 1, 2024
3da6091
added back cpu_count
Schefflera-Arboricola Jan 1, 2024
0d6ac99
rm cpu_count
Schefflera-Arboricola Jan 2, 2024
baf17a3
updated cpu_count and pyproject.toml
Schefflera-Arboricola Jan 2, 2024
56db8b2
updated .pre-commit-config.yaml
Schefflera-Arboricola Jan 2, 2024
ada49f9
backends to plugins(already done in PR 26)
Schefflera-Arboricola Jan 2, 2024
7e06a5f
updated docs
Schefflera-Arboricola Jan 2, 2024
8c3c163
Merge branch 'networkx:main' into update_all
Schefflera-Arboricola Jan 5, 2024
aa2a159
updated docs of betweenness_centrality and interface.py
Schefflera-Arboricola Jan 6, 2024
3ee2650
updated docs of all funcs and removed n_jobs
Schefflera-Arboricola Jan 6, 2024
c3a4427
added desc about parallel implementation step in all funcs
Schefflera-Arboricola Jan 6, 2024
7102d24
Merge branch 'networkx:main' into update_all
Schefflera-Arboricola Jan 6, 2024
312833f
Added egs to vitality.py
Schefflera-Arboricola Jan 6, 2024
360e13f
added get_info func and Parallel Computation section in all funcs
Schefflera-Arboricola Jan 11, 2024
b37e266
fixed get_info()
Schefflera-Arboricola Jan 12, 2024
4815a29
rm docs and backend_examples
Schefflera-Arboricola Jan 17, 2024
d85ab43
fixed get_info
Schefflera-Arboricola Jan 17, 2024
b1bb3ba
moved get_info to backend.py
Schefflera-Arboricola Jan 17, 2024
d9c4d6a
removed docstring pytest(bcoz no egs in docs)
Schefflera-Arboricola Jan 17, 2024
b07eace
style fix
Schefflera-Arboricola Jan 18, 2024
827c288
made this PR independent of PR-7219
Schefflera-Arboricola Jan 19, 2024
8fa0ba2
Merge branch 'networkx:main' into update_all
Schefflera-Arboricola Jan 19, 2024
12ea240
making new PR for pre-commit-config update
Schefflera-Arboricola Jan 21, 2024
9a06c2a
updated get_info()
Schefflera-Arboricola Jan 21, 2024
f3e62e3
moved tournament func renaming to pr 32
Schefflera-Arboricola Jan 22, 2024
544648f
Merge branch 'main' into update_all
Schefflera-Arboricola Jan 25, 2024
1 change: 0 additions & 1 deletion .github/workflows/test.yml
@@ -43,4 +43,3 @@ jobs:
NETWORKX_FALLBACK_TO_NX=True \
python -m pytest --pyargs networkx

python -m pytest --doctest-modules --pyargs nx_parallel
2 changes: 2 additions & 0 deletions nx_parallel/algorithms/__init__.py
@@ -1,7 +1,9 @@
# subpackages
from .centrality import *
from .shortest_paths import *

# modules
from .efficiency_measures import *
from .isolate import *
from .tournament import *
from .vitality import *
64 changes: 4 additions & 60 deletions nx_parallel/algorithms/centrality/betweenness.py
@@ -7,7 +7,6 @@
_single_source_shortest_path_basic,
)
from networkx.utils import py_random_state

import nx_parallel as nxp

__all__ = ["betweenness_centrality"]
@@ -17,60 +16,10 @@
def betweenness_centrality(
G, k=None, normalized=True, weight=None, endpoints=False, seed=None
):
r"""Parallel Compute shortest-path betweenness centrality for nodes

Betweenness centrality of a node $v$ is the sum of the
fraction of all-pairs shortest paths that pass through $v$

.. math::

c_B(v) =\sum_{s,t \in V} \frac{\sigma(s, t|v)}{\sigma(s, t)}

where $V$ is the set of nodes, $\sigma(s, t)$ is the number of
shortest $(s, t)$-paths, and $\sigma(s, t|v)$ is the number of
those paths passing through some node $v$ other than $s, t$.
If $s = t$, $\sigma(s, t) = 1$, and if $v \in {s, t}$,
$\sigma(s, t|v) = 0$ [2]_.

Parameters
----------
G : graph
A NetworkX graph.

k : int, optional (default=None)
If k is not None use k node samples to estimate betweenness.
The value of k <= n where n is the number of nodes in the graph.
Higher values give better approximation.

normalized : bool, optional
If True the betweenness values are normalized by `2/((n-1)(n-2))`
for graphs, and `1/((n-1)(n-2))` for directed graphs where `n`
is the number of nodes in G.

weight : None or string, optional (default=None)
If None, all edge weights are considered equal.
Otherwise holds the name of the edge attribute used as weight.
Weights are used to calculate weighted shortest paths, so they are
interpreted as distances.

endpoints : bool, optional
If True include the endpoints in the shortest path counts.

seed : integer, random_state, or None (default)
Indicator of random number generation state.
See :ref:`Randomness<randomness>`.
Note that this is only used if k is not None.

Returns
-------
nodes : dictionary
Dictionary of nodes with betweenness centrality as the value.
"""The parallel computation is implemented by dividing the
nodes into chunks and computing betweenness centrality for each chunk concurrently.

Notes
-----
This algorithm is a parallelized version of betweenness centrality in NetworkX.
Nodes are divided into chunks based on the number of available processors,
and otherwise all calculations are similar.
networkx.betweenness_centrality : https://networkx.org/documentation/stable/reference/algorithms/generated/networkx.algorithms.centrality.betweenness_centrality.html
"""
if hasattr(G, "graph_object"):
G = G.graph_object
Expand All @@ -85,12 +34,7 @@ def betweenness_centrality(
node_chunks = nxp.chunks(nodes, num_in_chunk)

bt_cs = Parallel(n_jobs=total_cores)(
delayed(_betweenness_centrality_node_subset)(
G,
chunk,
weight,
endpoints,
)
delayed(_betweenness_centrality_node_subset)(G, chunk, weight, endpoints)
for chunk in node_chunks
)

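The chunk-and-dispatch pattern this diff settles on (split the node list by `nxp.cpu_count()`, hand each chunk to a joblib worker) can be sketched with the standard library alone. `chunks` below is a hypothetical stand-in for `nxp.chunks`, not the library's own implementation:

```python
from itertools import islice

def chunks(iterable, n):
    # Hypothetical stand-in for nxp.chunks: yield successive n-sized tuples.
    it = iter(iterable)
    while chunk := tuple(islice(it, n)):
        yield chunk

nodes = list(range(10))
cpu_count = 4  # stand-in for nxp.cpu_count()
num_in_chunk = max(len(nodes) // cpu_count, 1)
node_chunks = list(chunks(nodes, num_in_chunk))
print(node_chunks)
# In the PR, each chunk is then dispatched with
# Parallel(n_jobs=cpu_count)(delayed(work)(G, chunk) for chunk in node_chunks)
```

With 10 nodes and 4 cores this yields five 2-node chunks, so there are more chunks than workers and joblib keeps every core busy.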
58 changes: 12 additions & 46 deletions nx_parallel/algorithms/efficiency_measures.py
@@ -1,66 +1,32 @@
"""Provides functions for computing the efficiency of nodes and graphs."""
import networkx as nx
from joblib import Parallel, delayed

import nx_parallel as nxp

__all__ = ["local_efficiency"]

"""Helper to interface between graph types"""


def local_efficiency(G):
"""Returns the average local efficiency of the graph.

The *efficiency* of a pair of nodes in a graph is the multiplicative
inverse of the shortest path distance between the nodes. The *local
efficiency* of a node in the graph is the average global efficiency of the
subgraph induced by the neighbors of the node. The *average local
efficiency* is the average of the local efficiencies of each node [1]_.

Parameters
----------
G : :class:`networkx.Graph`
An undirected graph for which to compute the average local efficiency.

Returns
-------
float
The average local efficiency of the graph.

Examples
--------
>>> G = nx.Graph([(0, 1), (0, 2), (0, 3), (1, 2), (1, 3)])
>>> nx.local_efficiency(G)
0.9166666666666667
"""The parallel computation is implemented by dividing the
nodes into chunks and then computing and adding global efficiencies of all node
in all chunks, in parallel, and then adding all these sums and dividing by the
total number of nodes at the end.

Notes
-----
Edge weights are ignored when computing the shortest path distances.
networkx.local_efficiency : https://networkx.org/documentation/stable/reference/algorithms/generated/networkx.algorithms.efficiency_measures.local_efficiency.html#local-efficiency
"""

See also
--------
global_efficiency
def _local_efficiency_node_subset(G, nodes):
return sum(nx.global_efficiency(G.subgraph(G[v])) for v in nodes)

References
----------
.. [1] Latora, Vito, and Massimo Marchiori.
"Efficient behavior of small-world networks."
*Physical Review Letters* 87.19 (2001): 198701.
<https://doi.org/10.1103/PhysRevLett.87.198701>
"""
if hasattr(G, "graph_object"):
G = G.graph_object

total_cores = nxp.cpu_count()
num_in_chunk = max(len(G.nodes) // total_cores, 1)
cpu_count = nxp.cpu_count()

num_in_chunk = max(len(G.nodes) // cpu_count, 1)
node_chunks = list(nxp.chunks(G.nodes, num_in_chunk))

efficiencies = Parallel(n_jobs=total_cores)(
efficiencies = Parallel(n_jobs=cpu_count)(
delayed(_local_efficiency_node_subset)(G, chunk) for chunk in node_chunks
)
return sum(efficiencies) / len(G)


def _local_efficiency_node_subset(G, nodes):
return sum(nx.global_efficiency(G.subgraph(G[v])) for v in nodes)
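The docstring's scheme — one partial sum of global efficiencies per chunk, then a single division by `len(G)` — is a plain map-reduce. A minimal sketch, assuming toy per-node values in place of `nx.global_efficiency(G.subgraph(G[v]))` and a stdlib thread pool in place of joblib:

```python
from concurrent.futures import ThreadPoolExecutor

# Toy per-node "efficiency" values standing in for the real
# nx.global_efficiency(G.subgraph(G[v])) call, which walks the graph.
eff = {v: 0.5 for v in range(8)}
node_chunks = [(0, 1, 2), (3, 4), (5, 6, 7)]

def _subset_sum(chunk):
    # Mirrors _local_efficiency_node_subset: one partial sum per chunk.
    return sum(eff[v] for v in chunk)

with ThreadPoolExecutor(max_workers=3) as ex:
    partial_sums = list(ex.map(_subset_sum, node_chunks))

avg_local_eff = sum(partial_sums) / len(eff)
print(avg_local_eff)  # average over all 8 nodes
```

Because each chunk contributes an independent sum, the reduce step is associative and the final average does not depend on how the nodes were split.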
28 changes: 11 additions & 17 deletions nx_parallel/algorithms/isolate.py
@@ -1,32 +1,26 @@
import networkx as nx
from joblib import Parallel, delayed

import nx_parallel as nxp

__all__ = ["number_of_isolates"]


def number_of_isolates(G):
"""Returns the number of isolates in the graph. Parallel implementation.

An *isolate* is a node with no neighbors (that is, with degree
zero). For directed graphs, this means no in-neighbors and no
out-neighbors.

Parameters
----------
G : NetworkX graph

Returns
-------
int
The number of degree zero nodes in the graph `G`.
"""The parallel computation is implemented by dividing the list
of isolated nodes into chunks and then finding the length of each chunk in parallel
and then adding all the lengths at the end.

networkx.number_of_isolates : https://networkx.org/documentation/stable/reference/algorithms/generated/networkx.algorithms.isolate.number_of_isolates.html#number-of-isolates
"""
if hasattr(G, "graph_object"):
G = G.graph_object

cpu_count = nxp.cpu_count()

isolates_list = list(nx.isolates(G))
num_in_chunk = max(len(isolates_list) // nxp.cpu_count(), 1)
num_in_chunk = max(len(isolates_list) // cpu_count, 1)
isolate_chunks = nxp.chunks(isolates_list, num_in_chunk)
results = Parallel(n_jobs=-1)(delayed(len)(chunk) for chunk in isolate_chunks)
results = Parallel(n_jobs=cpu_count)(
delayed(len)(chunk) for chunk in isolate_chunks
)
return sum(results)
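The counting step above is just `len` mapped over the chunks and summed. A stdlib sketch, with a made-up isolate list in place of `list(nx.isolates(G))` and a thread pool in place of joblib:

```python
from concurrent.futures import ThreadPoolExecutor

isolates_list = ["a", "b", "c", "d", "e"]  # stand-in for list(nx.isolates(G))
cpu_count = 2  # stand-in for nxp.cpu_count()
num_in_chunk = max(len(isolates_list) // cpu_count, 1)
isolate_chunks = [
    isolates_list[i : i + num_in_chunk]
    for i in range(0, len(isolates_list), num_in_chunk)
]
with ThreadPoolExecutor(max_workers=cpu_count) as ex:
    lengths = list(ex.map(len, isolate_chunks))
print(sum(lengths))  # total number of isolates
```

Note the diff also changes `n_jobs=-1` to `n_jobs=cpu_count`, so the worker count and the chunk count are derived from the same value.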
56 changes: 7 additions & 49 deletions nx_parallel/algorithms/shortest_paths/weighted.py
@@ -1,70 +1,28 @@
from joblib import Parallel, delayed
from networkx.algorithms.shortest_paths.weighted import single_source_bellman_ford_path

import nx_parallel as nxp

__all__ = ["all_pairs_bellman_ford_path"]


def all_pairs_bellman_ford_path(G, weight="weight"):
"""Compute shortest paths between all nodes in a weighted graph.

Parameters
----------
G : NetworkX graph

weight : string or function (default="weight")
If this is a string, then edge weights will be accessed via the
edge attribute with this key (that is, the weight of the edge
joining `u` to `v` will be ``G.edges[u, v][weight]``). If no
such edge attribute exists, the weight of the edge is assumed to
be one.

If this is a function, the weight of an edge is the value
returned by the function. The function must accept exactly three
positional arguments: the two endpoints of an edge and the
dictionary of edge attributes for that edge. The function must
return a number.

Returns
-------
paths : iterator
(source, dictionary) iterator with dictionary keyed by target and
shortest path as the key value.
"""The parallel computation is implemented by computing the
shortest paths for each node concurrently.

Notes
-----
Edge weight attributes must be numerical.
Distances are calculated as sums of weighted edges traversed.

Examples
--------
>>> import networkx as nx
>>> G = nx.Graph()
>>> G.add_weighted_edges_from([(1, 0, 1), (1, 2, 1), (2, 0, 3)])
>>> path = dict(nx.all_pairs_bellman_ford_path(G))
>>> path[0][2]
[0, 1, 2]
>>> parallel_path = dict(nx.all_pairs_bellman_ford_path(G, backend="parallel"))
>>> parallel_path[0][2]
[0, 1, 2]
>>> import nx_parallel as nxp
>>> parallel_path_ = dict(nx.all_pairs_bellman_ford_path(nxp.ParallelGraph(G)))
>>> parallel_path_[0][2]
[0, 1, 2]
networkx.all_pairs_bellman_ford_path : https://networkx.org/documentation/stable/reference/algorithms/generated/networkx.algorithms.shortest_paths.weighted.all_pairs_bellman_ford_path.html#all-pairs-bellman-ford-path
"""

def _calculate_shortest_paths_subset(source):
return (source, single_source_bellman_ford_path(G, source, weight=weight))

if hasattr(G, "graph_object"):
G = G.graph_object

nodes = G.nodes
cpu_count = nxp.cpu_count()

total_cores = nxp.cpu_count()
nodes = G.nodes

paths = Parallel(n_jobs=total_cores, return_as="generator")(
paths = Parallel(n_jobs=cpu_count, return_as="generator")(
delayed(_calculate_shortest_paths_subset)(source) for source in nodes
)
return paths
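Here the unit of work is a single source rather than a chunk, and `return_as="generator"` lets callers consume `(source, paths)` pairs lazily. A stdlib approximation, with a toy adjacency dict standing in for `G` and for the per-source `single_source_bellman_ford_path` call:

```python
from concurrent.futures import ThreadPoolExecutor

# Toy adjacency standing in for G; the real code calls
# single_source_bellman_ford_path(G, source, weight=weight) per source.
graph = {0: [1], 1: [0, 2], 2: [1]}

def _paths_from(source):
    # Returns (source, {target: path}) pairs, like the delayed helper above.
    return source, {t: [source, t] for t in graph[source]}

with ThreadPoolExecutor(max_workers=2) as ex:
    paths_iter = ex.map(_paths_from, graph)  # lazy, like return_as="generator"
    paths = dict(paths_iter)

print(paths[0])  # {1: [0, 1]}
```

Keeping the result a generator matters for all-pairs routines: materializing every shortest-path dict for a large graph at once can dominate memory, while a generator lets the caller process one source at a time.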