first commit, isolates and betweenness #2
Conversation
Isolates passes all tests; still need to pass a few betweenness tests.
Nice -- does it work? :} I guess we should get a way to test these ideas on CI.
Can you explain a little more about why all the isolates functions need to be included in the file to make it work?
What happens to the doc_strings in these files? Do they get pulled into IPython's `isolates?` help output? That's pretty fancy so maybe not. It looks like it just copies the docs from networkx. Maybe we could do that programmatically to avoid long term maintenance. No need to touch anything now, I'm just rambling about the future and long term implications, etc.
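For what it's worth, copying the docstrings over programmatically could be as small as reassigning `__doc__` on the parallel versions. A minimal sketch (the function name and layout are illustrative, not the repo's actual structure):

```python
# Hypothetical sketch: reuse the networkx docstring on the parallel
# implementation so IPython's `number_of_isolates?` shows the original docs.
import networkx as nx

def number_of_isolates(G):
    ...  # parallel implementation goes here

number_of_isolates.__doc__ = nx.number_of_isolates.__doc__
```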
Agreed! I think figuring out how to run the networkx test suite with the parallel backend (and adding this as CI for this repo) should be the top priority. Maybe we can look to the dispatching docs and/or ...
I'm looking into getting CI tests set up for this repo.
I made some changes on the nx_parallel repo to turn on CI testing there. It tests Python 3.10 and 3.11 on Linux, Windows and macOS. That means you will need to pull from this branch in your repo down to your local repo. I'm not expecting any conflicts so hopefully that will be easy. :}
There is something funky with the betweenness centrality implementation:

```python
In [1]: import nx_parallel as nxp

In [2]: import networkx as nx

In [3]: G = nx.DiGraph()

In [4]: nx.add_path(G, [0, 1, 2])

In [5]: GP = nxp.ParallelGraph(G)

In [6]: nx.betweenness_centrality(GP)
Out[6]: {0: 0.0, 1: 1.0, 2: 0.0}

In [7]: nx.betweenness_centrality(G)
Out[7]: {0: 0.0, 1: 0.5, 2: 0.0}
```
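For reference: on this directed three-node path, node 1 lies on the single shortest path 0→2, and networkx rescales directed betweenness by 1/((n-1)(n-2)) = 1/2, which gives the 0.5 above. The parallel output matches the unnormalized values, which may mean the rescaling step is being skipped. A quick check (not part of the session above):

```python
# Unnormalized betweenness for the same graph; networkx returns the raw
# shortest-path counts here, which match what the parallel backend produced.
nx.betweenness_centrality(G, normalized=False)
# {0: 0.0, 1: 1.0, 2: 0.0}
```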
I'm finishing up parallelizing closeness_vitality and the functions in tournament.py with TODOs that say "easily parallelizable"... I just realized, though, that those functions do not all have the @nx._dispatch decorator, so I'm not able to use my implementations because the dispatcher doesn't dispatch to what I made. I'm thinking I could either 1) stick to parallelizing functions that already have the dispatch decorator, or 2) add the @nx._dispatch decorator to the functions I want in order to get around this. Any thoughts?
For deciding about adding the @nx._dispatch decorator: I think you should add the decorator to the functions you want to parallelize. You might also consider splitting your nx_parallel PR into a part that adds support for functions that do have the decorator and a part for those that don't.
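For context, option 2) amounts to a change inside networkx itself, roughly like the sketch below. This is illustrative only: the decorator's exact import path and signature have shifted between networkx versions, and the closeness_vitality signature is copied from networkx's docs.

```python
# Hypothetical sketch of option 2: marking an existing networkx function as
# dispatchable so a backend such as nx_parallel can supply its own version.
import networkx as nx

@nx._dispatch
def closeness_vitality(G, node=None, weight=None, wiener_index=None):
    ...  # existing networkx implementation stays unchanged
```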
Alright, I messed around a bit... I think I will stick to making a commit that works with PR #6688. I've been able to set up and run nx_parallel with the PR. The functions I want to parallelize have all had decorators added in the PR, so I don't think I need to make my own local changes to the networkx repo. The only issue is that, since many more functions now have the dispatch decorator, I have to include their implementations (or else I get the error saying "not implemented by parallel", as discussed earlier with Mridul). But it should be fine; I don't see an immediate workaround.
Just to help me understand this -- it is giving that error when code from one of your implementations calls another function that has the dispatch decorator? Could it just be that you are passing a ParallelGraph into those functions instead of a NetworkX graph? Is there a way to unwrap the networkx graph enough to send it to the other functions while not messing up the parallel nature of what is being done?
Yup, it is only giving the error when I call another function with the dispatch decorator. There are no errors for functions I don't use. I think that trying to unwrap the graph is a good idea (maybe some extra overhead, but probably not too much). I'll try that and fiddle with pytest.
A quick way to unwrap is to use ...
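One possible shape for that kind of unwrapping inside a backend function, shown only as an illustrative sketch: it leans on the `__wrapped__` attribute and the `PG.G` wrapper attribute that both come up later in this thread, so treat it as an assumption rather than the exact suggestion made here.

```python
# Hypothetical sketch: a backend function receives the ParallelGraph wrapper,
# pulls out the plain networkx graph, and calls the undecorated networkx
# implementation directly to avoid re-entering the dispatcher.
import networkx as nx

def is_isolate(PG, n):
    G = PG.G  # assumed attribute holding the wrapped networkx graph
    return nx.is_isolate.__wrapped__(G, n)
```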
…tality and tournament

- Decided to just make things work with PR #6688 (had all the functions I needed marked with the dispatch decorator)
- More graph types and small interface changes
- Parallel implementations of closeness_vitality + tournament (I am a bit ahead of schedule)
- Made networkx tests into my own tests for nx_parallel (same directories as in networkx; can be easily run with pytest for CI)
- Ended up having to include all the functions, but didn't have to reimplement (see isolates or tournament for example)
- Added utils/chunk.py (a sketch of such a helper follows below)
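A minimal chunking helper of the kind utils/chunk.py could contain might look like this; the actual helper in the repo may use a different signature or semantics.

```python
# Hypothetical sketch: split an iterable into tuples of at most n items,
# so work can be handed out to worker processes chunk by chunk.
from itertools import islice

def chunks(iterable, n):
    it = iter(iterable)
    while chunk := tuple(islice(it, n)):
        yield chunk
```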
- Fixed betweenness tests, had some small errors in them
- Minor changes to graph class constructors
- Changed betweenness_centrality implementation, passes all tests
An even easier way to unwrap -- one we found out more about at the SciPy meeting -- is to call the underlying networkx function through its __wrapped__ attribute. Can you try this?
@20kavishs I have created a PR on your repo, 20kavishs#1, which uses the ...
…able, and cleanly annotated base classes to permit easy iteration
Redid betweenness without the convert function. Tried to use __wrapped__, but it only worked for isolates, so for consistency I kept everything the same. The errors from using __wrapped__ were because various methods were "not implemented by parallel".
Passes all tests
I think this version of nx_parallel is slow due to copying the networkx graph when we instantiate the ParallelGraph instance. Indeed we often end up converting back to networkx.Graph again so it is really not worth it. Can we try making a slimmer version of this:
Something like:

```python
from multiprocessing import cpu_count

import networkx as nx
from joblib import Parallel, delayed

# `chunks` is the helper from utils/chunk.py


class ParallelGraph:
    def __init__(self, input_graph):
        self.G = input_graph

    __networkx_plugin__ = "parallel"


def number_of_isolates(PG):
    # Call the undecorated networkx function on the wrapped graph, then
    # count the isolates in parallel, one chunk per job.
    isolates_list = list(nx.isolates.__wrapped__(PG.G))
    num_chunks = max(len(isolates_list) // cpu_count(), 1)
    isolate_chunks = chunks(isolates_list, num_chunks)
    results = Parallel(n_jobs=-1)(delayed(len)(chunk) for chunk in isolate_chunks)
    return sum(results)
```

Also note that this will only work on the main branch of networkx after PR 6688 was merged. Let's try to get this implemented and timed. It should reduce overhead by a lot.
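For illustration, usage of the slimmer wrapper would look the same from the caller's side, assuming nx_parallel registers the backend entry point so dispatching finds it (names follow the snippet above):

```python
import networkx as nx
import nx_parallel as nxp

G = nx.path_graph(1000)
PG = nxp.ParallelGraph(G)   # just holds a reference, no copy of G
nx.number_of_isolates(PG)   # dispatched to the parallel implementation
```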
Added originalGraph to parallel classes, added heatmaps + their code in the timing folder

WIP for heatmap
Basic structure
Added isolates
parallelized number_of_isolates, copied over other functions from isolate.py because they also had a dispatch decorator and that was the only way to make things work
Added betweenness centrality
Tried multiple methods, used the fastest implementation (will put more details in blog post)
Still need to pass some more tests