-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add a few igraph algorithms to run via scripts/bench.py
#54
base: main
Are you sure you want to change the base?
Conversation
Including @ntamas here you might have more feedback. I'll also take a look next week. I wanted to make it clear again that I asked about benchmarking against igraph only out of curiosity, and to better understand what performance is possible with GraphBLAS. This is not part of the pyOpenSci review. |
Understood. We have your attention, so I'd like to use it 😉 . I think many people are curious about how igraph compares. Part of the value proposition of So, yeah, I appreciate that you aren't expecting us to benchmark against Here are results from my initial benchmarking comparing GraphBLAS and igraph using this PR running on a DGX. One should be skeptical of benchmarks in general including these, and I'm sure they don't tell the full story, but the results may be interesting nevertheless: |
Codecov ReportPatch coverage has no change and project coverage change:
📣 This organization is not using Codecov’s GitHub App Integration. We recommend you install it so Codecov can continue to function properly for your repositories. Learn more Additional details and impacted files@@ Coverage Diff @@
## main #54 +/- ##
==========================================
- Coverage 69.16% 67.91% -1.25%
==========================================
Files 77 77
Lines 3152 3151 -1
Branches 602 597 -5
==========================================
- Hits 2180 2140 -40
- Misses 786 832 +46
+ Partials 186 179 -7
... and 7 files with indirect coverage changes Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. ☔ View full report in Codecov by Sentry. |
if backend == "networkx": | ||
return G | ||
if backend == "igraph": | ||
# TODO: is there a better way for igraph to read MM files or scipy.sparse arrays? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, how is one supposed to load Matrix Market files into an igraph Graph? Going from *.mtx file to scipy.sparse to networkx to igraph seems less than ideal.
You can use igraph.Graph.Adjacency
to convert from a SciPy sparse matrix.
There is no direct support for reading MM files in igraph.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just what I was looking for, thanks!
implementation="prpack", | ||
): | ||
if personalization is not None: | ||
raise NotImplementedError |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Personalization is provided through the reset
and reset_vertices
parameters of personalized_pagerank()
https://python.igraph.org/en/stable/api/igraph.GraphBase.html#personalized_pagerank
if nstart is not None: | ||
raise NotImplementedError | ||
if dangling is not None: | ||
raise NotImplementedError |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just confirming that the behaviour is the same as with dangling=None
in NetworkX.
# TODO: check results when `count_zeros=False` | ||
mode = "zero" if count_zeros else "nan" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is correct.
Probably PageRank is just not the best type of computation for benchmarking ... The results from using the default arguments are still practically useful though. I just checked what PRPACK does exactly. For graph with less than 128 vertices, it uses Gaussian elimination (which is exact, no tolerance comes into play). For larger graphs, it seems to use a variant of Gauss-Seidel iteration. igraph uses a hard-coded tolerance of |
Thanks for sharing these results! I assume these were run in parallel, which is fair, since this seems to be one of the major advantages of using GraphBLAS. But it would be interesting to see a serial comparison too. In igraph, PRPACK can use some parallelism, but it's not very effective. This can be disabled by setting the |
Agreed! I'll try to get new results this weekend.
Setting this environment variable works for |
See #53 (this PR maybe closes it).
CC @szhorvat. I'm not at all familiar with using igraph, so I'm hoping you can take a look at this. What other algorithms do you think would be easy to compare against? I think we can compare some SSSP and BFS algorithms once #51 is in. Also, how is one supposed to load Matrix Market files into an igraph Graph? Going from
*.mtx
file to scipy.sparse to networkx to igraph seems less than ideal.I'm not sure the most fair way to compare PageRank. On one hand, comparing PageRank with default arguments is fair because that's how they will typically get used in practice. On the other hand, we can view PageRank and other centrality measures as ranking metrics (i.e., order them by score), so a fair comparison could be to run iterative versions of the algorithm until the same ranking is achieved.