Add `{all_pairs,single_source}_bellman_ford_path_length` #44

eriknw · 2023-02-11T06:33:12Z

That is, add these to nxapi:

all_pairs_bellman_ford_path_length (only implemented in nxapi for now)
single_source_bellman_ford_path_length

I did a slightly different implementation of Bellman-Ford. It seems to run faster in my limited testing. This has not been heavily benchmarked and tuned, so there could be room for improvement.

This algorithm could probably be optimized further when the adjacency matrix is known to be iso-valued.

That is, add these to `nxapi`: - `all_pairs_bellman_ford_path_length` - `single_source_bellman_ford_path_length`

codecov-commenter · 2023-02-11T06:37:56Z

Codecov Report

Base: 72.48% // Head: 68.87% // Decreases project coverage by -3.62% ⚠️

Coverage data is based on head (bbcdd07) compared to base (0b649b2).
Patch coverage: 32.46% of modified lines in pull request are covered.

📣 This organization is not using Codecov’s GitHub App Integration. We recommend you install it so Codecov can continue to function properly for your repositories. Learn more

Additional details and impacted files

@@            Coverage Diff             @@
##             main      #44      +/-   ##
==========================================
- Coverage   72.48%   68.87%   -3.62%     
==========================================
  Files          72       74       +2     
  Lines        2777     3023     +246     
  Branches      516      574      +58     
==========================================
+ Hits         2013     2082      +69     
- Misses        587      753     +166     
- Partials      177      188      +11

Impacted Files	Coverage Δ
graphblas_algorithms/nxapi/exception.py	`26.08% <0.00%> (-2.49%)`	⬇️
graphblas_algorithms/classes/_caching.py	`41.46% <16.66%> (-2.29%)`	⬇️
...s_algorithms/algorithms/shortest_paths/weighted.py	`20.14% <20.14%> (ø)`
...phblas_algorithms/nxapi/shortest_paths/weighted.py	`24.39% <24.39%> (ø)`
graphblas_algorithms/classes/graph.py	`56.53% <38.80%> (-3.01%)`	⬇️
graphblas_algorithms/classes/digraph.py	`38.61% <45.45%> (-1.39%)`	⬇️
graphblas_algorithms/algorithms/centrality/katz.py	`91.17% <50.00%> (ø)`
...lgorithms/algorithms/link_analysis/pagerank_alg.py	`82.45% <50.00%> (-0.31%)`	⬇️
graphblas_algorithms/nxapi/cluster.py	`76.69% <64.28%> (-0.86%)`	⬇️
graphblas_algorithms/algorithms/core.py	`100.00% <100.00%> (ø)`
... and 11 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

eriknw · 2023-02-11T06:38:03Z

graphblas_algorithms/algorithms/shortest_paths/weighted.py

+        # Mask is True where cur not in d or cur < d
+        mask(cur.S, replace) << True  # or: `mask << unary.one[bool](cur)`
+        mask(binary.second) << binary.lt(cur & d)
+
+        # Drop values from `cur` that didn't improve
+        cur(mask.V, replace) << cur


FYI, a very similar pattern to this also showed up in Floyd-Warshall

graphblas-algorithms/graphblas_algorithms/algorithms/shortest_paths/dense.py

Lines 67 to 73 in 0b649b2

# Update Outer to only include off-diagonal values that will update D and P.

if is_directed:

Mask << indexunary.offdiag(Outer)

else:

Mask << indexunary.triu(Outer, 1)

Mask(binary.second) << binary.lt(Outer & D)

Outer(Mask.V, replace) << Outer

where we need to set a mask according to a comparison and set the mask to True for new values.

Just thought this was interesting. It's neat to see common patterns.

eriknw · 2023-02-11T06:43:44Z

graphblas_algorithms/nxapi/shortest_paths/weighted.py

+def all_pairs_bellman_ford_path_length(G, weight="weight"):
+    # TODO: what if weight is a function?
+    # How should we implement and call `algorithms.all_pairs_bellman_ford_path_length`?
+    # Should we compute in chunks to expose more parallelism?


I'm not sure what the best API for a backend implementation (i.e., to have in graphblas_algorithms.algorithms). Returning a Matrix seems risky.

Maybe bellman_ford_path_lengths(G, nodes=None), which will return a Matrix with distances for the specified nodes (default to all nodes). Then, nxapi.all_pairs_bellman_ford_path_length can compute in batches to get increased parallelism (assuming doing so is actually faster).

eriknw · 2023-02-12T04:36:00Z

@LuisFelipeRamos, all_pairs_bellman_ford_path_length might be useful to you. Do you only need distances, or do you need more such as predecessors?

I like that we can compare these to scipy.sparse.csgraph algorithms. For SSSP, our Bellman-Ford is about the same speed as (or a little faster than) scipy's Dijkstra SSSP, which I think is impressive. I don't know what's up with scipy.sparse.csgraph.bellman_ford--it's super slow, like 15000x slower.

For all-pairs shortest path, our version with batching Bellman-Ford is 2-3x faster than scipy.sparse.csgraph.dijkstra! Batching gives us more parallelism, and we can do even better with larger batches.

My benchmarks aren't very exhaustive. They're also on my laptop (4 cores, 8 threads). I would expect GraphBLAS to do even better running on a beefier system (such as DGX that has 40 cores / 80 threads).

My variant of SSSP Bellman-Ford is about 3-4x faster than the "simple" GraphBLAS implementation that uses d(min) << min_plus(d @ A) and comparison to previous value. It's interesting that we're able to find optimizations for most algorithms in graphblas-algorithms with just a little bit of experimentation.

I don't know how our all-pairs shortest path with Bellman-Ford compares to Floyd-Warshall except that Bellman-ford is a generator and uses much less memory.

eriknw · 2023-02-12T04:40:24Z

graphblas_algorithms/nxapi/shortest_paths/weighted.py

+]
+
+
+def all_pairs_bellman_ford_path_length(G, weight="weight", *, chunksize=128):


We're adding a new keyword-only argument to nxapi. @jim22k, do you like chunksize for this?

We have a similar extra keyword-argument nchunks in nxapi.square_clustering that controls how much memory is used during calculation.

Neither one, nchunks or chunksize, changes the result; they're only used to control memory use and performance via available parallelism.

eriknw · 2023-02-13T06:44:47Z

graphblas_algorithms/algorithms/shortest_paths/weighted.py

+    # Use `offdiag` instead of `A`, b/c self-loops don't contribute to the result,
+    # and negative self-loops are easy negative cycles to avoid.
+    # We check if we hit a self-loop negative cycle at the end.
+    A, has_negative_diagonal = G.get_properties("offdiag has_negative_diagonal")


Yet another change from the "simple" implementation: don't use diagonal values. I think "offdiag" is a property we'll use regularly, so I think we shouldn't be afraid to use it if it's useful.

The downside is possible extra memory use and some micro-benchmarks may perform worse 🤷‍♂️

(Also, added new properties has_negative_diagonal and has_negative_edges*).

lframosferreira · 2023-02-13T19:12:34Z

Hi Erik! For my problem, I only need distances. I tested the Floyd-Warshall implementation with my graphs but it was very slow, probably because they are connected, as you said in our meeting. But that Bellman-Ford implementation sounds great! I'll try ASAP!

…

________________________________ De: Erik Welch ***@***.***> Enviado: domingo, 12 de fevereiro de 2023 01:36 Para: python-graphblas/graphblas-algorithms ***@***.***> Cc: Luís Felipe Ramos Ferreira ***@***.***>; Mention ***@***.***> Assunto: Re: [python-graphblas/graphblas-algorithms] Add `{all_pairs,single_source}_bellman_ford_path_length` (PR #44) @LuisFelipeRamos<https://github.com/LuisFelipeRamos>, all_pairs_bellman_ford_path_length might be useful to you. Do you only need distances, or do you need more such as predecessors? I like that we can compare these to scipy.sparse.csgraph algorithms. For SSSP, our Bellman-Ford is about the same speed as (or a little faster than) scipy's Dijkstra SSSP, which I think is impressive. I don't know what's up with scipy.sparse.csgraph.bellman_ford--it's super slow, like 15000x slower. For all-pairs shortest path, our version with batching Bellman-Ford is 2-3x faster than scipy.sparse.csgraph.dijkstra! Batching gives us more parallelism, and we can do even better with larger batches. My benchmarks aren't very exhaustive. They're also on my laptop (4 cores, 8 threads). I would expect GraphBLAS to do even better running on a beefier system (such as DGX that has 40 cores / 80 threads). My variant of SSSP Bellman-Ford is about 3-4x faster than the "simple" GraphBLAS implementation that uses d(min) << min_plus(d @ A) and comparison to previous value. It's interesting that we're able to find optimizations for most algorithms in graphblas-algorithms with just a little bit of experimentation. I don't know how our all-pairs shortest path with Bellman-Ford compares to Floyd-Warshall except that Bellman-ford is a generator and uses much less memory. — Reply to this email directly, view it on GitHub<#44 (comment)>, or unsubscribe<https://github.com/notifications/unsubscribe-auth/ANVHAVJUALXYTK2Z5BODANDWXBSDXANCNFSM6AAAAAAUYQ6KQ4>. You are receiving this because you were mentioned.Message ID: ***@***.***>

eriknw · 2023-02-13T20:21:49Z

Hi Erik! For my problem, I only need distances. I tested the Floyd-Warshall implementation with my graphs but it was very slow, probably because they are connected, as you said in our meeting. But that Bellman-Ford implementation sounds great! I'll try ASAP!

Cool! I'm eager to see how this turns out. Let us know if you want any help or if I should improve anything in this PR.

Did Floyd-Warshall finish, or was it too large?

eriknw · 2023-02-13T20:22:40Z

Also, @LuisFelipeRamos, is your graph directed or undirected, and are the edges all the same value or different values?

lframosferreira · 2023-02-13T23:28:39Z

They are undirected and unweighted.

…

________________________________ De: Erik Welch ***@***.***> Enviado: segunda-feira, 13 de fevereiro de 2023 17:22 Para: python-graphblas/graphblas-algorithms ***@***.***> Cc: Luís Felipe Ramos Ferreira ***@***.***>; Mention ***@***.***> Assunto: Re: [python-graphblas/graphblas-algorithms] Add `{all_pairs,single_source}_bellman_ford_path_length` (PR #44) Also, @LuisFelipeRamos<https://github.com/LuisFelipeRamos>, is your graph directed or undirected, and are the edges all the same value or different values? — Reply to this email directly, view it on GitHub<#44 (comment)>, or unsubscribe<https://github.com/notifications/unsubscribe-auth/ANVHAVOACXZLRXPLUZ5Y4KTWXKJZXANCNFSM6AAAAAAUYQ6KQ4>. You are receiving this because you were mentioned.Message ID: ***@***.***>

eriknw · 2023-02-14T02:38:12Z

They are undirected and unweighted.

Good to know! We should be able to further optimize for this case, which I think is pretty much level-BFS.

eriknw · 2023-02-14T06:22:54Z

Good news @LuisFelipeRamos--I just doubled the performance of Bellman-Ford for unweighted graphs! (and also for graph that are iso-valued)

I do this by performing BFS that stores the level. That this is only twice as fast suggests to me that our Bellman-Ford implementation is probably pretty solid.

lframosferreira · 2023-02-14T16:46:47Z

That's great news!

eriknw · 2023-02-15T19:11:42Z

Alright, I think this is good to go in.

I just updated chunksize="auto" to "all pairs" to automatically choose a reasonable chunksize.

eriknw · 2023-02-17T05:18:17Z

This is in! 🎉

Add {all_pairs,single_source}_bellman_ford_path_length

c9ed0ab

That is, add these to `nxapi`: - `all_pairs_bellman_ford_path_length` - `single_source_bellman_ford_path_length`

eriknw commented Feb 11, 2023

View reviewed changes

eriknw added 2 commits February 11, 2023 01:19

Faster (and clearer)

caa3e83

Implement algorithms.bellman_ford_path_lengths to compute in chunks

343fcb8

eriknw commented Feb 12, 2023

View reviewed changes

eriknw requested a review from SultanOrazbayev February 12, 2023 04:40

eriknw added 3 commits February 12, 2023 11:46

Allow to compute all at once

2ca6b0d

Ignore diagonals during Bellman-Ford

704937e

Add comment, and use offdiag more places.

4cbbfe9

eriknw commented Feb 13, 2023

View reviewed changes

Do level BFS for Bellman-Ford when iso-valued (and non-negative)

a644bf8

Use "iso_value" property more places instead of A.ss.iso_value

aaa19c1

eriknw added 3 commits February 14, 2023 11:09

Fail fast in these unlikely, but easily detected, cases

b82c8af

Allow garbage collector to be enabled during benchmarks

ba337d4

Automatically choose appropriate chunksize

4370b2d

eriknw added 2 commits February 17, 2023 05:50

Use nsplits="auto" in square_clustering (default to 256 MB chunks)

7268d80

Add note about A.ss.is_iso

bbcdd07

eriknw merged commit 1180e1c into python-graphblas:main Feb 17, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Add `{all_pairs,single_source}_bellman_ford_path_length` #44

Add `{all_pairs,single_source}_bellman_ford_path_length` #44

Uh oh!

eriknw commented Feb 11, 2023

Uh oh!

codecov-commenter commented Feb 11, 2023 •

edited

Loading

Uh oh!

eriknw Feb 11, 2023 •

edited

Loading

Uh oh!

eriknw Feb 11, 2023

Uh oh!

eriknw commented Feb 12, 2023

Uh oh!

eriknw Feb 12, 2023

Uh oh!

eriknw Feb 13, 2023 •

edited

Loading

Uh oh!

lframosferreira commented Feb 13, 2023 via email

Uh oh!

eriknw commented Feb 13, 2023

Uh oh!

eriknw commented Feb 13, 2023

Uh oh!

lframosferreira commented Feb 13, 2023 via email

Uh oh!

eriknw commented Feb 14, 2023

Uh oh!

eriknw commented Feb 14, 2023

Uh oh!

lframosferreira commented Feb 14, 2023

Uh oh!

eriknw commented Feb 15, 2023

Uh oh!

eriknw commented Feb 17, 2023

Uh oh!

Uh oh!

	# Update Outer to only include off-diagonal values that will update D and P.
	if is_directed:
	Mask << indexunary.offdiag(Outer)
	else:
	Mask << indexunary.triu(Outer, 1)
	Mask(binary.second) << binary.lt(Outer & D)
	Outer(Mask.V, replace) << Outer

		]


		def all_pairs_bellman_ford_path_length(G, weight="weight", *, chunksize=128):

Add {all_pairs,single_source}_bellman_ford_path_length #44

Add {all_pairs,single_source}_bellman_ford_path_length #44

Uh oh!

Conversation

eriknw commented Feb 11, 2023

Uh oh!

codecov-commenter commented Feb 11, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

eriknw Feb 11, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

eriknw Feb 11, 2023

Choose a reason for hiding this comment

Uh oh!

eriknw commented Feb 12, 2023

Uh oh!

eriknw Feb 12, 2023

Choose a reason for hiding this comment

Uh oh!

eriknw Feb 13, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

lframosferreira commented Feb 13, 2023 via email

Uh oh!

eriknw commented Feb 13, 2023

Uh oh!

eriknw commented Feb 13, 2023

Uh oh!

lframosferreira commented Feb 13, 2023 via email

Uh oh!

eriknw commented Feb 14, 2023

Uh oh!

eriknw commented Feb 14, 2023

Uh oh!

lframosferreira commented Feb 14, 2023

Uh oh!

eriknw commented Feb 15, 2023

Uh oh!

eriknw commented Feb 17, 2023

Uh oh!

Uh oh!

Add `{all_pairs,single_source}_bellman_ford_path_length` #44

Add `{all_pairs,single_source}_bellman_ford_path_length` #44

codecov-commenter commented Feb 11, 2023 •

edited

Loading

eriknw Feb 11, 2023 •

edited

Loading

eriknw Feb 13, 2023 •

edited

Loading