Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Address some MNMG issues in cython.cu #2224

Merged
merged 2 commits into from
Apr 14, 2022

Conversation

ChuckHastings
Copy link
Collaborator

Closes #2217

The issue identifies some issues if the partitioning ends up with some empty partitions. There were two issues here:

  1. The call_shuffle method determines whether a graph has weights by comparing the weight pointer to nullptr. But if an MNMG partition is empty then the weight pointer will be nullptr even if the graph has weights. Added a parameter to identify if the graph is weighted.
  2. The renumber_helper method was missing an else and was overwriting the correct value with an invalid value if the return from shuffling on a gpu had no entries.

These are corrected in the PR.

@ChuckHastings ChuckHastings requested review from a team as code owners April 13, 2022 22:39
@ChuckHastings ChuckHastings self-assigned this Apr 13, 2022
@ChuckHastings ChuckHastings added 3 - Ready for Review bug Something isn't working non-breaking Non-breaking change labels Apr 13, 2022
@ChuckHastings ChuckHastings added this to the 22.06 milestone Apr 13, 2022
@@ -594,9 +594,10 @@ std::unique_ptr<major_minor_weights_t<vertex_t, edge_t, weight_t>> call_shuffle(
edgelist_major_vertices, // [IN / OUT]: groupby_gpu_id_and_shuffle_values() sorts in-place
vertex_t* edgelist_minor_vertices, // [IN / OUT]
weight_t* edgelist_weights, // [IN / OUT]
edge_t num_edgelist_edges);
edge_t num_edgelist_edges,
bool is_weighted);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just FYI, we're using std::optional elsewhere to address this instead of adding an additional flag. Not sure it is worth the effort in applying the same strategy here as this code will become obsolete once we switch to the C-API based approach.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And yeah... this is cython, so we may not be able to use std optional here.

@codecov-commenter
Copy link

codecov-commenter commented Apr 14, 2022

Codecov Report

Merging #2224 (67ceaf7) into branch-22.06 (38be932) will increase coverage by 0.04%.
The diff coverage is n/a.

@@               Coverage Diff                @@
##           branch-22.06    #2224      +/-   ##
================================================
+ Coverage         70.82%   70.86%   +0.04%     
================================================
  Files               170      170              
  Lines             11036    11036              
================================================
+ Hits               7816     7821       +5     
+ Misses             3220     3215       -5     
Impacted Files Coverage Δ
python/cugraph/cugraph/structure/hypergraph.py 90.00% <ø> (ø)
.../cugraph/tests/test_edge_betweenness_centrality.py 84.93% <0.00%> (+0.60%) ⬆️
...graph/cugraph/tests/test_betweenness_centrality.py 83.01% <0.00%> (+0.62%) ⬆️
python/cugraph/cugraph/tests/test_graph_store.py 100.00% <0.00%> (+1.56%) ⬆️
python/cugraph/cugraph/tests/test_utils.py 75.55% <0.00%> (+4.44%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7030b83...67ceaf7. Read the comment docs.

@@ -65,7 +65,8 @@ cdef renumber_helper(shuffled_vertices_t* ptr_maj_min_w, vertex_t, weights):
move(pair_s_weights.first), weight_t, "shuffled_weights")
if shuffled_weights_series is None:
shuffled_df['value']= cudf.Series(dtype=weight_t)
shuffled_df['value']= shuffled_weights_series
else:
shuffled_df['value']= shuffled_weights_series
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe we should do the same for shuffled_minor_series and shuffled_major_series

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. Wasn't causing an error, but the code is, in fact, incorrect. Updated in the last push.

Copy link
Contributor

@jnke2016 jnke2016 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left a comment regarding adding the else statement for both shuffle_major and shuffle_minor too

@ChuckHastings
Copy link
Collaborator Author

@gpucibot merge

@rapids-bot rapids-bot bot merged commit 6a8faa0 into rapidsai:branch-22.06 Apr 14, 2022
@ChuckHastings ChuckHastings deleted the fix_2217 branch August 4, 2022 18:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working non-breaking Non-breaking change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] Failure at renumbering when passing empty dataframe
5 participants