Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Test select_random_vertices for all possible values of flags #4042

Merged
merged 14 commits into from
Dec 12, 2023

Conversation

naimnv
Copy link
Contributor

@naimnv naimnv commented Dec 5, 2023

Test select_random_vertices for all possible values of flags.

@naimnv naimnv requested a review from a team as a code owner December 5, 2023 00:13
Copy link

copy-pr-bot bot commented Dec 5, 2023

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@naimnv naimnv self-assigned this Dec 5, 2023
@naimnv naimnv added improvement Improvement / enhancement to an existing function non-breaking Non-breaking change labels Dec 5, 2023
@@ -903,6 +903,9 @@ weight_t compute_total_edge_weight(
* @param select_count The number of vertices to select from the graph
* @param with_replacement If true, select with replacement, if false select without replacement
* @param sort_vertices If true, return the sorted vertices (in the ascending order).
* @param shuffle_int_to_local If true and If @p given_set is not specified
* then shuffle internal (i.e. renumbered) vertices to their local GPUs based on vertex
* partitioning, otherwise shuffle as many vertices as local vertex partition size to each GPU.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why should we care about renumbering here? And why apply this flag only when @p given_set is not specified?

Copy link
Contributor Author

@naimnv naimnv Dec 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I have to change all of these.
Objective: For a graph with N vertices, we want to generate N unique random numbers in range [0, N) (across GPUs) and we don't want to shuffle the generated numbers to GPUs based on vertex partitioning, and each GPU gets graph_view.local_vertex_partition_range_size() many random numbers,
while keeping supporting the existing behavior of select_random_vertices.

Copy link
Contributor Author

@naimnv naimnv Dec 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make sense now @seunghwak ?

Copy link
Collaborator

@ChuckHastings ChuckHastings Dec 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess I'm wondering why this even needs to be part of the select_random_vertices call.

While there is randomness to the feature that we need, I think all that we need is a "distributed shuffle" function. Given an rmm::device_uvector<T> on each GPU that is initialized with some values, we want the equivalent of thrust::shuffle of that distributed device vector.

I'm not aware of an algorithm that would need to both select a random subset of the vertex list and randomly order that list across all of the GPUs. If there was such an algorithm it could first call select_random_vertices and then call a new randomly shuffle a distributed device vector function.

I don't see a benefit of conflating the two actions into a single function.

Copy link
Contributor Author

@naimnv naimnv Dec 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes havebeen be aborted. Shuffling will be done by the ECG implementation itself.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this PR is not meant to be merged, should we better close this PR?

Copy link
Contributor Author

@naimnv naimnv Dec 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually all the changes to the select_random_vertices has been aborted, but the test has been updated. Please review the tests.

@naimnv naimnv added this to the 24.02 milestone Dec 5, 2023
@naimnv naimnv changed the title Update select_random_vertices to select as many random vertices as local vertex partition range size on each GPU Test select_random_vertices for all possible values of flags Dec 12, 2023
@naimnv
Copy link
Contributor Author

naimnv commented Dec 12, 2023

/ok to test

@naimnv
Copy link
Contributor Author

naimnv commented Dec 12, 2023

/ok to test

Copy link
Contributor

@seunghwak seunghwak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@naimnv
Copy link
Contributor Author

naimnv commented Dec 12, 2023

/ok to test

@ChuckHastings
Copy link
Collaborator

/merge

@rapids-bot rapids-bot bot merged commit 23e70a6 into rapidsai:branch-24.02 Dec 12, 2023
75 of 76 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cuGraph improvement Improvement / enhancement to an existing function non-breaking Non-breaking change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants