Define C API and implement induced subgraph #2854
Conversation
d_vertices.begin(),
  [local_vertex_first] __device__(vertex_t v) { return v - local_vertex_first; });

d_local_values.resize(local_vertex_last - local_vertex_first, handle.get_stream());
Not a major issue, but static_cast<size_t>(thrust::distance(local_vertex_first, local_vertex_last)) can be used?
I could, but these are integer types (vertex_t), so subtraction feels more natural to me in this case.
Yeah... thrust::distance is for iterators not for scalars.
My bad, they are not pointer/iterator here.
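The distinction the thread settles on can be illustrated with a small host-side sketch (assumptions: std::distance stands in for thrust::distance, plain int for vertex_t, and the function names are illustrative, not from the library):

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// vertex_t-style integer ids: plain subtraction yields the element count
// directly, which is why the author prefers it here.
std::size_t count_from_ids(int local_vertex_first, int local_vertex_last)
{
  return static_cast<std::size_t>(local_vertex_last - local_vertex_first);
}

// std::distance (like thrust::distance) applies to iterators, not scalars,
// so it only becomes the natural choice once you hold actual iterators.
template <typename Iter>
std::size_t count_from_iterators(Iter first, Iter last)
{
  return static_cast<std::size_t>(std::distance(first, last));
}
```

Both forms compute the same count; the choice is purely about whether the operands are scalar ids or iterators.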
Looks good overall, but I have a few suggestions.
And I am still thinking about improving expand_sparse_offsets, so expect a few more comments.
namespace cugraph {
namespace detail {
A short description (documentation) of this function would be helpful.
Here, d_vertices should cover the entire set of vertices (assigned to this GPU in multi-GPU), right?
This is the implementation file. Documentation for this function is included in the header cugraph/detail/shuffle_wrappers.hpp.
// returns thrust::nullopt if the destination vertex has
// a property of 0, return the edge if the destination
// vertex has a property of 1
//
This comment looks outdated.
Copy/paste error, corrected in next push.
// vertex has a property of 1
//
return_type __device__ operator()(
  thrust::tuple<vertex_t, size_t> src, vertex_t dst, weight_t wgt, property_t sv, property_t dv)
src => tagged_src might be clearer.
Fixed in next push.
// returns thrust::nullopt if the destination vertex has
// a property of 0, return the edge if the destination
// vertex has a property of 1
//
This comment looks outdated.
Fixed in next push.
// a property of 0, return the edge if the destination
// vertex has a property of 1
//
return_type __device__ operator()(thrust::tuple<vertex_t, size_t> src,
src => tagged_src might be clearer.
Fixed in next push.
handle.get_stream());
thrust::sort(handle.get_thrust_policy(),
             thrust::make_zip_iterator(graph_ids_v.begin(), dst_subgraph_vertices_v.begin()),
             thrust::make_zip_iterator(graph_ids_v.end(), dst_subgraph_vertices_v.end()));
This copy might be unnecessary.
We may do something like the following.
raft::device_span<size_t const> dst_subgraph_offsets{};
raft::device_span<vertex_t const> dst_subgraph_vertices{};
std::conditional_t<multi_gpu, rmm::device_uvector<size_t>, byte_t /* dummy */> tmp_dst_subgraph_offsets{};
std::conditional_t<multi_gpu, rmm::device_uvector<vertex_t>, byte_t /* dummy */> tmp_dst_subgraph_vertices{};
if constexpr (multi_gpu) {
// update tmp_dst_subgraph_offsets|vertices as before;
dst_subgraph_offsets = raft::device_span<size_t const>(tmp_dst_subgraph_offsets.data(), tmp_dst_subgraph_offsets.size());
dst_subgraph_vertices = raft::device_span<vertex_t const>(tmp_dst_subgraph_vertices.data(), tmp_dst_subgraph_vertices.size());
}
else {
dst_subgraph_offsets = raft::device_span<size_t const>(subgraph_offsets.data(), subgraph_offsets.size());
dst_subgraph_vertices = raft::device_span<vertex_t const>(subgraph_vertices.data(), subgraph_vertices.size());
}
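The pattern being suggested can be sketched host-side (assumptions: a minimal ConstSpan struct stands in for raft::device_span<T const>, std::vector for rmm::device_uvector, and select_vertices is an illustrative name, not a library function):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Non-owning const view, standing in for raft::device_span<int const>.
struct ConstSpan {
  int const* data{nullptr};
  std::size_t size{0};
};

// Allocate temporary storage only on the multi-GPU path, then view either
// the temporary or the caller's input through a single const span, so the
// downstream code is written once against the span.
template <bool multi_gpu>
ConstSpan select_vertices(std::vector<int> const& input, std::vector<int>& tmp)
{
  ConstSpan result{};
  if constexpr (multi_gpu) {
    tmp    = input;  // placeholder for the shuffle/sort that would fill tmp
    result = ConstSpan{tmp.data(), tmp.size()};
  } else {
    result = ConstSpan{input.data(), input.size()};
  }
  return result;
}
```

The temporary vector must outlive the span, which is why the suggestion keeps tmp_dst_subgraph_offsets|vertices declared in the enclosing scope even though they are only populated in the multi-GPU branch.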
If we do this, the above deleted input parameter check is still relevant.
And group_ids_v is relevant only in the multi-GPU path, so its declaration may better be moved inside the if constexpr (multi_gpu) {} block.
I started with something like your suggestion. I could go back to that, if we prefer that approach.
During testing, I discovered that the MG path does not require the input subgraph specifications to be sorted, because we have to sort the data after we shuffle. By adding this copy/sort we can actually drop the requirement that the input be pre-sorted. I dropped the check from the expensive checks, although I just observed that I did not update the documentation in graph_functions.hpp.
If we think that this extra memory/sorting cost isn't worth it in SG, I can go back to requiring the input to be sorted and just use the provided input in SG.
Based on the work below, I think the graph_ids_v will still be required for populating the frontier.
I see, I will let you make the call. This might be more of a strategic decision than a purely technical one. It really depends on whether we want to sacrifice a little performance/memory footprint in the SG path for better usability in both the SG & MG paths.
I took out the requirement (from the documentation) that the data be sorted and made both paths sort. We can always revisit this later if we decide the sorting overhead needs to be removed in the SG case.
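The copy-then-sort approach both paths now take can be sketched host-side (assumptions: std::sort over (graph_id, vertex) pairs stands in for thrust::sort over the zip iterator, and the function name is illustrative):

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Sort the subgraph vertex list keyed by (graph_id, vertex), mirroring the
// thrust::sort over a zip iterator of graph_ids_v and dst_subgraph_vertices_v.
// Because the sort happens internally, callers no longer need to provide a
// pre-sorted subgraph specification.
void sort_subgraph_vertices(std::vector<std::size_t>& graph_ids,
                            std::vector<int>& vertices)
{
  std::vector<std::pair<std::size_t, int>> zipped(graph_ids.size());
  for (std::size_t i = 0; i < zipped.size(); ++i) {
    zipped[i] = {graph_ids[i], vertices[i]};
  }
  std::sort(zipped.begin(), zipped.end());  // lexicographic: graph_id, then vertex
  for (std::size_t i = 0; i < zipped.size(); ++i) {
    graph_ids[i] = zipped[i].first;
    vertices[i]  = zipped[i].second;
  }
}
```

The trade-off discussed above is visible here: the zipped copy costs extra memory and a sort even in SG, in exchange for dropping the pre-sorted-input requirement in both paths.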
// fill the edge list buffer (to be returned) for each vertex in the aggregate subgraph vertex
// list (use the offsets computed in Phase 1)
thrust::for_each(
// FIXME: Shouldn't there be a dummy property equivalent here?
Need to do something similar.
Thanks for the reference. I guess I was just missing the .view(). Fixed in next push.
graph_view,
vertex_frontier.bucket(0),
// edge_src_dummy_property_t{},
// edge_dst_dummy_property_t{},
edge_src_dummy_property_t{}.view()
: std::nullopt;
vertex_frontier.bucket(0).insert(pair_begin + h_subgraph_offsets[bucket_idx],
                                 pair_begin + h_subgraph_offsets[bucket_idx + 1]);
});
A concern with this code is that, if we are extracting thousands of subgraphs, this will create a long chain of kernel launches with little work in a single kernel.
Actually, we can use the same mechanism used to create graph_ids_v above: create a zip iterator of subgraph vertices and graph indices, and insert everything in a single kernel call.
I can do that and defer releasing the graph_ids_v memory until after we've populated the frontier. Will look at that for the next push.
template <typename vertex_t, typename edge_t>
rmm::device_uvector<vertex_t> expand_sparse_offsets(raft::device_span<edge_t const> offsets,
                                                    vertex_t base_vertex_id,
                                                    rmm::cuda_stream_view stream_view)
Can we use typename vertex_t here?
auto graph_ids_v = cugraph::detail::expand_sparse_offsets(
  raft::device_span<size_t const>(d_subgraph_edge_offsets.data(),
                                  d_subgraph_edge_offsets.size()),
  vertex_t{0},
  handle_->get_stream());
Here, d_subgraph_edge_offsets.size() can theoretically overflow vertex_t, right? The number of subgraphs to extract can be more than the number of vertices in the graph.
And offsets.back() isn't really the number of edges in this context, but the total number of vertices in all the subgraphs, and this can overflow edge_t as well (it may not happen in most cases, but we really can't say it will never happen).
Should we rather use idx_t or size_t for vertex_t, and offset_t for edge_t?
This is actually just a label in the templated function... I used vertex_t and edge_t out of habit more than anything else. idx_t and offset_t would definitely be better labels for this function.
Changed in the next push. Also updated some variable names.
{
  edge_t num_edges{0};

  if (offsets.size() > 0) {
So, offsets.size() should be at least 0 + 1, so I guess we'd better assume offsets.size() > 0? (Add assert(offsets.size() > 0)?)
There was, during my debugging, a case in MG where on some of the partitions the input was empty; I added this check to make sure that didn't result in a crash. I will change it to an assert.
Even if the input is empty (i.e. # subgraphs is 0), the offsets.size() should be 0 + 1 = 1 and > 0, right? In this case offsets should be [0] if I am not mistaken.
If it is specified correctly :-) . I believe it was a bug in my test code which resulted in the error.
offsets.begin() + 1,
offsets.end(),
[d_vertices = vertices.data(), n_vertices = vertices.size()] __device__(auto offset) {
  if (offset < n_vertices) { atomicAdd(&d_vertices[offset], vertex_t{1}); }
Shouldn't (offset < n_vertices) always be true? I guess this check is unnecessary.
If the last subgraph list is empty we end up with multiple entries in the array where offset == n_vertices. This check is to catch that condition and not update out-of-bounds in the vertices array.
Oh, yes, you're right.
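To make the offsets discussion concrete, here is a host-side sketch of the scatter-then-scan idea (assumptions: std::partial_sum stands in for the device inclusive scan, plain loops for the thrust::for_each, and the function is a simplified illustration, not the library implementation):

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Expand a sparse offsets array [0, o_1, ..., o_n] into dense group ids:
// output position j gets the id of the group whose range contains j.
// E.g. offsets {0, 2, 2, 5} (group 1 empty) expands to {0, 0, 2, 2, 2}.
std::vector<int> expand_sparse_offsets(std::vector<std::size_t> const& offsets,
                                       int base_id)
{
  // Even with zero groups, offsets should be {0}, hence size >= 1.
  assert(offsets.size() > 0);
  std::vector<int> ids(offsets.back(), 0);
  for (std::size_t i = 1; i < offsets.size(); ++i) {
    // Guard discussed above: trailing empty groups produce offsets equal to
    // the output length, which must not be scattered out of bounds.
    if (offsets[i] < ids.size()) { ids[offsets[i]] += 1; }
  }
  // An inclusive scan turns the group-boundary marks into group ids.
  std::partial_sum(ids.begin(), ids.end(), ids.begin());
  for (auto& v : ids) { v += base_id; }
  return ids;
}
```

Note that empty groups in the middle of the list are handled by the scan (two boundaries scatter onto the same position), while the guard only matters for boundaries at the very end of the output.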
rerun tests
rerun tests
rerun tests
Codecov Report

Additional details and impacted files

@@             Coverage Diff              @@
##           branch-22.12   #2854   +/- ##
===============================================
  Coverage              ?   62.61%
===============================================
  Files                 ?      114
  Lines                 ?     6395
  Branches              ?        0
===============================================
  Hits                  ?     4004
  Misses                ?     2391
  Partials              ?        0
===============================================

☔ View full report at Codecov.
@gpucibot merge
Add capability to C API to create an SG graph from an existing CSR. Dependent on PR #2854. Closes #2508
Authors:
- Chuck Hastings (https://github.com/ChuckHastings)
Approvers:
- Seunghwa Kang (https://github.com/seunghwak)
- Joseph Nke (https://github.com/jnke2016)
- Naim (https://github.com/naimnv)
- Rick Ratzel (https://github.com/rlratzel)
URL: #2856
Defines the C API and provides implementation of induced subgraph.
Closes #2530
Closes #2532
Closes #2533