
Define C API and implement induced subgraph #2854

Merged

Conversation

ChuckHastings
Collaborator

Defines the C API and provides implementation of induced subgraph.

Closes #2530
Closes #2532
Closes #2533

@ChuckHastings ChuckHastings requested review from a team as code owners October 26, 2022 18:39
@ChuckHastings ChuckHastings self-assigned this Oct 26, 2022
@ChuckHastings ChuckHastings added 3 - Ready for Review improvement Improvement / enhancement to an existing function non-breaking Non-breaking change labels Oct 26, 2022
@ChuckHastings ChuckHastings added this to the 22.12 milestone Oct 26, 2022
d_vertices.begin(),
[local_vertex_first] __device__(vertex_t v) { return v - local_vertex_first; });

d_local_values.resize(local_vertex_last - local_vertex_first, handle.get_stream());
Contributor

Not a major issue, but couldn't static_cast<size_t>(thrust::distance(local_vertex_first, local_vertex_last)) be used?

Collaborator Author

I could, but these are integer types (vertex_t) so subtraction feels more natural to me in this case.

Contributor

Yeah... thrust::distance is for iterators, not for scalars.

Contributor

My bad, they are not pointers/iterators here.

Contributor

@seunghwak seunghwak left a comment

Looks good overall, but I have a few suggestions.

And I am still thinking about improving expand_sparse_offsets, so expect a few more comments.


namespace cugraph {
namespace detail {

Contributor

A short description (documentation) of this function would be helpful.

Here, d_vertices should cover the entire set of vertices (assigned to this GPU in multi-GPU), right?

Collaborator Author

This is the implementation file. Documentation for this function is included in the header cugraph/detail/shuffle_wrappers.hpp.

// returns thrust::nullopt if the destination vertex has
// a property of 0, return the edge if the destination
// vertex has a property of 1
//
Contributor

This comment looks outdated.

Collaborator Author

Copy/paste error, corrected in next push.

// vertex has a property of 1
//
return_type __device__ operator()(
thrust::tuple<vertex_t, size_t> src, vertex_t dst, weight_t wgt, property_t sv, property_t dv)
Contributor

src=>tagged_src might be clearer.

Collaborator Author

Fixed in next push.

// returns thrust::nullopt if the destination vertex has
// a property of 0, return the edge if the destination
// vertex has a property of 1
//
Contributor

This comment looks outdated.

Collaborator Author

Fixed in next push.

// a property of 0, return the edge if the destination
// vertex has a property of 1
//
return_type __device__ operator()(thrust::tuple<vertex_t, size_t> src,
Contributor

src=>tagged_src might be clearer.

Collaborator Author

Fixed in next push.

handle.get_stream());
thrust::sort(handle.get_thrust_policy(),
thrust::make_zip_iterator(graph_ids_v.begin(), dst_subgraph_vertices_v.begin()),
thrust::make_zip_iterator(graph_ids_v.end(), dst_subgraph_vertices_v.end()));
Contributor

This copy might be unnecessary.

We may do something like the following.

raft::device_span<size_t const> dst_subgraph_offsets{};
raft::device_span<vertex_t const> dst_subgraph_vertices{};
std::conditional_t<multi_gpu, rmm::device_uvector<size_t>, byte_t /* dummy */> tmp_dst_subgraph_offsets{};
std::conditional_t<multi_gpu, rmm::device_uvector<vertex_t>, byte_t /* dummy */> tmp_dst_subgraph_vertices{};
if constexpr (multi_gpu) {
  // update tmp_dst_subgraph_offsets|vertices as before;
  dst_subgraph_offsets = raft::device_span<size_t const>(tmp_dst_subgraph_offsets.data(), tmp_dst_subgraph_offsets.size());
  dst_subgraph_vertices = raft::device_span<vertex_t const>(tmp_dst_subgraph_vertices.data(), tmp_dst_subgraph_vertices.size());
}
else {
  dst_subgraph_offsets = raft::device_span<size_t const>(subgraph_offsets.data(), subgraph_offsets.size());
  dst_subgraph_vertices = raft::device_span<vertex_t const>(subgraph_vertices.data(), subgraph_vertices.size());
}

Contributor

If we do this, the above deleted input parameter check is still relevant.

Contributor

And graph_ids_v is relevant only in the multi-GPU path, so its declaration may be better moved inside the if constexpr (multi_gpu) {} block.

Collaborator Author

I started with something like your suggestion. I could go back to that, if we prefer that approach.

During testing, I discovered that the MG path does not require the input subgraph specifications to be sorted, because we have to sort the data after we shuffle. By adding this copy/sort we can actually drop the requirement that the input be pre-sorted. I dropped the check from the expensive checks, although I just observed that I did not update the documentation in graph_functions.hpp.

If we think that this extra memory/sorting cost isn't worth it in SG, I can go back to requiring the input to be sorted and just use the provided input in SG.

Collaborator Author

Based on the work below, I think the graph_ids_v will still be required for populating the frontier.

Contributor

I see, I will let you make the call. This might be more of a strategic decision than a purely technical one. It really depends on whether we want to sacrifice a little performance/memory footprint in the SG path for better usability in both the SG & MG paths.

Collaborator Author

I took out the requirement (from the documentation) that the data be sorted and made both paths sort. We can always revisit this later if we decide the sorting overhead needs to be removed in the SG case.

// fill the edge list buffer (to be returned) for each vertex in the aggregate subgraph vertex
// list (use the offsets computed in the Phase 1)
thrust::for_each(
// FIXME: Shouldn't there be a dummy property equivalent here?
Contributor
Collaborator Author

Thanks for the reference. I guess I was just missing the .view(). Fixed in next push.

graph_view,
vertex_frontier.bucket(0),
// edge_src_dummy_property_t{},
// edge_dst_dummy_property_t{},
Contributor

edge_src_dummy_property_t{}.view()

: std::nullopt;
vertex_frontier.bucket(0).insert(pair_begin + h_subgraph_offsets[bucket_idx],
pair_begin + h_subgraph_offsets[bucket_idx + 1]);
});
Contributor

A concern with this code is that, if we are extracting thousands of subgraphs, this will create a long chain of kernel launches with little work in a single kernel.

Actually, we can use the same mechanism to create graph_ids_v above, and create a zip iterator of subgraph vertices and graph indices, insert everything in a single kernel call.

Collaborator Author

I can do that and defer releasing the graph_ids_v memory until after we've populated the frontier. Will look at that for the next push.

template <typename vertex_t, typename edge_t>
rmm::device_uvector<vertex_t> expand_sparse_offsets(raft::device_span<edge_t const> offsets,
vertex_t base_vertex_id,
rmm::cuda_stream_view stream_view)
Contributor

Can we use typename vertex_t here?

auto graph_ids_v = cugraph::detail::expand_sparse_offsets(
  raft::device_span<size_t const>(d_subgraph_edge_offsets.data(),
                                  d_subgraph_edge_offsets.size()),
  vertex_t{0},
  handle_->get_stream());

Here, d_subgraph_edge_offsets.size() can theoretically overflow vertex_t, right? The number of subgraphs to extract can be more than the number of vertices in the graph.

And offsets.back() isn't really the number of edges in this context but the total number of vertices in all the subgraphs, and this can overflow edge_t as well (it may not happen in most cases, but we really can't say it will never happen).

Would it be better to use idx_t or size_t in place of vertex_t, and offset_t in place of edge_t?

Collaborator Author

This is actually just a label in the templated function... I used vertex_t and edge_t out of habit more than anything else. idx_t and offset_t would definitely be better labels for this function.

Changed in the next push. Also updated some variable names.

{
edge_t num_edges{0};

if (offsets.size() > 0) {
Contributor

So offsets.size() should be at least 0 + 1, so I guess we'd better assume offsets.size() > 0 (add assert(offsets.size() > 0)?).

Collaborator Author

There was, during my debugging, a case in MG where the input was empty on some of the partitions; I added this check to make sure that didn't result in a crash. I will change it to an assert.

Contributor

Even if the input is empty (i.e., the number of subgraphs is 0), offsets.size() should be 0 + 1 = 1 and > 0, right? In this case offsets should be [0], if I am not mistaken.

Collaborator Author

If it is specified correctly :-). I believe it was a bug in my test code that resulted in the error.

offsets.begin() + 1,
offsets.end(),
[d_vertices = vertices.data(), n_vertices = vertices.size()] __device__(auto offset) {
if (offset < n_vertices) { atomicAdd(&d_vertices[offset], vertex_t{1}); }
Contributor

Shouldn't (offset < n_vertices) always be true? I guess this check is unnecessary.

Collaborator Author

If the last subgraph list is empty, we end up with multiple entries in the array where offset == n_vertices. This check is to catch that condition and not update out-of-bounds in the vertices array.

Contributor

Oh, yes, you're right.

@ChuckHastings
Collaborator Author

rerun tests

@ChuckHastings
Collaborator Author

rerun tests

@ChuckHastings
Collaborator Author

rerun tests

@codecov-commenter

Codecov Report

❗ No coverage uploaded for pull request base (branch-22.12@856e3ba). Click here to learn what that means.
Patch has no changes to coverable lines.

Additional details and impacted files
@@               Coverage Diff               @@
##             branch-22.12    #2854   +/-   ##
===============================================
  Coverage                ?   62.61%           
===============================================
  Files                   ?      114           
  Lines                   ?     6395           
  Branches                ?        0           
===============================================
  Hits                    ?     4004           
  Misses                  ?     2391           
  Partials                ?        0           


@ChuckHastings
Collaborator Author

@gpucibot merge

@rapids-bot rapids-bot bot merged commit cc2048d into rapidsai:branch-22.12 Nov 3, 2022
rapids-bot bot pushed a commit that referenced this pull request Nov 4, 2022
Add capability to C API to create an SG graph from an existing CSR.

Dependent on PR #2854 

Closes #2508

Authors:
  - Chuck Hastings (https://github.com/ChuckHastings)

Approvers:
  - Seunghwa Kang (https://github.com/seunghwak)
  - Joseph Nke (https://github.com/jnke2016)
  - Naim (https://github.com/naimnv)
  - Rick Ratzel (https://github.com/rlratzel)

URL: #2856
@ChuckHastings ChuckHastings deleted the define_capi_induced_subgraph branch December 2, 2022 18:35