
support heterogeneous fanout type #4608

Open
wants to merge 70 commits into base: branch-24.10

Conversation

Contributor

@jnke2016 jnke2016 commented Aug 13, 2024

closes #4589
closes #4591

Collaborator

@ChuckHastings ChuckHastings left a comment

Some thoughts on changing the API a bit.

raft::random::RngState& rng_state,
bool return_hops,
bool with_replacement = true,
prior_sources_behavior_t prior_sources_behavior = prior_sources_behavior_t::DEFAULT,
bool dedupe_sources = false,
bool do_expensive_check = false);

#if 0
/* FIXME:
There are two options to support heterogeneous fanout
Collaborator

Here's another option to explore.

Create a new function called neighbor_sample. Create it off of the biased sampling API, but with the following changes:

  1. the biases become optional instead of required. Then the same call can do either uniform or biased sampling, just based on whether the biases are included or not
  2. the fanout and heterogeneous fanout as you have defined them. Or we might explore using std::variant, where it would take either a host_span or a tuple of host_spans and make the right choice internally (a rough sketch of this is below)
  3. Move the rng_state parameter to be right after the handle (before the graph_view). This feels like a better standard place for the parameter.

We can then mark the existing uniform_neighbor_sample and biased_neighbor_sample as deprecated. When we implement this, the internal C++ implementation can just call the new neighbor_sample with the parameters properly configured. This makes it a non-breaking change (eventually we'll drop the old functions) while still increasing code reuse.
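
For illustration only, here is a minimal, self-contained sketch of the std::variant idea in point 2, using plain std:: types rather than the actual cugraph/raft types and a made-up function name. The fan_out parameter is either a flat per-hop list (homogeneous) or a (values, hops-per-type) pair (heterogeneous), and the choice is made internally:

#include <cstdint>
#include <cstdio>
#include <tuple>
#include <variant>
#include <vector>

using homogeneous_fanout_t   = std::vector<int32_t>;             // one fanout value per hop
using heterogeneous_fanout_t = std::tuple<std::vector<int32_t>,  // flattened per-(type, hop) fanout values
                                          std::vector<int32_t>>; // number of hops per edge type

void neighbor_sample_sketch(
  std::variant<homogeneous_fanout_t, heterogeneous_fanout_t> const& fan_out)
{
  if (std::holds_alternative<homogeneous_fanout_t>(fan_out)) {
    auto const& per_hop = std::get<homogeneous_fanout_t>(fan_out);
    std::printf("homogeneous fanout over %zu hops\n", per_hop.size());
  } else {
    auto const& [values, hops_per_type] = std::get<heterogeneous_fanout_t>(fan_out);
    std::printf("heterogeneous fanout over %zu edge types\n", hops_per_type.size());
  }
}

int main()
{
  neighbor_sample_sketch(homogeneous_fanout_t{10, 10});                      // same fanout for all edge types
  neighbor_sample_sketch(heterogeneous_fanout_t{{10, 5, 25, 10}, {2, 2}});   // 2 edge types, 2 hops each
  return 0;
}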

Thoughts @seunghwak ?

Contributor

@seunghwak seunghwak Aug 14, 2024

  1. the biases become optional instead of required. Then it can do either uniform or biased in the same call just by whether the biases are included or not

=> In this case, we may update the existing non-heterogeneous-fanout sampling functions as well, i.e. combine the uniform & biased sampling functions. I'm not sure about the optimal balance between creating too many functions vs. creating a function with too many input parameters.

Contributor

Yeah... I guess we should avoid creating an overly busy function (one function that handles all the different types of sampling by excessively using std::variant & std::optional in its input arguments), but we should also avoid creating too many functions... I'm not sure where the optimal balancing point is...

Contributor

In theory, adding new parameters increases code complexity exponentially (to handle all possible combinations of optional parameters), so we'd be better off creating separate functions. If supporting an additional optional parameter requires only a minor change in the API and implementation, we may create one generic function (or we may create one complex function in the detail namespace that handles all the different options, with multiple public functions calling it, if that helps reduce code duplication).
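
A minimal sketch of that last pattern, with purely illustrative names and types (not the actual cugraph functions or signatures): one option-heavy implementation in the detail namespace, with thin public entry points forwarding to it so the uniform and biased paths are not duplicated.

#include <optional>
#include <vector>

namespace detail {

// One implementation covering all options; the public wrappers pick the options.
std::vector<int> neighbor_sample_impl(std::vector<int> const& starting_vertices,
                                      std::optional<std::vector<float>> const& edge_biases,
                                      std::vector<int> const& fan_out)
{
  // ... shared sampling logic; behaves as uniform sampling when edge_biases == std::nullopt ...
  return starting_vertices;
}

}  // namespace detail

// Thin public entry points keep the user-facing API simple.
std::vector<int> uniform_neighbor_sample(std::vector<int> const& starting_vertices,
                                         std::vector<int> const& fan_out)
{
  return detail::neighbor_sample_impl(starting_vertices, std::nullopt, fan_out);
}

std::vector<int> biased_neighbor_sample(std::vector<int> const& starting_vertices,
                                        std::vector<float> const& edge_biases,
                                        std::vector<int> const& fan_out)
{
  return detail::neighbor_sample_impl(starting_vertices, edge_biases, fan_out);
}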

@@ -368,6 +410,7 @@ cugraph_error_code_t cugraph_uniform_neighbor_sample(
const cugraph_type_erased_device_array_view_t* label_to_comm_rank,
const cugraph_type_erased_device_array_view_t* label_offsets,
const cugraph_type_erased_host_array_view_t* fan_out,
const cugraph_sample_heterogeneous_fanout_t* heterogeneous_fanout,
Collaborator

Perhaps we take the same approach here. Create a new C API function called neighbor_sample, following the biased function definition. Add this parameter. Deprecate the other functions. In the implementation we can just check for nullptr (NULL).
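
For illustration, a bare-bones sketch of that nullptr dispatch. The struct is only forward-declared here and the function name and signature are hypothetical, not the real C API entry point:

#include <cstdio>

struct cugraph_sample_heterogeneous_fanout_t;  // opaque type, as exposed by the C API header

// Hypothetical unified entry point; the real signature carries many more parameters.
void neighbor_sample_dispatch(int const* fan_out,
                              cugraph_sample_heterogeneous_fanout_t const* heterogeneous_fanout)
{
  if (heterogeneous_fanout == nullptr) {
    std::printf("homogeneous fanout: follow the existing uniform/biased path\n");
  } else {
    std::printf("heterogeneous fanout: use the per-edge-type fanout path\n");
  }
  (void)fan_out;  // unused in this sketch
}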

@@ -150,7 +173,7 @@ neighbor_sample_impl(

std::vector<size_t> level_sizes{};
int32_t hop{0};
for (auto&& k_level : fan_out) {
for (auto&& k_level : (*fan_out)) {
Collaborator

This isn't actually sufficient yet... but I'm more worried about the API right now.

This loop will need, in the case of heterogeneous sampling, to have 2 levels of for loop. An outer loop iterating by hop and an inner loop iterating by type.

I'd be inclined to add a setup loop that iterates over the types and generates the masks, and perhaps identifies the maximum number of hops to drive the outer loop. You'll need to get k_level from the right type/hop combination... so this for construct won't work at all; it will need to look different.
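
A rough sketch of the loop shape being suggested, with illustrative names only (not the real neighbor_sample_impl) and assuming the heterogeneous fanout has been densified to max_hops entries per edge type:

#include <cstdint>
#include <vector>

void heterogeneous_fanout_loops(std::vector<int32_t> const& fan_out,  // flattened: [type][hop]
                                int32_t num_edge_types,
                                int32_t max_hops)
{
  for (int32_t hop = 0; hop < max_hops; ++hop) {                  // outer loop: iterate by hop
    for (int32_t etype = 0; etype < num_edge_types; ++etype) {    // inner loop: iterate by edge type
      auto k_level = fan_out[etype * max_hops + hop];             // fanout for this (type, hop) pair
      // ... apply the pre-built per-type edge mask and sample k_level neighbors for this hop ...
      (void)k_level;
    }
  }
}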

Contributor Author

@jnke2016 jnke2016 Aug 21, 2024

Right, I only added this so it compiles. I will revisit this approach once we lock the API's interface. It only supports the non-heterogeneous type for now.

@@ -192,7 +215,7 @@ neighbor_sample_impl(
if (labels) { (*level_result_label_vectors).push_back(std::move(*labels)); }

++hop;
if (hop < fan_out.size()) {
if (hop < (*fan_out).size()) {
Collaborator

fan_out size will (potentially) vary by type.

Contributor Author

Right, I only added this so it compiles. I will revisit this approach once we lock the API's interface. It only supports the non-heterogeneous type for now.

# FIXME: Add expensive check to ensure all dict values are lists
# Convert to a tuple of sequence (edge type size and fanout values)
edge_type_size = []
[edge_type_size.append(len(s)) for s in list(fanout_vals.values())]
Collaborator

Does this iterate over the edge types in the dictionary in order? We need to make sure that this is constructed with edge type 0 first, followed by edge type 1, etc.

Contributor Author

Right. I converted the heterogeneous fanout type to a sorted ordered dictionary.

edge_type_size = []
[edge_type_size.append(len(s)) for s in list(fanout_vals.values())]
edge_type_fanout_vals = list(chain.from_iterable(list(fanout_vals.values())))
fanout_vals = (
Collaborator

Per my earlier suggestions, I think we want this to be a CSR structure, so converting from a list of sizes to a list of offsets is perhaps best done here.

Collaborator

We changed this back to a dense structure... so I think this code isn't right.

@@ -314,8 +316,21 @@ def uniform_neighbor_sample(
fanout_vals = fanout_vals.get().astype("int32")
elif isinstance(fanout_vals, cudf.Series):
fanout_vals = fanout_vals.values_host.astype("int32")
elif isinstance(fanout_vals, dict):
Collaborator

Same comments as above

@github-actions github-actions bot added the CMake label Aug 20, 2024
handle_.get_stream());
}

if constexpr (multi_gpu) {
Collaborator

If this was directly from my PR, I'm sorry I introduced this problem.

This shuffle won't work. start_vertex_offsets_ partitions these vertices into groups based on the label. Shuffling start_vertices alone will lose the appropriate label information. I believe the logic would need to be:

  1. Convert start_vertex_offsets_ to start_vertex_labels (see the sketch below)
  2. Shuffle the pair (start_vertex, start_vertex_label) to the proper GPU
  3. Renumber the starting vertices as below
  4. Sort the pairs by start_vertex_label
  5. Reconstitute the starting vertex offsets based on how the new labels are organized onto the proper GPUs

Then this will be ready for calling the sampling functions.
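
Here is a self-contained sketch of step 1, with host std:: code standing in for the device/thrust calls the real implementation would use and made-up offset values: expanding start_vertex_offsets_ into a per-vertex start_vertex_labels array so each (start_vertex, label) pair travels through the shuffle together.

#include <cstdint>
#include <cstdio>
#include <vector>

int main()
{
  // Offsets delimiting which starting vertices belong to which label (seed batch).
  std::vector<std::size_t> start_vertex_offsets{0, 3, 5, 9};

  std::vector<int32_t> start_vertex_labels;
  start_vertex_labels.reserve(start_vertex_offsets.back());
  for (std::size_t label = 0; label + 1 < start_vertex_offsets.size(); ++label) {
    auto count = start_vertex_offsets[label + 1] - start_vertex_offsets[label];
    start_vertex_labels.insert(start_vertex_labels.end(), count, static_cast<int32_t>(label));
  }

  // start_vertex_labels is now {0,0,0, 1,1, 2,2,2,2}; each entry pairs with the
  // corresponding start_vertex and survives the shuffle to the owning GPU.
  for (auto l : start_vertex_labels) { std::printf("%d ", static_cast<int>(l)); }
  std::printf("\n");
  return 0;
}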

Collaborator

With the latest change to the C++ API, we don't need to reconstitute the offsets, we'll be passing in labels to the C++ call. So skip step 5 above.

However, because the start_vertex_offsets_ will be local to this GPU, we will need to compute a global label id to perform step 1 above properly. You can compute the number of local labels (start_vertex_offsets_->size_ - 1). If we use host_scalar_allgatherv we can get the number of labels on each GPU. Then we can do thrust::exclusive_scan to compute the base label for each GPU. The global label ids can be constructed from that.

Finally, we'll need to construct the global label_to_comm_rank list. This should be constructible by using the output from the thrust::exclusive_scan to compute the mapping of labels to output GPUs.
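
A self-contained sketch of that computation, using plain std:: code in place of host_scalar_allgatherv and thrust::exclusive_scan and made-up counts: the per-GPU local label counts are scanned to get each GPU's base global label id, and the same counts yield the global label_to_comm_rank mapping.

#include <cstdint>
#include <cstdio>
#include <numeric>
#include <vector>

int main()
{
  // Pretend result of host_scalar_allgatherv: the number of local labels on each GPU,
  // i.e. (start_vertex_offsets_->size_ - 1) gathered from every rank.
  std::vector<std::size_t> num_labels_per_gpu{3, 2, 4};

  // Exclusive scan gives the base (first) global label id owned by each GPU.
  std::vector<std::size_t> label_base_per_gpu(num_labels_per_gpu.size());
  std::exclusive_scan(num_labels_per_gpu.begin(), num_labels_per_gpu.end(),
                      label_base_per_gpu.begin(), std::size_t{0});
  // label_base_per_gpu == {0, 3, 5}: GPU 1's local label 0 becomes global label 3, etc.

  // The global label_to_comm_rank list maps every global label back to its owning GPU.
  std::vector<int> label_to_comm_rank;
  for (std::size_t rank = 0; rank < num_labels_per_gpu.size(); ++rank) {
    label_to_comm_rank.insert(label_to_comm_rank.end(), num_labels_per_gpu[rank],
                              static_cast<int>(rank));
  }
  // label_to_comm_rank == {0,0,0, 1,1, 2,2,2,2}

  for (auto r : label_to_comm_rank) { std::printf("%d ", r); }
  std::printf("\n");
  return 0;
}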

std::optional<rmm::device_uvector<label_t>> edge_label{std::nullopt};
std::optional<rmm::device_uvector<size_t>> offsets{std::nullopt};

rmm::device_uvector<vertex_t> vertex_type_offsets(graph_view.local_vertex_partition_range_size(), handle_.get_stream());
Collaborator

This looks like it's only used in the heterogeneous renumbering code and is redefined in that block. I'd be inclined to delete the definition and sequence_fill here.

std::optional<edge_property_view_t<edge_t, edge_t const*>> edge_id_view,
std::optional<edge_property_view_t<edge_t, edge_type_t const*>> edge_type_view,
raft::device_span<vertex_t const> starting_vertices,
std::optional<raft::device_span<size_t const>> starting_vertex_offsets,
Collaborator

Based on latest slack conversation... we should change this back to starting_vertex_labels. This should be an easy change by backing out a few things from the implementation.

std::optional<edge_property_view_t<edge_t, edge_type_t const*>> edge_type_view,
edge_property_view_t<edge_t, bias_t const*> edge_bias_view,
raft::device_span<vertex_t const> starting_vertices,
std::optional<raft::device_span<size_t const>> starting_vertex_offsets,
Collaborator

Same as above, change back to starting_vertex_labels

std::optional<edge_property_view_t<edge_t, edge_t const*>> edge_id_view,
std::optional<edge_property_view_t<edge_t, edge_type_t const*>> edge_type_view,
raft::device_span<vertex_t const> starting_vertices,
std::optional<raft::device_span<size_t const>> starting_vertex_offsets,
Collaborator

Same

std::optional<edge_property_view_t<edge_t, edge_type_t const*>> edge_type_view,
edge_property_view_t<edge_t, bias_t const*> edge_bias_view,
raft::device_span<vertex_t const> starting_vertices,
std::optional<raft::device_span<size_t const>> starting_vertex_offsets,
Collaborator

Same

const cugraph_edge_property_view_t* edge_biases,
const cugraph_type_erased_device_array_view_t* start_vertices,
const cugraph_type_erased_device_array_view_t* start_vertex_offsets,
const cugraph_type_erased_device_array_view_t* label_to_comm_rank,
Collaborator

Let's drop this from the C API (per the latest slack conversation). We will internally compute the label to comm rank for C++ based on which GPU the seeds are sent from.

cugraph_edge_property_view_t const* edge_biases,
cugraph_type_erased_device_array_view_t const* start_vertices,
cugraph_type_erased_device_array_view_t const* start_vertex_offsets,
cugraph_type_erased_device_array_view_t const* label_to_comm_rank,
Collaborator

This gets removed

handle_.get_stream());
}

if constexpr (multi_gpu) {
Collaborator

With the latest change to the C++ API, we don't need to reconstitute the offsets, we'll be passing in labels to the C++ call. So skip step 5 above.

However, because the start_vertex_offsets_ will be local to this GPU, we will need to compute a global label id to perform step 1 above properly. You can compute the number of local labels (start_vertex_offsets_->size_ - 1). If we use host_scalar_allgatherv we can get the number of labels on each GPU. Then we can do thrust::exclusive_scan to compute the base label for each GPU. The global label ids can be constructed from that.

Finally, we'll need to construct the global label_to_comm_rank list. This should be constructible by using the output from the thrust::exclusive_scan to compute the mapping of labels to output GPUs.

@jnke2016 jnke2016 marked this pull request as ready for review September 25, 2024 18:05
@jnke2016 jnke2016 requested review from a team as code owners September 25, 2024 18:05
Comment on lines 57 to 58
#neighbor_sample.pyx // FIXME: break the API into homogeneous and heterogeneous neighbor sample
#biased_neighbor_sample.pyx
Contributor

Is this going to be part of this PR, or a future PR?

Contributor Author

@jnke2016 jnke2016 Sep 25, 2024

This FIXME is already addressed in my local branch. It will be part of my next commit.

const cugraph_type_erased_device_array_view_t* label_offsets,
const cugraph_type_erased_host_array_view_t* fan_out,
const cugraph_sampling_options_t* options,
bool_t is_biased,
Collaborator

Parameter is obsolete.

Contributor Author

Right. I didn't update the PLC and Python APIs as I was mostly focused on the C++ and C APIs. But these should be addressed in my next commits.

ctypedef struct cugraph_sample_heterogeneous_fan_out_t:
pass

cdef cugraph_error_code_t \
Collaborator

These next two functions are also no longer necessary.

element corresponds to the fan_out values.
The sampling method can use different fan_out values for each edge type.

is_biased: bool
Collaborator

We made the C++ parameter obsolete by separating the uniform and biased methods. We should mirror this in PLC.
