Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ENH] Improve graph primitives API for better usability #2003

Closed
14 of 15 tasks
seunghwak opened this issue Jan 5, 2022 · 1 comment · Fixed by #2290
Closed
14 of 15 tasks

[ENH] Improve graph primitives API for better usability #2003

seunghwak opened this issue Jan 5, 2022 · 1 comment · Fixed by #2290
Assignees
Milestone

Comments

@seunghwak
Copy link
Contributor

seunghwak commented Jan 5, 2022

Describe the solution you'd like
Thank you very much @kaatish for providing various inputs. Additional suggestions will be appreciated.

Graph primitives includes classes/functions

under
https://github.com/rapidsai/cugraph/tree/branch-22.02/cpp/include/cugraph/prims

and in

https://github.com/rapidsai/cugraph/blob/branch-22.02/cpp/include/cugraph/graph.hpp
https://github.com/rapidsai/cugraph/blob/branch-22.02/cpp/include/cugraph/graph_view.hpp
https://github.com/rapidsai/cugraph/blob/branch-22.02/cpp/include/cugraph/graph_functions.hpp
https://github.com/rapidsai/cugraph/blob/branch-22.02/cpp/include/cugraph/vertex_partition_view.hpp
https://github.com/rapidsai/cugraph/blob/branch-22.02/cpp/include/cugraph/vertex_partition_device_view.cuh
https://github.com/rapidsai/cugraph/blob/branch-22.02/cpp/include/cugraph/matrix_partition_view.hpp
https://github.com/rapidsai/cugraph/blob/branch-22.02/cpp/include/cugraph/matrix_partition_device_view.cuh

  • Improve primitives names; especially, in/out_nbr in copy_v_transform_reduce_in/out_nbr is very confusing.
  • transform_reduce_in_nbr here actually means iterate over the incoming edges of each vertex; produce one value per edge (by transforming edge source/destination IDs, edge weights, and edge source/destination properties); and reduce the produced values to update the (source) vertex (of the incoming edges) properties. Better rename to make this clearer.
  • We're mixing vertex/edge-centric terminologies and matrix-centric terminologies. Better be consistent and consistently use vertex/edge-centric terminologies.
  • row/col=>src/dst, matrix_partition=>edge_partition
  • e.g. row/col_properties_t => edge_src/dst_properties_t, matrix_partition_(device_)view_t => edge_partition_(device_)view_t, is_adj_matrix_transposed => storage_transposed
  • Better group graph(_view)_t's member functions about vertex partition & edge partition.
  • v_op of count_if_v or transform_reduce_v may better take a vertex ID as an input parameter (as e_op takes source/destination vertex IDs and to be more consistent).
  • Shouldn't use major/minor in non-detail public namespace. Currently, renumber_edgelist is using major/minor. Consistently use major/minor for the functions in the detail namespace.
  • Add do_expensive_check flag to the primitives to better detect invalid use cases.
  • We have three different styles of reduction parameters. 1) raft::comms::op_t, 2) functors pre-defined in https://github.com/rapidsai/cugraph/blob/branch-22.02/cpp/include/cugraph/prims/reduce_op.cuh, and 3) general reduction functors. Can we better unify these?
  • In case of 3), we should better document that reduction operations should be side-effect free (https://github.com/rapidsai/cugraph/blob/branch-22.02/cpp/include/cugraph/prims/reduce_op.cuh#L33).
  • reduce_v & transform_reduce_v take a reduction op (raft::comms::op_t) but transform_reduce_e, transform_reduce_by_key_aggregated_in_out_nbr, copy_v_transform_reduce_in_out_nbr and copy_v_transform_reduce_key_aggregated_out_nbr do not. Better be consistent.
  • num_buckets of VertexFrontier (https://github.com/rapidsai/cugraph/blob/branch-22.02/cpp/include/cugraph/prims/vertex_frontier.cuh#L283) doesn't need to be a template parameter.
  • No benefit in performance & possibly increase file size (need to double check).
  • VertexFrontier & SortedUniqueBucket may better be renamed to vertex_frontier_t and sorted_unique_bucket_t to be consistent with graph(_view)_t.
  • Better not use enums for bucket indices (e.g. https://github.com/rapidsai/cugraph/blob/branch-22.02/cpp/src/traversal/bfs_impl.cuh#L127) as this requires static_cast to size_t. Better use constexpr size_t instead.
  • Make graph(_view)_t's member functions that users do not need to use (functions mainly added to implement primitives) as private.
  • Need to figure out how to declare graph primitives as friend functions.
  • Make sure update_frontier_v_push_if_out_nbr works with key_t = vertex_t payload_t = void
  • row/col_properties_t::fill may better take handle instead of stream (https://github.com/rapidsai/cugraph/blob/branch-22.02/cpp/include/cugraph/prims/row_col_properties.cuh#L436)

Discarded ideas

  • Should we ask users to call device_view() when passing row/col_properties_t (should be renamed to edge_src/dst_properties_t) to graph primitives? => Better not to allow creating a device_view from a raw pointer (in SG) or to concatenate device_view objects
  • row/col_properties_t is expensive to pass by value (and actually, cannot be copied as a copy operator of rmm::device_uvector is deleted)
  • Implicitly convert to device_view objects?
  • Think about adding vertex_properties_t as we have edge_src/dst_properties_t (currently row/col_properties_t). => Maybe an overkill at this moment as vertex property values are simply 1D partitioned.
  • This can be wasteful if a primitive iterates over only a subset of (local-)vertices.
  • Vertex properties are often passed as raw arrays in algorithms. Need a mechanism to create a vertex_properties_device_view_t from a raw array and release an internal buffer as a raw array.
@seunghwak seunghwak added the ? - Needs Triage Need team to review and classify label Jan 5, 2022
@github-actions
Copy link

github-actions bot commented Feb 4, 2022

This issue has been labeled inactive-30d due to no recent activity in the past 30 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be labeled inactive-90d if there is no activity in the next 60 days.

rapids-bot bot pushed a commit that referenced this issue Mar 20, 2022
This is one of the series of planned PRs to address #2003

We want to consistently use source & destination (or src & dst as abbreviation) instead of mixing source/destination and row/column and better group function/class/parameter names based on the concept of vertex partition and edge partition.

This PR renames

`row/col_properties_t` to `edge_partition_src/dst_property_t`
and
`copy_to_adj_matrix_partition_row/col` to `update_edge_partition_src/dst_property`
and
`transform_reduce_by_adj_matrix_row|col_key_e` to `transform_reduce_by_src|dst_key_e`

and fix resulting compile errors.

Possibly a breaking change (even though I am not aware of any external projects to be affected).

Authors:
  - Seunghwa Kang (https://github.com/seunghwak)

Approvers:
  - Andrei Schaffer (https://github.com/aschaffer)
  - Chuck Hastings (https://github.com/ChuckHastings)

URL: #2100
rapids-bot bot pushed a commit that referenced this issue Mar 22, 2022
Partially addresses #2003

We should not use major,minor (instead of src, dst) in the public non-detail API and should use major,minor in the detail space.

The public non-detail namespace renumber_edgelist currently uses major, minor and this PR fixes this.

This PR also replaces row/col to src/dst in `single_gpu_renumber_edgelist_given_number_map` as we're aiming to consistently use src/dst instead of row/col in our public C++ API.

This is possibly breaking as this PR changes the public API (but I am not aware of any non-cugraph libraries directly calling renumber_edgelist).

Authors:
  - Seunghwa Kang (https://github.com/seunghwak)

Approvers:
  - Chuck Hastings (https://github.com/ChuckHastings)
  - Brad Rees (https://github.com/BradReesWork)
  - Joseph Nke (https://github.com/jnke2016)
  - Rick Ratzel (https://github.com/rlratzel)

URL: #2116
rapids-bot bot pushed a commit that referenced this issue Apr 7, 2022
…tric terminologies instead of matrix centric terminolgies (#2187)

Partially address #2003

cuGrpah public API has been mixing vertex/edge-centric terminologies and vector/matrix-centric terminologies.

- matrix_partition => edge_partition
- row => source (src), column (col) => destination (dst)
- rename `get_property_name()` functions to `property_name()` for brevity and better follow naming conventions for C++ standard library
- other updates to better group/clarify vertex/edge partition related functions.

Authors:
  - Seunghwa Kang (https://github.com/seunghwak)

Approvers:
  - Chuck Hastings (https://github.com/ChuckHastings)

URL: #2187
rapids-bot bot pushed a commit that referenced this issue Apr 19, 2022
Partially address #2003.

1. Renumber VertexFrontier & SortedUniqueKeyBucket to vertex_frontier_t & sorted_unique_key_bucket_t to be consistent with the naming schemes for graph_t & graph_view_t.
2. vertex_frontier_t to take number of buckets as an input parameter (instead of non-type template parameter) and rename `get_bucket` to `bucket`.
3. Use `constexpr size_t` instead of `enum calss` for bucket indices to avoid unnecessary type casting.
4. Update `fill()` of `edge_partition_src|dst_property_t` to take `handle` instead of `stream` to be consistent with other member functions (e.g. `clear()`)
5. Remove `..._v` primitives that working on a subset of local vertices.
6. Update `v_op` to take vertex ID (to be consistent with `e_op` which takes source & destination IDs).
7. Other miscellaneous clean-ups.

Authors:
  - Seunghwa Kang (https://github.com/seunghwak)

Approvers:
  - Chuck Hastings (https://github.com/ChuckHastings)
  - Kumar Aatish (https://github.com/kaatish)

URL: #2220
@BradReesWork BradReesWork added Graph Prims and removed ? - Needs Triage Need team to review and classify inactive-30d labels Apr 25, 2022
@BradReesWork BradReesWork added this to the 22.06 milestone Apr 25, 2022
rapids-bot bot pushed a commit that referenced this issue Apr 28, 2022
Partially address #2003
Rename
- copy_v_transform_reduce_in|out_nbr => transform_reduce_incoming|outgoing_e_of_v
- copy_v_transform_reduce_key_aggregated_out_nbr => transform_reduce_key_aggregated_outgoing_e_of_v
- update_frontier_v_push_if_out_nbr => update_frontier_from_outgoing_e_of_v

Authors:
  - Seunghwa Kang (https://github.com/seunghwak)

Approvers:
  - Chuck Hastings (https://github.com/ChuckHastings)
  - Kumar Aatish (https://github.com/kaatish)

URL: #2234
rapids-bot bot pushed a commit that referenced this issue May 11, 2022
Resolve inconsistencies in reduction support in primitives (partially addresses #2003).
  * If `reduce_op` is missing, + is assumed.
  * If `init` is missing, T{} is assumed.
  * Provides a pre-defined set of reduction operations in `reduce_op.cuh`
  * Provides a guidance on writing custom reduction operators.
  * Fix a bug in reduction support for thrust::tuple (std::numeric_limits does not work properly for thrust::tuple, std::numeric_limits should be applied to individual elements of thrust::tuple).

Authors:
  - Seunghwa Kang (https://github.com/seunghwak)

Approvers:
  - Chuck Hastings (https://github.com/ChuckHastings)
  - Kumar Aatish (https://github.com/kaatish)

URL: #2257
rapids-bot bot pushed a commit that referenced this issue May 19, 2022
Partially address #2003

Add the `do_expensive_check` input parameter to graph primitives.

Authors:
  - Seunghwa Kang (https://github.com/seunghwak)

Approvers:
  - Chuck Hastings (https://github.com/ChuckHastings)

URL: #2274
rapids-bot bot pushed a commit that referenced this issue May 25, 2022
)

Close #2003

- `Split update_v_frontier_from_outgoing_e` to `transform_reduce_v_frontier_outgoing_e_by_dst` and `update_v_frontier`
- Previously `per_v` was used to specify both vertex-centric primitives (e.g. for each vertex, iterate over all the incoming or outgoing edges) AND reduction granularity of edge operation outputs (e.g. reduce edge operation outputs by keys associated with destinations). Updated to use `by_...` to specify reduction granularity.
- Bug fix in `update_v_frontier_from_outgoing_e` when key_t == vertex_t && payload_t = void.
- Other code-style updates.

Authors:
  - Seunghwa Kang (https://github.com/seunghwak)

Approvers:
  - Chuck Hastings (https://github.com/ChuckHastings)
  - Kumar Aatish (https://github.com/kaatish)

URL: #2290
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants