Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove Dummy Edge Weights, Support Specifying Edge Ids/Edge Types/Weights Separately #3495

Merged
merged 40 commits into from
May 9, 2023

Conversation

alexbarghi-nv
Copy link
Member

@alexbarghi-nv alexbarghi-nv commented Apr 17, 2023

Closes #3486
Merge after #3513 - Merged

@alexbarghi-nv alexbarghi-nv added this to the 23.06 milestone Apr 17, 2023
@alexbarghi-nv alexbarghi-nv added improvement Improvement / enhancement to an existing function non-breaking Non-breaking change labels Apr 17, 2023
@alexbarghi-nv alexbarghi-nv added the Blocked Cannot progress due to external reasons label Apr 28, 2023
@alexbarghi-nv alexbarghi-nv removed the Blocked Cannot progress due to external reasons label May 5, 2023
@alexbarghi-nv alexbarghi-nv marked this pull request as ready for review May 5, 2023 14:44
@@ -338,19 +338,19 @@ extern "C" cugraph_error_code_t cugraph_mg_graph_create(
weight_type = cugraph_data_type_id_t::FLOAT32;
}

CAPI_EXPECTS((edge_type_ids == nullptr) || (p_edge_ids->type_ == edge_type),
CAPI_EXPECTS((edge_ids == nullptr) || (p_edge_ids->type_ == edge_type),
CUGRAPH_INVALID_INPUT,
"Invalid input arguments: Edge id type must match edge (src/dst) type",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this comment accurate?

Edge ID should match with edge_t, not vertex_t.

*error);
CAPI_EXPECTS((edge_ids == nullptr) || (p_edge_ids->type_ == edge_type),
CUGRAPH_INVALID_INPUT,
"Invalid input arguments: Edge id type must match edge (src/dst) type",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here. edge ID type should match with edge_t, not vertex_t.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed the message, but it's still confusing to the user why they may get an error if their input graph has int64 vertex ids but only int32 edge ids. It's not that crazy for that to happen (i.e. vertex ids are a hash). I'll consider adding something at the Python layer to make that more clear.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just added the warning at the Python layer.

Copy link
Contributor

@seunghwak seunghwak May 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's still confusing to the user why they may get an error if their input graph has int64 vertex ids but only int32 edge ids.

Yeah, this needs some investigation.

One reason is, we're using vertex_t for both original & renumbered vertex IDs. Another reason is to limit the number of explicit template instantiations (this affects compile time & binary size).

After renumbering, # edges is generally more than # vertices (not true for graphs with many isolated vertices, but this is rather an exception).

Copy link
Contributor

@rlratzel rlratzel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe I reviewed these changes in #3524

@alexbarghi-nv alexbarghi-nv requested a review from a team as a code owner May 9, 2023 14:20
@BradReesWork
Copy link
Member

/merge

1 similar comment
@rlratzel
Copy link
Contributor

rlratzel commented May 9, 2023

/merge

@rapids-bot rapids-bot bot merged commit c6bacd5 into rapidsai:branch-23.06 May 9, 2023
rapids-bot bot pushed a commit that referenced this pull request May 10, 2023
Resolves #3318 
Resolves #3385 

Updates the bulk sampler to use the offsets returned by the `uniform_neighbor_sample` call, and take advantage of the pre-partitioned data returned from C++ to avoid repartitioning and scanning, and also to reduce the number of files written.  This PR is expected to significantly improve memory and compute time for bulk sampling.

Merge after #3517  - Merged
Merge after #3495  - Merged

Authors:
  - Alex Barghi (https://github.com/alexbarghi-nv)
  - Chuck Hastings (https://github.com/ChuckHastings)
  - Brad Rees (https://github.com/BradReesWork)

Approvers:
  - Rick Ratzel (https://github.com/rlratzel)

URL: #3524
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
improvement Improvement / enhancement to an existing function non-breaking Non-breaking change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Update Graph creation API (add_cudf_edgelist) to allow for individual edge ID/type/weight specifications
6 participants