Add options to extract_subgraph() to bypass renumbering and adding edge_data, exclude internal _WEIGHT_ column from edge_property_names, added num_vertices_with_properties attr #2419

Merged


rlratzel
Contributor

@rlratzel rlratzel commented Jul 16, 2022

Adds options to extract_subgraph() to bypass renumbering and to skip adding edge_data, and excludes the internal _WEIGHT_ column from edge_property_names.
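
As a rough illustration of what the two toggles control, here is a minimal pure-Python sketch. It is not cuGraph code: the function `extract_edges`, its parameters `renumber` and `attach_edge_data`, and the tuple-based edge layout are all hypothetical stand-ins. The sketch assumes renumbering means mapping arbitrary vertex IDs onto a contiguous range 0..N-1, and that bypassing it leaves the original IDs untouched.

```python
def extract_edges(edges, renumber=True, attach_edge_data=True):
    """Toy stand-in for subgraph extraction (hypothetical, not the cuGraph API).

    edges: list of (src, dst, props) tuples, where props is a dict.
    Returns (edge_list, id_map, edge_data). id_map is None when renumbering
    is bypassed; edge_data is None when attaching edge data is skipped.
    """
    id_map = None
    if renumber:
        # Map the original vertex IDs to contiguous IDs 0..N-1 (sorted order).
        vertices = sorted({v for src, dst, _ in edges for v in (src, dst)})
        id_map = {v: i for i, v in enumerate(vertices)}
        edge_list = [(id_map[src], id_map[dst]) for src, dst, _ in edges]
    else:
        # Bypass renumbering: keep the caller's vertex IDs as they are.
        edge_list = [(src, dst) for src, dst, _ in edges]

    # Optionally carry the per-edge properties along with the result.
    edge_data = [dict(props) for _, _, props in edges] if attach_edge_data else None
    return edge_list, id_map, edge_data


edges = [(100, 205, {"kind": "follows"}), (205, 317, {"kind": "likes"})]
renumbered, id_map, _ = extract_edges(edges)
raw, no_map, no_data = extract_edges(edges, renumber=False, attach_edge_data=False)
```

With both steps bypassed, the caller gets back only the raw edge list, which is the cheaper path when the IDs are already in the form the consumer expects.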

Also adds a new attribute, num_vertices_with_properties, which returns the number of vertices that have properties. This can differ from the total number of vertices, since vertices can also be added via add_edge_data(). GNN use cases need to know how many vertices have accessible properties; this value corresponds to the number of rows in the internal vertex property data table.
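
The difference between the two counts can be sketched with a toy class. This is an assumption-laden illustration, not the PropertyGraph implementation: `ToyPropertyGraph` and its dict-based storage are hypothetical, and only the names `add_edge_data` and `num_vertices_with_properties` come from this PR.

```python
class ToyPropertyGraph:
    """Hypothetical miniature model; not the cuGraph PropertyGraph."""

    def __init__(self):
        # One entry per vertex that has properties (stands in for the
        # rows of the internal vertex property table).
        self._vertex_props = {}
        # Edges may reference vertices that never received vertex properties.
        self._edges = []

    def add_vertex_data(self, vertex_id, **props):
        self._vertex_props.setdefault(vertex_id, {}).update(props)

    def add_edge_data(self, src, dst, **props):
        self._edges.append((src, dst, props))

    @property
    def num_vertices(self):
        # Every vertex seen anywhere: in the vertex table or on an edge.
        ids = set(self._vertex_props)
        for src, dst, _ in self._edges:
            ids.update((src, dst))
        return len(ids)

    @property
    def num_vertices_with_properties(self):
        # Only vertices that have a row in the vertex property table.
        return len(self._vertex_props)


pg = ToyPropertyGraph()
pg.add_vertex_data(0, name="a")
pg.add_vertex_data(1, name="b")
pg.add_edge_data(1, 2, weight=0.5)  # vertex 2 exists only through this edge
```

Here `pg.num_vertices` counts vertex 2, while `pg.num_vertices_with_properties` does not, because vertex 2 was introduced only by an edge.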

Added unit tests, for both the SG and MG versions, covering the new extract_subgraph() options, the new num_vertices_with_properties attribute, and the exclusion of _WEIGHT_ column names from edge_property_names.
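
The _WEIGHT_ fix amounts to filtering internal bookkeeping columns out of the user-facing property names. A minimal sketch of that idea follows; the helper `public_property_names` and the `INTERNAL_COLUMNS` set are hypothetical, and only `_WEIGHT_` is named in this PR (the other entries are assumed for illustration).

```python
# Assumed set of internal column names; only "_WEIGHT_" is confirmed by this PR.
INTERNAL_COLUMNS = {"_SRC_", "_DST_", "_WEIGHT_"}


def public_property_names(columns):
    """Return the column names a user should see, preserving their order."""
    return [name for name in columns if name not in INTERNAL_COLUMNS]


columns = ["_SRC_", "_DST_", "_WEIGHT_", "relationship", "time"]
names = public_property_names(columns)
```

A unit test for this behavior would then check that no internal name appears in the returned list.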

closes #2418
closes #2410

…edge_data to extracted subgraph, fixed issue with internal weight column being returned by property names API, added tests for new features and bug fix.
@rlratzel rlratzel added bug Something isn't working non-breaking Non-breaking change labels Jul 16, 2022
@rlratzel rlratzel added this to the 22.08 milestone Jul 16, 2022
@rlratzel rlratzel requested a review from eriknw July 16, 2022 00:21
@rlratzel rlratzel self-assigned this Jul 16, 2022
@rlratzel rlratzel requested a review from a team as a code owner July 16, 2022 00:21
…hich returns the number of verts that have properties (different than the number of verts). This is needed for GNN use cases.
@rlratzel rlratzel changed the title Add options to extract_subgraph() to bypass renumbering and adding edge_data, exclude internal _WEIGHT_ column from edge_property_names Add options to extract_subgraph() to bypass renumbering and adding edge_data, exclude internal _WEIGHT_ column from edge_property_names, added num_vertices_with_properties attr Jul 16, 2022
@codecov-commenter

codecov-commenter commented Jul 16, 2022

Codecov Report

Merging #2419 (c5186a3) into branch-22.08 (2aad5f2) will decrease coverage by 0.04%.
The diff coverage is 40.00%.

@@               Coverage Diff                @@
##           branch-22.08    #2419      +/-   ##
================================================
- Coverage         60.11%   60.06%   -0.05%     
================================================
  Files               102      102              
  Lines              5155     5174      +19     
================================================
+ Hits               3099     3108       +9     
- Misses             2056     2066      +10     
| Impacted Files | Coverage Δ |
|---|---|
| ...ugraph/cugraph/dask/structure/mg_property_graph.py | 17.62% <8.00%> (-0.87%) ⬇️ |
| python/cugraph/cugraph/structure/property_graph.py | 96.79% <93.33%> (+0.36%) ⬆️ |
| ...pylibcugraph/pylibcugraph/experimental/__init__.py | 100.00% <0.00%> (ø) |
| python/cugraph/cugraph/structure/graph_classes.py | 80.00% <0.00%> (+0.51%) ⬆️ |
| ...ython/cugraph/cugraph/community/ktruss_subgraph.py | 88.23% <0.00%> (+2.94%) ⬆️ |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

…, minor tests refactoring. Still need to update SG the same way.
Contributor

@eriknw eriknw left a comment


LGTM.

I can update num_vertices_with_properties in my PR to instead be get_num_vertices(include_edge_data=False).

eriknw added a commit to eriknw/cugraph that referenced this pull request Jul 22, 2022
- add `include_edge_data=True` keyword to `get_num_vertices`
- improved docstrings of `get_num_vertices` and `get_num_edges`
- change default type name to `""`
- remove `num_vertices` and `num_edges` properties
- use `series.value_counts` to compute counts of types (assuming types are low cardinality)
- add and update tests
- copy `pG.num_vertices_with_properties` tests from rapidsai#2419

MG is not quite finished yet.
@BradReesWork
Member

@gpucibot merge

@rapids-bot rapids-bot bot merged commit efc05b3 into rapidsai:branch-22.08 Jul 25, 2022
@rlratzel rlratzel deleted the branch-22.08-pg_updates_for_gnns branch September 28, 2023 20:41