Updates to cugraph.hypergraph (Duplicate Col Labels Bug) #4610
cc: @rlratzel @ChuckHastings
This PR addresses failures seen in certain PRs (like here) caused by a recent change to `cudf` that disallows selecting duplicate column labels.

In `hypergraph.py`, this PR modifies `_create_hyper_edges` and `_create_direct_edges` to ensure that DataFrames are indexed only by non-duplicate column labels. This is done by taking a list that includes duplicates (`fs`) and removing the repeated values.

This part requires some attention from the author of the unit test, @jnke2016:
In `test_hypergraph.py`, this PR adds the `check_like=True` argument to the `assert_frame_equal` call because the columns of the two DataFrames are in a different order.
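
For reviewers, here is a minimal sketch of both changes. It uses pandas as a CPU stand-in for `cudf`, and it is not the code from this PR. The helper `unique_labels`, the sample DataFrame, and the contents of `fs` are hypothetical. The order-preserving `dict.fromkeys` approach is an assumption about how the duplicates could be dropped.

```python
import pandas as pd
from pandas.testing import assert_frame_equal


def unique_labels(labels):
    """Drop repeated labels, keeping the first occurrence of each (hypothetical helper)."""
    return list(dict.fromkeys(labels))


# Hypothetical data; `fs` stands in for a label list that contains a repeat.
df = pd.DataFrame({"a": [1, 2], "b": [3, 4], "c": [5, 6]})
fs = ["a", "b", "a", "c"]

# Select columns using the de-duplicated labels, so each column appears once.
deduped = unique_labels(fs)
selected = df[deduped]

# Same data with the columns in a different order.
reordered = selected[["c", "a", "b"]]

# check_like=True ignores the order of the index and columns,
# so the comparison passes even though the column order differs.
assert_frame_equal(selected, reordered, check_like=True)
```

Without `check_like=True`, the same comparison raises an `AssertionError` because the column order differs. That is why the test needs the extra argument once the selection order changes.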