Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Strong index types for equality comparator #10883

Merged
merged 60 commits into from
May 19, 2022

Conversation

ttnghia
Copy link
Contributor

@ttnghia ttnghia commented May 17, 2022

This adds strong index types for equality comparator, along with #10730 to unblock #10548, #10656, and several others nested type feature requests.

bdice and others added 30 commits May 2, 2022 17:19
…p both weakly and strongly typed row comparators.
Co-authored-by: Nghia Truong <ttnghia@users.noreply.github.com>
Co-authored-by: Jake Hemstad <jhemstad@nvidia.com>
cpp/include/cudf/table/experimental/row_operators.cuh Outdated Show resolved Hide resolved
cpp/include/cudf/table/experimental/row_operators.cuh Outdated Show resolved Hide resolved
cpp/include/cudf/table/experimental/row_operators.cuh Outdated Show resolved Hide resolved
cpp/src/structs/search/contains.cu Outdated Show resolved Hide resolved
Copy link
Member

@ajschmidt8 ajschmidt8 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving ops-codeowner file changes

@ttnghia ttnghia requested a review from bdice May 18, 2022 16:23
cpp/include/cudf/table/experimental/row_operators.cuh Outdated Show resolved Hide resolved
cpp/include/cudf/table/experimental/row_operators.cuh Outdated Show resolved Hide resolved
cpp/src/structs/search/contains.cu Outdated Show resolved Hide resolved
cpp/src/structs/search/contains.cu Outdated Show resolved Hide resolved
cpp/src/table/row_operators.cu Outdated Show resolved Hide resolved
@ttnghia ttnghia force-pushed the strong_typed_index branch from d1a5b7b to c51b053 Compare May 18, 2022 20:25
@ttnghia ttnghia requested a review from bdice May 18, 2022 20:31
Copy link
Contributor

@bdice bdice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One small suggestion, otherwise LGTM. Thanks for your help with this, @ttnghia!

Are there new cases that can be supported by this implementation? Are tests needed for those?

cpp/src/structs/search/contains.cu Outdated Show resolved Hide resolved
Co-authored-by: Bradley Dice <bdice@bradleydice.com>
@ttnghia
Copy link
Contributor Author

ttnghia commented May 18, 2022

Are there new cases that can be supported by this implementation? Are tests needed for those?

The search.cu file was modified to use this, and its tests directly use this.

More implementation in #10656 and #10548 will use this immediately.

@ttnghia
Copy link
Contributor Author

ttnghia commented May 19, 2022

@gpucibot merge

@rapids-bot rapids-bot bot merged commit 54789ee into rapidsai:branch-22.06 May 19, 2022
@jrhemstad
Copy link
Contributor

@ttnghia the policy is for 2 cpp approvals before merging a PR.

@ttnghia
Copy link
Contributor Author

ttnghia commented May 19, 2022

Wait, I had only 1 cpp approval? Sorry I missed that, merged after seeing 2 approvals. Will pay more attention next time.

Comment on lines +1011 to +1012
return strong_index_comparator_adapter<device_row_comparator<Nullate>>{
device_row_comparator<Nullate>(nullate, *d_left_table, *d_right_table, nulls_are_equal)};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fyi, CTAD is your friend.

    return strong_index_comparator_adapter{device_row_comparator(nullate, *d_left_table, *d_right_table, nulls_are_equal)};

Comment on lines +39 to +48
// Create a (structs) column_view of one row having children given from the input scalar.
auto const needle_tv = static_cast<struct_scalar const*>(&needle)->view();
auto const needle_as_col =
column_view(data_type{type_id::STRUCT},
1,
nullptr,
nullptr,
0,
0,
std::vector<column_view>{needle_tv.begin(), needle_tv.end()});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably add overloads of the comparators that take a scalar for the lhs or rhs that do this automaticaly.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm glad you mentioned this. I was thinking that would be a helpful next step as well, but I hadn't yet spent the time to identify how many times that pattern occurs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would be an interesting follow up PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Filed an issue: #10892

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also filed a related FEA issue: #10893

@ttnghia ttnghia deleted the strong_typed_index branch May 19, 2022 22:04
rapids-bot bot pushed a commit that referenced this pull request Aug 17, 2022
This extends the `cudf::contains` API to support nested types (lists + structs) with arbitrarily nested levels. As such, `cudf::contains` will work with literally any type of input data.

In addition, this fixes null handling of `cudf::contains` with structs column + struct scalar input when the structs column contains null rows at the top level while the scalar key is valid but all nulls at children levels.

Closes: #8965
Depends on:
 * #10730
 * #10883
 * #10802
 * #10997
 * NVIDIA/cuCollections#172
 * NVIDIA/cuCollections#173
 * #11037
 * #11356

Authors:
  - Nghia Truong (https://github.com/ttnghia)
  - Devavret Makkar (https://github.com/devavret)
  - Bradley Dice (https://github.com/bdice)
  - Karthikeyan (https://github.com/karthikeyann)

Approvers:
  - AJ Schmidt (https://github.com/ajschmidt8)
  - Bradley Dice (https://github.com/bdice)
  - Yunsong Wang (https://github.com/PointKernel)

URL: #10656
rapids-bot bot pushed a commit that referenced this pull request Aug 22, 2022
This extends the `lists::contains` API to support nested types (lists + structs) with arbitrarily nested levels. As such, `lists::contains` will work with literally any type of input data.

In addition, the related implementation has been significantly refactored to facilitate adding new implementation.

Closes #8958.
Depends on:
 * #10730
 * #10883
 * #10999
 * #11019
 * #11037

Authors:
  - Nghia Truong (https://github.com/ttnghia)
  - Bradley Dice (https://github.com/bdice)

Approvers:
  - MithunR (https://github.com/mythrocks)
  - Bradley Dice (https://github.com/bdice)

URL: #10548
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3 - Ready for Review Ready for review by team feature request New feature or request libcudf Affects libcudf (C++/CUDA) code. non-breaking Non-breaking change Spark Functionality that helps Spark RAPIDS
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants