Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[TASK][JNI] Investigate train of null_count after explode #11923

Open
abellina opened this issue Oct 14, 2022 · 2 comments
Open

[TASK][JNI] Investigate train of null_count after explode #11923

abellina opened this issue Oct 14, 2022 · 2 comments
Labels
libcudf Affects libcudf (C++/CUDA) code. Performance Performance related issue Spark Functionality that helps Spark RAPIDS

Comments

@abellina
Copy link
Contributor

While analyzing an nsys trace for a Spark job with deeply nested tables, we see an explode kernel call that is followed by a train of null_count, which end in is_valid.

After we call cudf::explode we build up a table, and construct java ColumnVector objects. I think the construction of these objects is triggering it.

This task is to confirm that the columns with missing a null count are coming from the explode kernels. If they are coming from explode, it would be great if explode could compute null count as part of that kernel.

In this screenshot, it is the ~20ms at the end after explode:
Screenshot from 2022-10-14 11-21-24

@abellina abellina added feature request New feature or request Needs Triage Need team to review and classify Spark Functionality that helps Spark RAPIDS labels Oct 14, 2022
@GregoryKimball
Copy link
Contributor

I'd like to cross-reference this issue with #11968. It's likely that the null_count appearances in profiles will change as we refactor null_count for compatibility with user-provided streams.

@GregoryKimball GregoryKimball added this to the Enable streams milestone Oct 21, 2022
@GregoryKimball GregoryKimball added libcudf Affects libcudf (C++/CUDA) code. Performance Performance related issue and removed feature request New feature or request Needs Triage Need team to review and classify labels Oct 21, 2022
@GregoryKimball GregoryKimball removed this from the Enable streams milestone Oct 21, 2022
@jrhemstad
Copy link
Contributor

it would be great if explode could compute null count as part of that kernel.

I'm not following. explode returns a table:

std::unique_ptr<table> explode(

Are you then constructing a column_view for each of the lists that were exploded?

If so, then yeah, you're going to have a problem with computing the null count of each of those column_views individually.

To make that efficient, you'd have to do what we do in cudf::split where we compute the individual null counts in bulk with a single segmented_null_count:

/**
* @brief Given a validity bitmask, counts the number of null elements (unset
* bits) in every range `[indices[2*i], indices[(2*i)+1])` (where 0 <= i <
* indices.size() / 2).
*
* If `bitmask == nullptr`, all elements are assumed to be valid and a vector of
* length `indices.size()` containing all zeros is returned.
*
* @throws cudf::logic_error if `indices.size() % 2 != 0`
* @throws cudf::logic_error if `indices[2*i] < 0 or indices[2*i] > indices[(2*i)+1]`
*
* @param[in] bitmask Validity bitmask residing in device memory.
* @param[in] indices A host_span of indices specifying ranges to count the number of null elements.
* @param[in] stream CUDA stream used for device memory operations and kernel launches.
* @return A vector storing the number of null elements in each specified range.
*/
std::vector<size_type> segmented_null_count(bitmask_type const* bitmask,
host_span<size_type const> indices,
rmm::cuda_stream_view stream);

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
libcudf Affects libcudf (C++/CUDA) code. Performance Performance related issue Spark Functionality that helps Spark RAPIDS
Projects
Status: Needs owner
Development

No branches or pull requests

3 participants