Skip to content

REF: give rank1d/2d same nan filling #41916

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Jun 17, 2021

Conversation

mzeitlin11
Copy link
Member

A piece broken off the rank_2d deduplication

@mzeitlin11 mzeitlin11 added Algos Non-arithmetic algos: value_counts, factorize, sorting, isin, clip, shift, diff Refactor Internal refactoring of code labels Jun 10, 2021
@@ -931,6 +931,32 @@ ctypedef fused rank_t:
int64_t


cdef rank_t get_rank_nan_fill_val(bint rank_nans_highest, ndarray[rank_t, ndim=1] _):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the second argument here need to be an ndarray or can it be a type?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the great suggestion! type makes sense as a way to allow the rank_t parameterization - messed around with this for a while trying to use type[rank_t], but kept running into failures when actually trying to define this value. For example, I wanted to do something like cdef type[rank_t] = values.dtype.type, but kept running into unhelpful errors (maybe this definition just doesn't make sense, but it compiled).

But just pushed a much cleaner solution than the previous way of handling this type specialization.

@simonjayhawkins
Copy link
Member

@mzeitlin11 do the stubs need updating for the extra function and extra parameter?

@mzeitlin11
Copy link
Member Author

mzeitlin11 commented Jun 10, 2021

@mzeitlin11 do the stubs need updating for the extra function and extra parameter?

This added function (and the function with a new param) can only be called from Cython code, so compilation should handle type checking correctness. For another internal function kth_smallest_c, I don't see it typed in algos.pyi, so I don't think anything needs to be added.

@simonjayhawkins
Copy link
Member

@mzeitlin11 do the stubs need updating for the extra function and extra parameter?

This added function (and the function with a new param) will only be called from Cython code, so compilation should handle type checking correctness. For another internal function kth_smallest_c, I don't see it typed in algos.pyi, so I don't think anything needs to be added.

@mzeitlin11 do the stubs need updating for the extra function and extra parameter?

This added function (and the function with a new param) will only be called from Cython code, so compilation should handle type checking correctness. For another internal function kth_smallest_c, I don't see it typed in algos.pyi, so I don't think anything needs to be added.

I see. they are both cdef so not exposed to Python.

@@ -931,6 +931,32 @@ ctypedef fused rank_t:
int64_t


cdef rank_t get_rank_nan_fill_val(bint rank_nans_highest, rank_t[:] _=None):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is there an _ here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can rank_t be required

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The _ was to signify that the argument is unused. Not requiring rank_t here is purposeful - the thought here was that we need an argument with rank_t present for fused type specialization, but with it being optional we never have to worry about passing anything.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i c, might be worth comment ing in the doc-string as this is non-obvious (oh you did this nvm)

@jreback jreback added this to the 1.4 milestone Jun 17, 2021
@@ -931,6 +931,32 @@ ctypedef fused rank_t:
int64_t


cdef rank_t get_rank_nan_fill_val(bint rank_nans_highest, rank_t[:] _=None):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i c, might be worth comment ing in the doc-string as this is non-obvious (oh you did this nvm)

@jreback jreback merged commit c704d04 into pandas-dev:master Jun 17, 2021
@mzeitlin11 mzeitlin11 deleted the rank_2d_dedup_chunk branch June 17, 2021 17:08
JulianWgs pushed a commit to JulianWgs/pandas that referenced this pull request Jul 3, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Algos Non-arithmetic algos: value_counts, factorize, sorting, isin, clip, shift, diff Refactor Internal refactoring of code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants