DOC: Categorical sort_values and sort Documentation #12785
Comments
prob an oversight, should be using |
Given how straightforward the PR is, would it be for |
it's a bug, why would it not be 0.18.1? |
Just wanted to double check. 😄 |
Are we sure this is a bug? In any case, I think it can be quite annoying if you cannot sort values (eg to just group them). |
you could be right @jorisvandenbossche, I don't really remember. cc @JanSchulz |
@jorisvandenbossche : Is calling @everyone : Regardless of whether this is a bug or not, something will have to give. The documentation insists that |
That was a conscious decision sometime after implementing categoricals (and I just saw that the docstring was not changed: https://github.com/pydata/pandas/blob/master/pandas/core/categorical.py#L205) commit is here: 87fec5b |
Yep, I think it started here: #9611 (comment) |
But in the end, it was changed here: #9622 (@JanSchulz you can see that in the commit (below the commit message)) |
Perhaps refactoring |
What are people's thoughts on this? Should we stick with the current functionality, which is consistent with R but seems to be logically inconsistent, or should it be enforced? |
Personally, I would leave it as is. Anyway, if we do the above, the docs of
Another numpy compat issue .. :-) |
@jorisvandenbossche : Don't worry about the |
@everyone: I think if it is just a semantics thing, it's just a DOC-change PR then (not an API one as I had originally titled it)? |
yes why don't u update doc string and docs as needed to be more clear |
I have generalized this issue to just questions about |
@gfyoung Because having good docs is hard and needs many eyes. |
@jorisvandenbossche : Ah, that's what I actually meant. Sorry, the "it" was a little vague! My question is: why is |
@gfyoung |
@jreback : Does
>>> c = pd.Categorical([np.nan, 2, 2, np.nan, 5])
>>> c.sort_values(ascending=True, na_position='first')
[NaN, NaN, 2.0, 2.0, 5.0]
Categories: (2, int64): [2, 5]
>>> c.sort_values(ascending=True, na_position='last')
# expected : [2.0, 2.0, 5.0, NaN, NaN]
[NaN, NaN, 2.0, 2.0, 5.0]
Categories: (2, int64): [2, 5] |
@jreback : Ah, okay, so it makes sense to align with the |
yep, misspoke, it's independent, but would like to tie |
@jreback : IMO the |
Clarifies the meaning of 'sort' in the context of Categorical to mean 'organization' rather than 'order', as it is possible to call this method (as well as 'sort_values') when the Categorical is unordered. Also patches a bug in 'Categorical.sort_values' in which 'na_position' was not being respected when 'ascending' was set to 'True'. This commit aligns the behaviour with that of Series. Finally, this commit deprecates 'sort' in favor of 'sort_values', which is in alignment with the Series API as well. Closes gh-12785.
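For readers following along, here is a rough sketch of the 'na_position' behaviour the commit message describes. It is not the pandas implementation: 'sort_codes' is a hypothetical helper written for illustration, and the only detail taken from pandas is that a Categorical stores missing values as the code -1.

```python
def sort_codes(codes, ascending=True, na_position="last"):
    """Hypothetical helper: sort integer category codes, treating -1 as missing."""
    if na_position not in ("first", "last"):
        raise ValueError("na_position must be 'first' or 'last'")
    # Sort only the non-missing codes, so the sort direction never moves the NaNs.
    present = sorted((c for c in codes if c != -1), reverse=not ascending)
    # Re-attach the missing codes at whichever end the caller asked for.
    missing = [-1] * sum(1 for c in codes if c == -1)
    return missing + present if na_position == "first" else present + missing


# Codes for the values [NaN, 2, 2, NaN, 5] with categories [2, 5].
codes = [-1, 0, 0, -1, 1]
first = sort_codes(codes, na_position="first")
last = sort_codes(codes, na_position="last")
desc_last = sort_codes(codes, ascending=False, na_position="last")
```

Handling the missing codes separately from the sort is what keeps 'na_position' independent of 'ascending', which is the behaviour the commit says it aligns with Series.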
In categorical.py, we enforce the fact that self must be ordered when calling min or max. However, self can be unordered when calling sort_values. This doesn't make sense in my mind, for if you can sort the values for unordered self, then I can find a minimum value of self. The same comment applies to argsort as well.
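A small reproduction of the asymmetry described above, as a sketch. It assumes a pandas version in which min on an unordered Categorical raises TypeError; the variable names are made up for illustration.

```python
import pandas as pd

# An unordered Categorical: the categories carry no declared ranking.
cat = pd.Categorical(["b", "a", "c", "a"])

# sort_values arranges values by category position and does not require ordered=True.
sorted_values = list(cat.sort_values())

# min (and max) refuse to run on an unordered Categorical.
try:
    cat.min()
    min_raised = False
except TypeError:
    min_raised = True

# Declaring an order makes min meaningful.
ordered_min = cat.as_ordered().min()
```

So sort_values succeeds on the unordered Categorical, while min only works once the Categorical is marked as ordered.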