
Doc/df filter #12395


Closed
wants to merge 4 commits into from

cswarth
Contributor

@cswarth cswarth commented Feb 19, 2016

Updates doc comments for DataFrame.filter and adds usage examples.

DataFrame.filter(items=None, like=None, regex=None, axis=None)

Subset rows or columns of dataframe according to labels in the index.

Note that this routine does not filter a dataframe on its contents. The filter is applied to the labels of the index. This method is a thin veneer on top of DataFrame.select.

Parameters:

items : list-like

List of info axis labels to restrict to (not all of them need to be present)

like : string

Keep info axis labels where “like in label == True”

regex : string (regular expression)

Keep info axis with re.search(regex, col) == True

axis : int or None

The axis to filter on.

Returns:

same type as input object with filtered info axis

Notes

The items, like, and regex parameters should be mutually exclusive, but this is not checked.

axis defaults to the info axis that is used when indexing with [].
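
The three selection modes described above can be sketched in plain Python. This is only an illustration of the documented semantics, not the pandas implementation: select_labels is a hypothetical helper, and the explicit mutual-exclusivity check is an assumption added here because the note says the real method does not perform one.

```python
import re


def select_labels(labels, items=None, like=None, regex=None):
    """Hypothetical helper: return the labels a filter-style call would keep."""
    given = [arg for arg in (items, like, regex) if arg is not None]
    if len(given) != 1:
        # The docstring calls the three arguments mutually exclusive but
        # unchecked; this sketch checks explicitly (an assumption).
        raise TypeError("pass exactly one of items, like, or regex")
    if items is not None:
        # Requested items that are absent from the axis are skipped.
        return [name for name in items if name in labels]
    if like is not None:
        # Substring test against each label.
        return [name for name in labels if like in str(name)]
    # Regular-expression search against each label.
    pattern = re.compile(regex)
    return [name for name in labels if pattern.search(str(name)) is not None]


columns = ["one", "two", "three"]
by_items = select_labels(columns, items=["one", "three", "missing"])
by_like = select_labels(columns, like="t")
by_regex = select_labels(columns, regex="e$")
```

Here by_items and by_regex both keep 'one' and 'three', while by_like keeps 'two' and 'three'.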

Examples

>>> df
        one  two  three
mouse     1    2      3
rabbit    4    5      6
>>> # select columns by name
>>> df.filter(items=['one', 'three'])  
        one  three
mouse     1      3
rabbit    4      6
>>> # select columns by regular expression
>>> df.filter(regex='e$', axis=1)
        one  three
mouse     1      3
rabbit    4      6
>>> # select rows containing 'bbi'
>>> df.filter(like='bbi', axis=0)
        one  two  three
rabbit    4    5      6
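
To make the note about labels versus contents concrete, here is a small sketch that rebuilds the example frame (the constructor call is an assumption, since the description only shows the printed frame) and contrasts a label filter with ordinary boolean indexing on values.

```python
import pandas as pd

# Rebuild the frame shown in the examples above.
df = pd.DataFrame(
    [[1, 2, 3], [4, 5, 6]],
    index=["mouse", "rabbit"],
    columns=["one", "two", "three"],
)

# Label-based: keep rows whose index label contains "bbi".
by_label = df.filter(like="bbi", axis=0)

# Content-based: keep rows whose value in column "one" exceeds 1.
# This is boolean indexing, which filter does not do.
by_value = df[df["one"] > 1]

# No index label contains "4", so the label filter keeps no rows,
# even though the value 4 appears in the data.
no_match = df.filter(like="4", axis=0)
```

Both by_label and by_value end up holding only the rabbit row, but they get there by inspecting different things: the index labels in the first case and the cell values in the second.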

@max-sixty
Contributor

I've never understood exactly what the role of this method is - it seems to do a lot of stuff not very well, particularly when select (also not well documented) is available.

Is it worth looking at filter + select + query, and thinking about whether they all need to be in the API?

( @cswarth thanks for the docs, they look much better imo)

@shoyer
Member

shoyer commented Feb 19, 2016

Is it worth looking at filter + select + query, and thinking about whether they all need to be in the API?

Yes, this is definitely worth considering -- maybe open another issue for this? My inclination would be to drop filter and select in favor of just using indexing operations.
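
For reference, a rough sketch of how the three filter examples from the PR description might be written with plain indexing operations. The thread does not spell out these equivalents, so treat the mapping as an assumption; the frame is rebuilt from the printed example.

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 2, 3], [4, 5, 6]],
    index=["mouse", "rabbit"],
    columns=["one", "two", "three"],
)

# filter(items=['one', 'three']): select columns by an explicit label list.
cols_by_name = df[["one", "three"]]

# filter(regex='e$', axis=1): boolean mask built from the column labels.
cols_by_regex = df.loc[:, df.columns.str.contains("e$", regex=True)]

# filter(like='bbi', axis=0): boolean mask built from the index labels.
rows_by_substring = df.loc[df.index.str.contains("bbi", regex=False)]
```

One difference worth keeping in mind: indexing with a list that contains a missing label raises a KeyError, whereas filter(items=...) skips labels that are not present.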

@cswarth
Contributor Author

cswarth commented Feb 19, 2016

Bummer, all CI tests have failed on account of a failure in lint.sh

https://travis-ci.org/pydata/pandas/jobs/110452254#L1231

It's failing on generic.py so this is probably my fault. I'm looking into this.

@TomAugspurger
Contributor

you can run git diff upstream/master | flake8 --diff to just lint your section.

@cswarth
Contributor Author

cswarth commented Feb 19, 2016

darn it, ignore this PR for now - I merged instead of rebasing. There really aren't 900+ changes here.
I'll fix it.

@cswarth cswarth closed this Feb 19, 2016
@cswarth cswarth deleted the doc/df_filter branch February 19, 2016 23:59
@jorisvandenbossche
Member

A previous attempt that didn't get merged to clear up the sky on filter / select is here: #6599
Probably worth a look for such a discussion

@jorisvandenbossche
Member

@cswarth for future reference, you don't need to close a PR if you screwed up a rebase. Just clean up the branch (or make the branch again with the same name) and force push it again.

@cswarth
Contributor Author

cswarth commented Feb 20, 2016

ah ok, thank you. I apologize for making such a simple task difficult.
I deleted the branch before pushing and that automatically nuked the PR.
I'm assuming my only option now is to create a new PR?

@jorisvandenbossche
Member

Yes, I think so. Also if you force pushed while the PR was closed, you cannot open it again (I find that a quite annoying thing of github)

@jreback
Contributor

jreback commented Feb 20, 2016

note we had quite a bit of discussion w.r.t. deprecating filter/select here


6 participants