regex filter on non-string columns raises an exception #5798

Closed
jseabold opened this issue Dec 30, 2013 · 3 comments · Fixed by #30222
Labels
good first issue Needs Tests Unit test(s) needed to prevent regressions
Comments

@jseabold
Contributor

I'm not really sure what should be done here. At the least, a better error message, I suppose. Perhaps just skip non-string columns, or maybe even try to turn them into strings? E.g., make match("\d", 123) work? Obviously this could run into problems with labels that can't be converted (or that do "unexpected" things) when turned into string/unicode.

>>> pd.version.version
'0.12.0-1149-g141e93a'

(If the columns are all numerical, it also fails, with a different message.)
>>> pd.DataFrame(np.random.random((3,2)), columns=['STRING', 123]).filter(regex='STRING')
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-427-e5acd93a79d6> in <module>()
----> 1 pd.DataFrame(np.random.random((3,2)), columns=['STRING', 123]).filter(regex='STRING')

/usr/local/lib/python2.7/dist-packages/pandas-0.12.0_1149_g141e93a-py2.7-linux-x86_64.egg/pandas/core/generic.pyc in filter(self, items, like, regex, axis)
   1522             matcher = re.compile(regex)
   1523             return self.select(lambda x: matcher.search(x) is not None,
-> 1524                                axis=axis_name)
   1525         else:
   1526             raise TypeError('Must pass either `items`, `like`, or `regex`')

/usr/local/lib/python2.7/dist-packages/pandas-0.12.0_1149_g141e93a-py2.7-linux-x86_64.egg/pandas/core/generic.pyc in select(self, crit, axis)
   1116         if len(axis_values) > 0:
   1117             new_axis = axis_values[
-> 1118                 np.asarray([bool(crit(label)) for label in axis_values])]
   1119         else:
   1120             new_axis = axis_values

/usr/local/lib/python2.7/dist-packages/pandas-0.12.0_1149_g141e93a-py2.7-linux-x86_64.egg/pandas/core/generic.pyc in <lambda>(x)
   1521         elif regex:
   1522             matcher = re.compile(regex)
-> 1523             return self.select(lambda x: matcher.search(x) is not None,
   1524                                axis=axis_name)
   1525         else:

TypeError: expected string or buffer
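
The failure in the traceback above can be reproduced without pandas: a compiled `re` pattern only accepts strings, so calling `search` on an integer label raises `TypeError`. The following is a minimal sketch, not pandas code. `filter_labels` is a hypothetical helper that coerces every label to `str` before matching, which is one of the options floated above.

```python
import re


def filter_labels(labels, regex):
    """Return the labels whose string form matches ``regex``.

    Hypothetical helper for illustration only; not part of pandas.
    """
    matcher = re.compile(regex)
    # str(label) lets non-string labels such as 123 take part in the match
    return [label for label in labels if matcher.search(str(label)) is not None]


labels = ["STRING", 123]

try:
    # This is what the lambda in the traceback ends up doing for label 123
    re.compile("STRING").search(123)
except TypeError as exc:
    print("TypeError:", exc)

print(filter_labels(labels, "STRING"))  # ['STRING']
print(filter_labels(labels, r"\d"))     # [123]
```

Coercing with `str` keeps the original label objects in the result, so an integer column stays an integer column; only the comparison sees the string form.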
@jreback
Contributor

jreback commented Dec 30, 2013

yep...needs better error msg

@dylancis

Especially when dealing with a multi-level index. We need to drop levels to make it a single index before being able to use the filter method. Perhaps we could add a 'level' argument to the filter method so we can specify which level to look at.
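
Until something like a 'level' argument exists, the same effect can be approximated by building a boolean mask from one level of the columns. This is only a sketch under stated assumptions: `filter_by_level` is a hypothetical helper, not a pandas API, and it converts the level's labels to strings so that non-string labels do not break the match.

```python
import numpy as np
import pandas as pd


def filter_by_level(df, regex, level):
    """Keep the columns whose label at ``level`` matches ``regex``.

    Hypothetical helper for illustration; not part of pandas.
    """
    # Labels of the chosen level, one per column, coerced to strings
    labels = df.columns.get_level_values(level).astype(str)
    mask = labels.str.contains(regex, regex=True)
    return df.loc[:, mask]


columns = pd.MultiIndex.from_tuples(
    [("a", "STRING"), ("a", 123), ("b", "other")]
)
df = pd.DataFrame(np.zeros((3, 3)), columns=columns)

result = filter_by_level(df, "STRING", level=1)
print(result.columns.tolist())  # [('a', 'STRING')]
```

The mask approach leaves the MultiIndex intact, so no level has to be dropped and restored around the call.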

@jreback jreback modified the milestones: 0.15.0, 0.14.0 Apr 4, 2014
@jreback jreback modified the milestones: 0.16.0, Next Major Release Mar 3, 2015
@datapythonista datapythonista modified the milestones: Contributions Welcome, Someday Jul 8, 2018
@mroeschke
Member

Looks to work correctly on master. Could use a test.

In [5]: pd.__version__
Out[5]: '0.26.0.dev0+684.g953757a3e'

In [6]: pd.DataFrame(np.random.random((3,2)), columns=['STRING', 123]).filter(regex='STRING')
Out[6]:
     STRING
0  0.904452
1  0.546133
2  0.691725
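
A regression test for the behaviour shown above could look roughly like this. It is a sketch, not necessarily the test that was eventually merged; the test name is made up, and it assumes a pandas version in which the regex filter already handles non-string labels.

```python
import numpy as np
import pandas as pd
import pandas.testing as tm


def test_filter_regex_non_string_columns():
    # Mixed column labels: one string and one integer
    df = pd.DataFrame(np.random.random((3, 2)), columns=["STRING", 123])

    result = df.filter(regex="STRING")

    # Only the string-labelled column should survive, with values unchanged
    expected = df[["STRING"]]
    tm.assert_frame_equal(result, expected)


test_filter_regex_non_string_columns()
```

Comparing against `df[["STRING"]]` checks both the surviving labels and the data, so the test would also catch a filter that returned the right column name with the wrong values.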

@mroeschke mroeschke added good first issue Needs Tests Unit test(s) needed to prevent regressions and removed Error Reporting Incorrect or improved errors from pandas labels Oct 27, 2019
@jreback jreback modified the milestones: Someday, 1.0 Dec 12, 2019