Series upconversion in apply #2316

Closed
jreback opened this issue Nov 21, 2012 · 7 comments

@jreback
Contributor

jreback commented Nov 21, 2012

See the examples below. My use case started out as trying to use apply on a Series (with the applied function returning a Series) in order to get a DataFrame, but I am seeing some inconsistencies in how the returned values are accumulated. (Obviously this can also be done by iterating over the Series in a list comprehension and using concat.)

In [1]: import pandas as pd

In [2]: import numpy as np

In [3]: print pd.__version__
0.8.1

In [4]: 

In [4]: s  = pd.Series(np.random.rand(5))

In [5]: s
Out[5]: 
0    0.681552
1    0.299908
2    0.212221
3    0.365597
4    0.779190

In [6]: print type(s)
<class 'pandas.core.series.Series'>

In [7]: print type(s[0])
<type 'numpy.float64'>

base case

In [6]: y = s.apply(lambda x: x)

In [7]: print type(y)
<class 'pandas.core.series.Series'>

In [8]: print type(y[0])
<type 'numpy.float64'>

In [9]: print y[0]
0.681551901344

This looks like it is doing some sort of conversion, e.g. using the values of the returned Series to construct the applied Series.

In [10]: y = s.apply(lambda x: pd.Series(x))

In [11]: print type(y)
<class 'pandas.core.series.Series'>

In [12]: print type(y[0])
<type 'numpy.float64'>

In [13]: print y[0]
0.681551901344

This is a bit inconsistent with the prior example.
Should this be upconverting to a DataFrame?

In [14]: y = s.apply(lambda x: pd.Series(x, index = ['foo']))

In [15]: print type(y)
<class 'pandas.core.series.Series'>

In [16]: print type(y[0])
<class 'pandas.core.series.Series'>

In [17]: print y[0]
foo    0.681552
@dalejung
Contributor

What does s[0] print for you? Series output will be abbreviated.

@jreback
Contributor Author

jreback commented Nov 22, 2012

Updated. s[0] is just a float.

@wesm
Member

wesm commented Nov 26, 2012

I'm getting something different, what commit of pandas are you on?

In [14]: s
Out[14]: 
0    0.432431
1    0.511677
2    0.187196
3    0.108227
4    0.987615

In [15]: s.apply(lambda x: pd.Series(x, index = ['foo']))
Out[15]: 
0   NaN
1   NaN
2   NaN
3   NaN
4   NaN

@wesm
Member

wesm commented Nov 26, 2012

The issue is that apply tries to infer whether the function passed is a ufunc by calling it on the Series itself and checking whether the result is an ndarray. I'm not sure whether this is right, though. Perhaps if the result is an array, you should get back a DataFrame?
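
The probing heuristic described above can be modeled roughly as follows. This is a hypothetical sketch, not the pandas source: the helper name `apply_with_ufunc_probe` is invented for illustration, and the `pd.Series` check stands in for the fact that Series was an ndarray subclass in pandas releases of that era. Under those assumptions it shows why a function that builds a Series indexed by 'foo' can come back as all NaN.

```python
import numpy as np
import pandas as pd


def apply_with_ufunc_probe(series, func):
    """Hypothetical model of the heuristic; not the pandas source."""
    try:
        # Probe: call func once on the whole Series, as a ufunc would be called.
        probed = func(series)
    except Exception:
        probed = None

    # Assumption: a Series result counts as "array-like", mirroring the era
    # when Series subclassed ndarray.
    if isinstance(probed, (np.ndarray, pd.Series)):
        # The probe result is realigned to the original index.
        return pd.Series(probed, index=series.index)

    # Otherwise fall back to calling func element by element.
    return pd.Series([func(x) for x in series], index=series.index)


s = pd.Series([0.25, 1.0, 4.0])

# A real ufunc passes the probe and gives the expected values.
sqrt_result = apply_with_ufunc_probe(s, np.sqrt)

# pd.Series(s, index=['foo']) reindexes s to the label 'foo', which does not
# exist, so the probe yields NaN; realigning to 0..2 then yields NaN again.
foo_result = apply_with_ufunc_probe(s, lambda x: pd.Series(x, index=["foo"]))
```

In this model the lambda is never called on individual floats, because the probe on the whole Series already returned something array-like.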

@jreback
Copy link
Contributor Author

jreback commented Nov 26, 2012

using 0.9.1 release...seems to work fine....

my original motivation was to do an apply operation and have it (possibly) upconverted...

should these do essentially the same thing? (or is this just asking for trouble?)

essentially apply is a concat operation anyhow.

pd.concat([ pd.Series(y, index = ['foo']) for x, y in s.iteritems() ], axis=1)

s.apply(lambda x: pd.Series(x, index = ['foo']))
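
For anyone comparing the two expressions on a recent pandas, here is a minimal sketch. It assumes a pandas version where `Series.items()` replaces `iteritems()` and where apply expands Series results into a DataFrame; the variable names are illustrative. Note that concat with axis=1 places each element in its own column, so the result has to be transposed before it lines up with the apply output.

```python
import pandas as pd

s = pd.Series([0.1, 0.2, 0.3])

# Each per-element Series becomes one column: index ['foo'], columns 0..2.
wide = pd.concat([pd.Series(v, index=["foo"]) for _, v in s.items()], axis=1)

# Transpose to get one row per element of s.
tall = wide.T

# With the upconverting behavior, apply returns one row per element directly.
applied = s.apply(lambda x: pd.Series(x, index=["foo"]))
```

So the two are equivalent up to a transpose in this sketch; whether that holds in a given release depends on how that release's apply handles Series return values.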

@wesm wesm closed this as completed in d4810f0 Nov 26, 2012
@wesm
Member

wesm commented Nov 26, 2012

Let me know if anyone disagrees with this API change. Seems somewhat reasonable:

In [1]: s  = pd.Series(np.random.rand(5))

In [2]: s
Out[2]: 
0    0.808653
1    0.074522
2    0.824759
3    0.429260
4    0.821308

In [3]: s.apply(lambda x: pd.Series([x, x**2], index=['x', 'x^2']))
Out[3]: 
          x       x^2
0  0.808653  0.653919
1  0.074522  0.005554
2  0.824759  0.680228
3  0.429260  0.184264
4  0.821308  0.674547
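
The same behavior as a standalone script with fixed input values, so the result can be checked. This is a sketch assuming a pandas version that expands Series return values into DataFrame columns; the `vectorized` variant is an addition for comparison, not something proposed in the thread.

```python
import pandas as pd

s = pd.Series([1.0, 2.0, 3.0])

# Each call returns a two-element Series; its index labels become the columns.
expanded = s.apply(lambda x: pd.Series([x, x**2], index=["x", "x^2"]))

# When the columns can be computed on the whole Series at once, building the
# DataFrame directly avoids creating one Series per element.
vectorized = pd.DataFrame({"x": s, "x^2": s**2})
```

The apply form is the more general one, since the function can return different labels per element; the direct construction only covers the case where every column is a whole-Series expression.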

@jreback
Contributor Author

jreback commented Nov 26, 2012

you are awesome! thanks
