Skip to content

Fixed bug #9733 where stat functions returned a python scalar for empty series #9829

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

remiremi
Copy link

@remiremi remiremi commented Apr 7, 2015

closes #9733

@jreback jreback added Dtype Conversions Unexpected or buggy dtype conversions Compat pandas objects compatability with Numpy or Python functions labels Apr 7, 2015
@jreback jreback added this to the 0.16.1 milestone Apr 7, 2015
@jreback
Copy link
Contributor

jreback commented Apr 7, 2015

can you run thru all dtypes (look in pandas/src/inference.pyx) for a list
and do this for say sum/mean on empties and see what comes back and post it here. thanks.

obviously some of these will raise.

@jreback
Copy link
Contributor

jreback commented Apr 17, 2015

@remiremi can you update

@remiremi
Copy link
Author

Yes, I have updated unit tests to loop through types and methods.

For numpy types, I compare the result type with the equivalent function in numpy.

For the string types, I guess the result should be empty string for the sum and nan for the rest.

It still needs some work: master...Remiremi:issue_9733_bis

Let me know if you don't think this is the right direction.

@jreback
Copy link
Contributor

jreback commented Apr 17, 2015

you can't change this line, otherwise other platforms will break
the_sum = values.sum(axis, dtype=dtype_max)

uints are very odd, not sure how to handle all this

@jreback jreback modified the milestones: 0.17.0, 0.16.1 Apr 23, 2015
@jreback
Copy link
Contributor

jreback commented May 9, 2015

@remiremi can you update? I

@jreback
Copy link
Contributor

jreback commented Jul 28, 2015

can you rebase / update?

@jreback
Copy link
Contributor

jreback commented Aug 15, 2015

@remiremi this looked ok, but I would like to run thru a number of dtypes comprehensively for this.

@jreback
Copy link
Contributor

jreback commented Aug 15, 2015

need a release note as well

@remiremi remiremi force-pushed the issue_9733 branch 2 times, most recently from 532529f to 8d8c472 Compare August 16, 2015 12:41
@remiremi
Copy link
Author

Just found some time to update the pull request, and added mention in v0.17.0.txt. Let me know

@jreback
Copy link
Contributor

jreback commented Aug 19, 2015

@remiremi well, you have taken on a task here. Can you update for the failures.

@remiremi
Copy link
Author

Mmmh yes, my fix is taking the wrong approach. I'll update later this week.

@remiremi
Copy link
Author

Just updated the branch but there's still a test failure (in test_resample, test_aggregate_with_nat). It looks to me like a bug in TimeGrouper. It somehow calls nanprod on an empty series, which expectedly returns 1.

@jreback
Copy link
Contributor

jreback commented Aug 21, 2015

@remiremi this whole thing is a can of worms, I have a very similar fix, see here for #9422

but actually getting this to work is annoying. Because the tests are now dependent on whether bottleneck is installed for testing (and what version), e.g. < 1.0 is the old behavior, >= 1.0 is the new. To make more complicated we actually turn off bottleneck for testing several places (though that is now fixed and consistent).

So need to fix both these issues, and they follow pretty much the same general soln. Want to have go?

@remiremi
Copy link
Author

Yeah, will give it a shot when I have some time next week

On Fri, Aug 21, 2015 at 3:36 PM, Jeff Reback notifications@github.com
wrote:

@remiremi https://github.com/remiremi this whole thing is a can of
worms, I have a very similar fix, see here
#10815 for #9422
#9422

but actually getting this to work is annoying. Because the tests are now
dependent on whether bottleneck is installed for testing (and what
version), e.g. < 1.0 is the old behavior, >= 1.0 is the new. To make more
complicated we actually turn off bottleneck for testing several places
(though that is now fixed and consistent).

So need to fix both these issues, and they follow pretty much the same
general soln. Want to have go?


Reply to this email directly or view it on GitHub
#9829 (comment).

@jreback jreback modified the milestones: Next Major Release, 0.17.0 Aug 31, 2015
@remiremi
Copy link
Author

@jreback I rebased today and ran the tests locally and the test test_aggregate_with_nat from test_resample is now successful! Do you know what change might have fixed the issue in TimeGrouper's groupby?

I guess all that needs to be done now is to make sure the tests still pass with both bottle < 1.0 and bottle >= 1.0, right?

@remiremi
Copy link
Author

All tests passed with bottleneck 1.0 but there was a failure with 0.8... Getting closer

@jreback
Copy link
Contributor

jreback commented Nov 10, 2015

I know this is a beast!

closing, but pls reopen if you'd like to work on this again.

@jreback jreback closed this Nov 10, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Compat pandas objects compatability with Numpy or Python functions Dtype Conversions Unexpected or buggy dtype conversions
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Series.sum has inconsistent return type
2 participants