BUG: Avoid undefined behaviour when converting from float to timedelta #28918

siddhesh · 2019-10-11T06:28:40Z

Summation of a timedelta series containing NaTs results in undefined
behaviour because the final wrapping step of the summation converts
the NaNs in the sum through a direct cast to int64. This cast is
undefined for NaN and just happens to work on x86_64 because
cvttsd2si returns 0x8000000000000000 for invalid input, which is the
int64 bit pattern of NaT. On AArch64, the corresponding fcvtzs sets
the result to 0 for NaN input.

This fix trivially sets the conversion target to m8 instead of i8 so
that numpy correctly casts NaN to NaT. Note that the corresponding
numpy fix is pending in PR numpy/numpy#14669.
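
To illustrate the difference between the two casts, here is a minimal NumPy sketch. The array `sums` and its values are made up for illustration, and the NaN-to-NaT result assumes a NumPy build that includes the fix referenced above; on older builds the outcome may still depend on the platform.

```python
import numpy as np

# Hypothetical float sums: one valid total (in ns) and one NaN,
# as produced when a column contained no valid timedeltas.
sums = np.array([3.0e9, np.nan])

# Casting straight to timedelta64 lets NumPy handle NaN itself
# and map it to NaT instead of relying on a float -> int64 cast.
as_timedelta = sums.astype("m8[ns]")
print(as_timedelta)
print(np.isnat(as_timedelta))

# The old path: float -> int64 is undefined for NaN, so the value
# in the second slot depends on the CPU (no output is claimed here).
with np.errstate(invalid="ignore"):
    as_int = sums.astype("i8")
print(as_int.view("m8[ns]"))
```

The first cast is the one the patch switches to; the second is shown only to make the platform-dependent step visible.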

There is an existing test (test_sum_nanops_timedelta in
frame/test_analytics.py) that exercises this bug and has been verified
to have been fixed with this and the numpy patch.

jreback · 2019-10-11T12:28:07Z

> There is an existing test (test_sum_nanops_timedelta in
> frame/test_analytics.py) that exercises this bug and has been verified
> to have been fixed with this and the numpy patch.

is there a test that would fail currently on our CI and pass with this patch? or is this just platform specific?

jreback · 2019-10-11T12:28:16Z

cc @jbrockmendel

siddhesh · 2019-10-11T14:20:19Z

>> There is an existing test (test_sum_nanops_timedelta in
>> frame/test_analytics.py) that exercises this bug and has been verified
>> to have been fixed with this and the numpy patch.
>
> is there a test that would fail currently on our CI and pass with this patch? or is this just platform specific?

It looks like all of your CI tests are x86-only, so you won't see this issue. If you run the testsuite on aarch64 though, you'll see test_sum_nanops_timedelta fail without this patch and pass with it.
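
A rough reproducer along the lines of that test might look like the following. This is a sketch, not the actual body of test_sum_nanops_timedelta; the frame contents are assumptions chosen so that one column consists entirely of missing values.

```python
import numpy as np
import pandas as pd

# Column "c" holds only missing values, so with min_count=1
# its sum should be NaT rather than a zero timedelta.
df = pd.DataFrame({"a": [0, 0], "b": [0, np.nan], "c": [np.nan, np.nan]})
tds = df.apply(pd.to_timedelta)

result = tds.sum(min_count=1)
print(result)
```

On an affected aarch64 setup, the description above implies that "c" would come back as a zero timedelta instead of NaT.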

jbrockmendel · 2019-10-11T15:43:22Z

pandas/core/nanops.py

@@ -360,7 +360,7 @@ def _wrap_results(result, dtype, fill_value=None):

            result = tslibs.Timedelta(result, unit="ns")
        else:
-            result = result.astype("i8").view(dtype)
+            result = result.astype("m8").view(dtype)


would making this m8[ns] break things?

siddhesh · 2019-10-11

It shouldn't. I've started a test run on my arm box with it.

jbrockmendel · 2019-10-11T15:44:57Z

> If you run the testsuite on aarch64 though

Do azure or travis offer such a thing? (Or anyone else?)

siddhesh · 2019-10-11T16:12:28Z

>> If you run the testsuite on aarch64 though
>
> Do azure or travis offer such a thing? (Or anyone else?)

I believe Travis does now, as of just 4 days ago:

https://blog.travis-ci.com/2019-10-07-multi-cpu-architecture-support

jreback · 2019-10-12T17:09:10Z

thanks @siddhesh

Would you be interested in adding a Travis build on ARM? (Can you open a separate issue to track it?)

jreback added the Timedelta label (Timedelta data type) Oct 11, 2019

jreback added this to the 1.0 milestone Oct 11, 2019

jbrockmendel reviewed Oct 11, 2019

jreback merged commit c387a28 into pandas-dev:master Oct 12, 2019

siddhesh mentioned this pull request Oct 15, 2019

Add arm64 builds to CI #28986

Closed
