fixing rounding errors in to_dataframe in wfdb records #456
Thanks, Lama! I have to nitpick here, though. It looks like what this is actually doing is rounding the sample interval to the nearest nanosecond and then using fixed N-nanosecond steps (whereas previously it was rounding to the nearest microsecond.) So this code is much more accurate than before but still has the same fundamental problem. Here's how I figure the timestamp:
so the end time in nanoseconds should be 14:37:22.384920325. (On the other hand, 9 * 3600 + 17.566 + 6661119 * 0.016007043 - (24 + 14) * 3600 - 37 * 60 = 22.384261117 exactly.) I think this is a bug in pandas. But still, this is an improvement and good enough for many practical purposes.
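The arithmetic in that parenthetical can be checked with exact decimal arithmetic. This is only a sketch using Python's standard library: the variable names are illustrative, and every number is copied from the comment above rather than read from a record.

```python
from decimal import Decimal

# All values are copied from the comment above.
start_s = 9 * 3600 + Decimal("17.566")   # start offset in seconds, as written there
n_steps = 6661119                        # number of sample intervals
step_s = Decimal("0.016007043")          # sample interval rounded to whole nanoseconds

# Seconds past 14:37 on the following day when every step is exactly step_s.
fixed_step_end = start_s + n_steps * step_s - (24 + 14) * 3600 - 37 * 60

# End time the reviewer expects, as seconds past 14:37.
expected_end = Decimal("22.384920325")

gap_s = expected_end - fixed_step_end    # accumulated error over the whole record
per_step_s = gap_s / n_steps             # average error contributed by one step

print(fixed_step_end)   # 22.384261117
print(gap_s)            # 0.000659208
print(per_step_s)       # well under one nanosecond per step
```

The per-step error is below the nanosecond resolution of the index, which is consistent with the explanation that the interval is rounded to the nearest nanosecond and then repeated as a fixed step.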
This also fails in the case where base_datetime is None. Try it out with a record that doesn't have a base time, such as one of the records from |
Thank you Benjamin! I have updated the changes. Please merge at your convenience. |
I get different results (on pandas main branch as well as on pandas 1.1.5), so perhaps this issue has been fixed? Lama, what do you get if you run the following?
Below is my output:
My pandas version is 2.0.2 and has timedelta64[ns] |
Are you sure that's the same version you used before (that gave an end time of 2148-08-17 14:37:22.384261117)? Here's what I get:
Thanks for checking. It is really strange that we're seeing different results. While I'd like to get to the bottom of this, I don't really have time to debug it right now. It looks like the sequence of values originates from |
This pull request fixes issue #444.
This PR refactors the index generation logic in the to_dataframe method in the BaseRecord class.
The current code uses the freq argument in the pd.date_range and pd.timedelta_range functions to generate the index for the DataFrame. However, using the end argument improves the accuracy of the timestamps, which are otherwise off by 0.287 sec.
get_absolute_time is correct to the nearest microsecond (14:37:22.384920), and to_dataframe() is correct to the nearest nanosecond (14:37:22.384261117).
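To illustrate why a fixed freq step drifts while an end-anchored range does not, here is a standard-library sketch that deliberately avoids calling pandas. It assumes a sampling frequency of 62.4725 Hz, which is not stated anywhere in this PR; it is used only because it is consistent with the figures quoted here. The helper names fixed_step_end and anchored_end are hypothetical and are not part of wfdb or pandas.

```python
from fractions import Fraction

# ASSUMPTION: sampling frequency in Hz, inferred from the quoted figures, not from the record.
FS_HZ = Fraction("62.4725")
# Number of sample intervals, as quoted in the review discussion.
N_STEPS = 6661119


def fixed_step_end(units_per_second):
    """Round one sample interval to whole time units, then repeat it N_STEPS times."""
    step = round(units_per_second / FS_HZ)
    return N_STEPS * step


def anchored_end(units_per_second):
    """Compute the exact end offset first and round only once."""
    return round(N_STEPS * units_per_second / FS_HZ)


# Drift of the final timestamp when the step is rounded to microseconds / nanoseconds.
drift_us = anchored_end(10**6) - fixed_step_end(10**6)
drift_ns = anchored_end(10**9) - fixed_step_end(10**9)

print(f"microsecond step: end is off by {drift_us} us")
print(f"nanosecond step:  end is off by {drift_ns} ns")
```

Under that assumed frequency, the microsecond-rounded step accumulates about 0.287 s of drift and the nanosecond-rounded step about 0.00066 s, because the per-step rounding error is multiplied by the number of samples, whereas the anchored end point is rounded only once.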
Please review the PR at your convenience.