Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cast datetime and timedelta to signed 64-bit int #344

Merged
merged 7 commits into from
Dec 10, 2018
Merged

Cast datetime and timedelta to signed 64-bit int #344

merged 7 commits into from
Dec 10, 2018

Conversation

jakirkham
Copy link
Member

@jakirkham jakirkham commented Nov 21, 2018

Related to issue ( #342 )

The NaT type is represented as -0. As a result, casting to an unsigned integral fails and throws an error. However casting to a signed integral type does not have this problem and proceeds without issues.

TODO:

  • Add unit tests and/or doctests in docstrings
  • Add docstrings and API docs for any new/modified user-facing classes and functions
  • New/modified features documented in docs/tutorial.rst
  • Changes documented in docs/release.rst
  • Docs build locally (e.g., run tox -e docs)
  • AppVeyor and Travis CI passes
  • Test coverage is 100% (Coveralls passes)

@jakirkham
Copy link
Member Author

jakirkham commented Nov 21, 2018

Pushed another test. It is a bit broken still, but added it anyways as I could use some advice on how to fix it.

Edit: Should be fixed with the latest commit.

assert np.all(
fill_value.view((np.uint8, fill_value.itemsize)) ==
meta_dec['fill_value'].view((np.uint8, meta_dec['fill_value'].itemsize))
)
Copy link
Member Author

@jakirkham jakirkham Nov 21, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As NaT values may compare False (like NaN values) in the future, this is a workaround to handle that case while still making such comparisons True. The itemsize needs to be included for this to work on 0-D arrays (e.g. scalars).

ref: https://stackoverflow.com/a/49972198

@jakirkham
Copy link
Member Author

Please let me know if there is anything else needed here.

@jakirkham
Copy link
Member Author

Have resolved conflicts and rebased. Should be ready to go once CI clears.

The `NaT` type is represented as `-0`. As a result, casting to an
unsigned integral fails and throws an error. However casting to a signed
integral type does not have this problem and proceeds without issues.
Use a structured array with datetime and timedelta values and a fill
value of NaT to test a bunch of different workarounds for encoding and
decoding datetime and timedelta values and array.
for k in ['m8[s]', 'M8[s]']:
compressor = Blosc(cname='lz4', clevel=3, shuffle=2)
dtype = np.dtype(k)
fill_value = np.full((), np.nan, dtype=dtype)[()]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we use NaT as the fill value, to test it explicitly?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

E.g.:

Suggested change
fill_value = np.full((), np.nan, dtype=dtype)[()]
fill_value = np.full((), 'NaT', dtype=dtype)[()]

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm...that doesn't seem to work. 😕

ValueError: invalid literal for int() with base 10: 'NaT'

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pushed a different change that uses NaT explicitly. Please let me know what you think.

Copy link
Member

@alimanfoo alimanfoo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, looks good. Had a small comment on the metadata test. Also wondering if it would be worth adding a test to the test_creation module, to make sure you can use 'NaT' as a fill_value, e.g.:

z = zarr.full(100, fill_value='NaT', dtype='M8[s]')
assert np.all(np.isnat(z[:]))

@jakirkham
Copy link
Member Author

Should be ready for another look when you have a chance.

@alimanfoo
Copy link
Member

Thanks @jakirkham. I added in a test to verify you can pass 'NaT' as fill_value to the array creation functions. If CI passes I'll merge.

@jakirkham
Copy link
Member Author

Pushed one small change to fix a linting error. Looks like it is passing now.

@alimanfoo alimanfoo merged commit 8e18c81 into zarr-developers:master Dec 10, 2018
@alimanfoo
Copy link
Member

Awesome, thanks!

@jakirkham jakirkham deleted the hdl_nat branch December 10, 2018 14:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants