[Dataset] Bump numpy >=1.20 dependency #20374

Merged
merged 3 commits into from
Nov 15, 2021

Conversation

@wuisawesome wuisawesome commented Nov 15, 2021

Why are these changes needed?

We need to ensure our numpy version is at least 1.20. With numpy 1.19 and earlier, the following code raises an error:

import pandas as pd
import torch
pd.DataFrame([torch.arange(27).reshape(3,3,3) for _ in range(10)])

Right now, this is pretty fundamental to how we build datasets of tensors, which is common in last-mile data preprocessing (e.g. a preprocessed image is most likely represented as a tensor).

Note that numpy 1.20 dropped support for Python 3.6, so the best we can do is enforce this version requirement on newer Python versions.
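
As a rough illustration of the version gate described above, here is a minimal sketch. The helper name `data_extras` and the placeholder `"pandas"` entry are assumptions made for the example, not the project's actual setup code.

```python
import sys


def data_extras(version_info=sys.version_info):
    """Build the requirement list for a hypothetical "data" extra."""
    # "pandas" is a placeholder entry, not the project's real dependency list.
    requirements = ["pandas"]
    if tuple(version_info[:2]) >= (3, 7):
        # numpy 1.20 dropped Python 3.6, so the pin only applies on 3.7+.
        requirements.append("numpy >= 1.20")
    return requirements


print(data_extras((3, 6)))  # ['pandas']
print(data_extras((3, 8)))  # ['pandas', 'numpy >= 1.20']
```

The same condition can also be written as a single requirement string with an environment marker, `numpy >= 1.20; python_version >= "3.7"`, which pip evaluates at install time.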

Related issue number

Closes #20258

Checks

  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

Alex Wu added 2 commits November 15, 2021 08:59
@@ -219,6 +219,10 @@ def get_packages(self):
],
}

if sys.version_info >= (3, 7):
    # Numpy dropped python 3.6 support in 1.20.
    setup_spec.extras["data"].append("numpy >= 1.20")

The data extras are what's actually installed when you run pip install ray[data]. Those requirements files should be for additional Ray developer add-ons.

ericl commented Nov 15, 2021

@wuisawesome can you elaborate on what "doesn't work" means? Can we raise a better error message in this condition?

@wuisawesome

Yeah, by "doesn't work" I mean it throws this error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/ubuntu/.local/lib/python3.8/site-packages/pandas/core/frame.py", line 711, in __init__
    mgr = ndarray_to_mgr(
  File "/home/ubuntu/.local/lib/python3.8/site-packages/pandas/core/internals/construction.py", line 304, in ndarray_to_mgr
    values = _prep_ndarray(values, copy=copy)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/pandas/core/internals/construction.py", line 540, in _prep_ndarray
    values = np.array([convert(v) for v in values])
ValueError: only one element tensors can be converted to Python scalars

Note that other tensor types (like ndarray) also fail, but with different error messages. To produce better error messages, we could either swallow all errors and tell people to upgrade their numpy version (seems a little scary), or inspect the contents of the dataset and add special-case checks for the various tensor types when numpy < 1.20 on Python 3.6.
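
A third, lighter-weight possibility would be to check the numpy version up front and fail with an explicit message. The sketch below is only an illustration: `MIN_NUMPY`, `numpy_is_supported`, and `check_numpy_version` are hypothetical names, and the version string is passed in as an argument so the snippet does not depend on numpy being installed.

```python
import re

MIN_NUMPY = (1, 20)  # minimum (major, minor) version assumed for this example


def numpy_is_supported(version):
    """Return True if a version string such as "1.21.4" is at least MIN_NUMPY."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return (int(match.group(1)), int(match.group(2))) >= MIN_NUMPY


def check_numpy_version(version):
    """Raise an explicit error instead of letting a confusing ValueError surface later."""
    if not numpy_is_supported(version):
        raise ImportError(
            f"numpy {version} is too old to build a DataFrame of tensors; "
            "please upgrade to numpy >= 1.20 (requires Python 3.7+)."
        )


# In real use the argument would be numpy.__version__.
check_numpy_version("1.21.4")  # passes silently
```

A check like this avoids both swallowing unrelated errors and inspecting dataset contents, at the cost of rejecting old numpy versions even for data that would have worked.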

@wuisawesome

test_output is broken on master

@wuisawesome wuisawesome merged commit 884bb3d into ray-project:master Nov 15, 2021
@wuisawesome wuisawesome mentioned this pull request Nov 16, 2021
6 tasks
rkooo567 pushed a commit that referenced this pull request Nov 17, 2021
<!-- Please add a reviewer to the assignee section when you create a PR. If you don't have the access to it, we will shortly find a reviewer and assign them to your PR. -->

## Why are these changes needed?

The change in #20374 was interpreted by docker as a file redirect rather than a "greater than" (strangely enough, differently from how bash interprets it locally).
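
To illustrate the general failure mode, here is a small sketch of how a POSIX shell treats an unquoted `>` in a requirement specifier. It is not the actual Docker command from this change, which is not shown here. The helper `run_in_scratch_dir` is a made-up name, and the example assumes a POSIX `/bin/sh` is available.

```python
import os
import subprocess
import tempfile


def run_in_scratch_dir(command):
    """Run `command` through the shell in an empty temp dir.

    Returns (stdout, sorted list of files the command left behind).
    """
    with tempfile.TemporaryDirectory() as scratch:
        result = subprocess.run(
            command, shell=True, cwd=scratch,
            capture_output=True, text=True, check=True,
        )
        return result.stdout, sorted(os.listdir(scratch))


# Unquoted: the shell parses ">" as a redirect and writes to a file named "=1.20".
unquoted_out, unquoted_files = run_in_scratch_dir("echo numpy >=1.20")

# Quoted: the whole specifier reaches the command as a single argument.
quoted_out, quoted_files = run_in_scratch_dir("echo 'numpy>=1.20'")

print(repr(unquoted_out), unquoted_files)
print(repr(quoted_out), quoted_files)
```

Quoting the specifier (or dropping the whitespace and quoting the whole argument) keeps the shell from consuming the `>` before the installer sees it.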

<!-- Please give a short summary of the change and the problem this solves. -->

## Related issue number

<!-- For example: "Closes #1234" -->

## Checks

- [ ] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for https://docs.ray.io/en/master/.
- [ ] I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [ ] Unit tests
   - [ ] Release tests
   - [ ] This PR is not tested :(


Co-authored-by: Alex <alex@anyscale.com>
fishbone pushed a commit that referenced this pull request Nov 18, 2021
wuisawesome pushed a commit that referenced this pull request Nov 20, 2021
wuisawesome pushed a commit that referenced this pull request Nov 21, 2021
Successfully merging this pull request may close these issues.

[release][core-nightly] inference is failing.
5 participants