
[tests] Ray nightly image tests with pandas+numpy fails with TensorDType error #2452

Closed
tgaddair opened this issue Sep 5, 2022 · 6 comments · Fixed by #2493 or #2553

tgaddair (Collaborator) commented Sep 5, 2022

(_map_block_nosplit pid=31146)   self._tensor = np.array([np.asarray(v) for v in values])
(_map_block_nosplit pid=31146) 2022-09-04 16:31:45,179	INFO worker.py:754 -- Task failed with retryable exception: TaskID(85e1c1d08ad412b6ffffffffffffffffffffffff01000000).
(_map_block_nosplit pid=31146) Traceback (most recent call last):
(_map_block_nosplit pid=31146)   File "/opt/hostedtoolcache/Python/3.9.13/x64/lib/python3.9/site-packages/ray/air/util/data_batch_conversion.py", line 158, in _cast_ndarray_columns_to_tensor_extension
(_map_block_nosplit pid=31146)     df.loc[:, col_name] = TensorArray(col)
(_map_block_nosplit pid=31146)   File "/opt/hostedtoolcache/Python/3.9.13/x64/lib/python3.9/site-packages/ray/air/util/tensor_extensions/pandas.py", line 720, in __init__
(_map_block_nosplit pid=31146)     raise TypeError(
(_map_block_nosplit pid=31146) TypeError: Tried to convert an ndarray of ndarray pointers (object dtype) to a well-typed ndarray but this failed; convert the ndarray to a well-typed ndarray before casting it as a TensorArray, and note that ragged tensors are NOT supported by TensorArray. First 5 subndarray types: [dtype('uint8'), dtype('uint8'), dtype('uint8'), dtype('uint8'), dtype('uint8')]
(_map_block_nosplit pid=31146) 
(_map_block_nosplit pid=31146) The above exception was the direct cause of the following exception:
(_map_block_nosplit pid=31146) 
(_map_block_nosplit pid=31146) Traceback (most recent call last):
(_map_block_nosplit pid=31146)   File "python/ray/_raylet.pyx", line 715, in ray._raylet.execute_task
(_map_block_nosplit pid=31146)   File "python/ray/_raylet.pyx", line 719, in ray._raylet.execute_task
(_map_block_nosplit pid=31146)   File "/opt/hostedtoolcache/Python/3.9.13/x64/lib/python3.9/site-packages/ray/data/_internal/compute.py", line 449, in _map_block_nosplit
(_map_block_nosplit pid=31146)     for new_block in block_fn(block, *fn_args, **fn_kwargs):
(_map_block_nosplit pid=31146)   File "/opt/hostedtoolcache/Python/3.9.13/x64/lib/python3.9/site-packages/ray/data/dataset.py", line 482, in transform
(_map_block_nosplit pid=31146)     yield output_buffer.next()
(_map_block_nosplit pid=31146)   File "/opt/hostedtoolcache/Python/3.9.13/x64/lib/python3.9/site-packages/ray/data/_internal/output_buffer.py", line 74, in next
(_map_block_nosplit pid=31146)     block = self._buffer.build()
(_map_block_nosplit pid=31146)   File "/opt/hostedtoolcache/Python/3.9.13/x64/lib/python3.9/site-packages/ray/data/_internal/delegating_block_builder.py", line 64, in build
(_map_block_nosplit pid=31146)     return self._builder.build()
(_map_block_nosplit pid=31146)   File "/opt/hostedtoolcache/Python/3.9.13/x64/lib/python3.9/site-packages/ray/data/_internal/table_block.py", line 85, in build
(_map_block_nosplit pid=31146)     return self._concat_tables(tables)
(_map_block_nosplit pid=31146)   File "/opt/hostedtoolcache/Python/3.9.13/x64/lib/python3.9/site-packages/ray/data/_internal/pandas_block.py", line 110, in _concat_tables
(_map_block_nosplit pid=31146)     df = _cast_ndarray_columns_to_tensor_extension(df)
(_map_block_nosplit pid=31146)   File "/opt/hostedtoolcache/Python/3.9.13/x64/lib/python3.9/site-packages/ray/air/util/data_batch_conversion.py", line 160, in _cast_ndarray_columns_to_tensor_extension
(_map_block_nosplit pid=31146)     raise ValueError(
(_map_block_nosplit pid=31146) ValueError: Tried to cast column value to the TensorArray tensor extension type but the conversion failed. To disable automatic casting to this tensor extension, set ctx = DatasetContext.get_current(); ctx.enable_tensor_extension_casting = False.
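The inner TypeError above can be reproduced with plain NumPy: when the subarrays have unequal shapes, building an array from them yields an object-dtype ndarray of ndarray pointers rather than a well-typed array, which is exactly what Ray's `TensorArray` rejects as a "ragged tensor". A minimal sketch (the shapes are illustrative, not Ludwig's actual data):

```python
import numpy as np

# Ragged subarrays: same element dtype (uint8) but different shapes, matching
# the "First 5 subndarray types: [dtype('uint8'), ...]" hint in the error.
values = [np.zeros((2,), dtype=np.uint8), np.zeros((3,), dtype=np.uint8)]

# Mirrors the traceback line np.array([np.asarray(v) for v in values]);
# dtype=object is passed explicitly here because newer NumPy versions refuse
# to build ragged arrays implicitly. The result is an array of ndarray
# pointers, not a well-typed ndarray.
arr = np.array([np.asarray(v) for v in values], dtype=object)
print(arr.dtype)  # object
```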

https://github.com/ludwig-ai/ludwig/runs/8177672264?check_suite_focus=true#step:10:7153

For some reason this does not reproduce locally, so it could be an issue with different versions of pyarrow or another dependency.

@tgaddair tgaddair added the tests Issue with the tests label Sep 5, 2022
arnavgarg1 (Contributor)

This seems to be okay on CI now; I'll investigate if it starts showing up again.

arnavgarg1 (Contributor)

It seems like the issue is back; I'll investigate: https://github.com/ludwig-ai/ludwig/runs/8273624112?check_suite_focus=true

tgaddair (Collaborator, Author) commented Sep 9, 2022

@arnavgarg1 the CI was okay because the test is being skipped on nightly. We still need to fix it.

It's curious that it's now suddenly showing up for Ray 2.0 tests as well.

@arnavgarg1 arnavgarg1 linked a pull request Sep 14, 2022 that will close this issue
@tgaddair tgaddair reopened this Sep 25, 2022
tgaddair (Collaborator, Author)

@arnavgarg1 re-opening this issue to track.

arnavgarg1 (Contributor)

@tgaddair Thanks for re-opening the issue. It seems like this happens because of the non-determinism of "nan_percentage", particularly when the last row in a partition is NaN, since the missing-value strategy is bfill. The result is that every NaN gets filled except the one in the last row, which leads to the creation of ragged tensors.

I'll create a fix that ensures the last row of our random sample isn't NaN so this situation is avoided. It might be worth calling this out in our documentation as well, since it can cause other errors downstream beyond our tests.
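The failure mode described above can be seen with plain pandas (a minimal sketch with illustrative values, not Ludwig's actual sampling):

```python
import numpy as np
import pandas as pd

# Backfill (bfill) copies the next valid value upward, so a NaN in the
# last row has nothing to copy from and survives the fill.
s = pd.Series([1.0, np.nan, 3.0, np.nan])
filled = s.bfill()
print(filled.tolist())  # [1.0, 3.0, 3.0, nan] -- the trailing NaN remains
```

Per-partition, that surviving NaN row is what ends up producing subarrays of mismatched shape and hence the ragged-tensor error.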

arnavgarg1 (Contributor)

Actually, a better approach might be a bfill followed by an ffill (or an ffill followed by a bfill) to ensure there are never any remaining NaNs.
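A minimal sketch of that idea with plain pandas (illustrative values; it assumes the column has at least one non-NaN entry, otherwise no fill strategy can help):

```python
import numpy as np
import pandas as pd

s = pd.Series([np.nan, 2.0, np.nan, np.nan])

# bfill alone leaves trailing NaNs (nothing after them to copy from);
# chaining ffill afterwards fills those from the last valid value,
# so no NaNs remain.
filled = s.bfill().ffill()
print(filled.tolist())  # [2.0, 2.0, 2.0, 2.0]
```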
