test_bad_disk
distributed/shuffle/tests/test_shuffle.py::test_bad_disk has started failing on main with the traceback below. See this CI run for an example.
________________________________ test_bad_disk _________________________________
1 thread(s) were leaked from test
------ Call stack of leaked thread 1/1: <Thread(ThreadPoolExecutor-69_0, started 140170015799040)> ------
  File "/usr/share/miniconda3/envs/dask-distributed/lib/python3.9/threading.py", line 937, in _bootstrap
    self._bootstrap_inner()
  File "/usr/share/miniconda3/envs/dask-distributed/lib/python3.9/threading.py", line 980, in _bootstrap_inner
    self.run()
  File "/usr/share/miniconda3/envs/dask-distributed/lib/python3.9/threading.py", line 917, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/share/miniconda3/envs/dask-distributed/lib/python3.9/concurrent/futures/thread.py", line 81, in _worker
    work_item = work_queue.get(block=True)
----------------------------- Captured stderr call -----------------------------
2022-10-27 14:34:53,950 - distributed.worker - WARNING - Compute Failed
Key: ('shuffle-p2p-4651c82ee05b6682b3f73b2b18ad74e5', 7)
Function: shuffle_unpack
args: ('3110a8a90a5b642409b0a20f83b03722', 7, None)
kwargs: {}
Exception: "FileNotFoundError(2, 'No such file or directory')"
2022-10-27 14:34:53,951 - distributed.worker - WARNING - Compute Failed
Key: ('shuffle-p2p-4651c82ee05b6682b3f73b2b18ad74e5', 1)
Function: shuffle_unpack
args: ('3110a8a90a5b642409b0a20f83b03722', 1, None)
kwargs: {}
Exception: "FileNotFoundError(2, 'No such file or directory')"
2022-10-27 14:34:53,955 - distributed.worker - WARNING - Compute Failed
Key: ('shuffle-p2p-4651c82ee05b6682b3f73b2b18ad74e5', 0)
Function: shuffle_unpack
args: ('3110a8a90a5b642409b0a20f83b03722', 0, None)
kwargs: {}
Exception: "FileNotFoundError(2, 'No such file or directory')"
2022-10-27 14:34:53,987 - distributed.worker - WARNING - Compute Failed
Key: ('shuffle-p2p-4651c82ee05b6682b3f73b2b18ad74e5', 2)
Function: shuffle_unpack
args: ('3110a8a90a5b642409b0a20f83b03722', 2, None)
kwargs: {}
Exception: "FileNotFoundError(2, 'No such file or directory')"
2022-10-27 14:34:53,987 - distributed.worker - WARNING - Compute Failed
Key: ('shuffle-p2p-4651c82ee05b6682b3f73b2b18ad74e5', 5)
Function: shuffle_unpack
args: ('3110a8a90a5b642409b0a20f83b03722', 5, None)
kwargs: {}
Exception: "FileNotFoundError(2, 'No such file or directory')"
2022-10-27 14:34:53,987 - distributed.worker - WARNING - Compute Failed
Key: ('shuffle-p2p-4651c82ee05b6682b3f73b2b18ad74e5', 3)
Function: shuffle_unpack
args: ('3110a8a90a5b642409b0a20f83b03722', 3, None)
kwargs: {}
Exception: "FileNotFoundError(2, 'No such file or directory')"
2022-10-27 14:34:53,990 - distributed.worker - ERROR - Exception during execution of task ('shuffle-p2p-4651c82ee05b6682b3f73b2b18ad74e5', 4).
Traceback (most recent call last):
  File "/home/runner/work/distributed/distributed/distributed/worker.py", line 2341, in _prepare_args_for_execution
    data[k] = self.data[k]
  File "/usr/share/miniconda3/envs/dask-distributed/lib/python3.9/site-packages/zict/buffer.py", line 108, in __getitem__
    raise KeyError(key)
KeyError: 'shuffle-barrier-3110a8a90a5b642409b0a20f83b03722'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/home/runner/work/distributed/distributed/distributed/worker.py", line 2239, in execute
    args2, kwargs2 = self._prepare_args_for_execution(ts, args, kwargs)
  File "/home/runner/work/distributed/distributed/distributed/worker.py", line 2345, in _prepare_args_for_execution
    data[k] = Actor(type(self.state.actors[k]), self.address, k, self)
KeyError: 'shuffle-barrier-3110a8a90a5b642409b0a20f83b03722'
2022-10-27 14:34:53,996 - distributed.diskutils - ERROR - Failed to remove '/tmp/dask-worker-space/worker-dt9g6wgx' (failed in <built-in function lstat>): [Errno 2] No such file or directory: '/tmp/dask-worker-space/worker-dt9g6wgx'
2022-10-27 14:34:53,996 - distributed.diskutils - ERROR - Failed to remove '/tmp/dask-worker-space/worker-ihvksve8' (failed in <built-in function lstat>): [Errno 2] No such file or directory: '/tmp/dask-worker-space/worker-ihvksve8'
------------------------------ Captured log call -------------------------------
ERROR asyncio:base_events.py:1753 Task exception was never retrieved
future: <Task finished name='Task-65302' coro=<Shuffle.receive() done, defined at /home/runner/work/distributed/distributed/distributed/shuffle/_shuffle_extension.py:142> exception=FileNotFoundError(2, 'No such file or directory')>
Traceback (most recent call last):
  File "/home/runner/work/distributed/distributed/distributed/shuffle/_shuffle_extension.py", line 148, in receive
    raise self._exception
  File "/home/runner/work/distributed/distributed/distributed/shuffle/_shuffle_extension.py", line 172, in receive
    await self.multi_file.put(groups)
  File "/home/runner/work/distributed/distributed/distributed/shuffle/_multi_file.py", line 124, in put
    raise self._exception
  File "/home/runner/work/distributed/distributed/distributed/shuffle/_multi_file.py", line 202, in process
    with open(
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/dask-worker-space/worker-ihvksve8/shuffle-3110a8a90a5b642409b0a20f83b03722/1'
- generated xml file: /home/runner/work/distributed/distributed/reports/pytest.xml -
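For anyone reading the log: the final `FileNotFoundError` is the error `open()` raises when the parent directory of the target path no longer exists, which matches the `diskutils` messages saying the worker directories were already gone. Below is a minimal sketch of that failure mode in plain Python, independent of `distributed`. The function name, the directory layout, and the append mode are assumptions for illustration only, not the code in `_multi_file.py`.

```python
import errno
import shutil
import tempfile
from pathlib import Path


def append_after_directory_removed() -> int:
    """Return the errno raised when appending to a file whose directory is gone."""
    # Hypothetical stand-in for a worker's local directory and a shuffle shard path.
    worker_dir = Path(tempfile.mkdtemp(prefix="worker-"))
    shard = worker_dir / "shuffle-example" / "1"
    shard.parent.mkdir()

    # Simulate the disk "going bad": the whole directory tree disappears.
    shutil.rmtree(worker_dir)

    try:
        # Append mode would create a missing file, but not a missing directory.
        with open(shard, mode="ab") as f:
            f.write(b"payload")
    except FileNotFoundError as exc:
        return exc.errno
    raise AssertionError("expected FileNotFoundError")


if __name__ == "__main__":
    print(append_after_directory_removed() == errno.ENOENT)
```

This only shows where the `[Errno 2]` comes from. It says nothing about why the error is no longer handled the way the test expects.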
cc @fjetter as I know you've made some shuffle-related changes recently (not sure if they're related though)
Thank you. I am aware and am on it