Got the following log from running a job on AWS Batch:
2020-06-25 11:48:12.924 Bootstrapping conda environment...(this could take a few minutes)
2020-06-25 11:48:15.650 Workflow starting (run-id 1593085694938901):
2020-06-25 11:48:15.765 [1593085694938901/start/1 (pid 13039)] Task is starting.
2020-06-25 11:48:16.772 [1593085694938901/start/1 (pid 13039)] [1a261446-4a1b-46b6-a632-129a0db30abe] Task is starting (status SUBMITTED)...
2020-06-25 11:48:20.970 [1593085694938901/start/1 (pid 13039)] [1a261446-4a1b-46b6-a632-129a0db30abe] Task is starting (status RUNNABLE)...
2020-06-25 11:48:51.102 [1593085694938901/start/1 (pid 13039)] [1a261446-4a1b-46b6-a632-129a0db30abe] Task is starting (status RUNNABLE)...
2020-06-25 11:49:21.238 [1593085694938901/start/1 (pid 13039)] [1a261446-4a1b-46b6-a632-129a0db30abe] Task is starting (status RUNNABLE)...
2020-06-25 11:49:25.868 [1593085694938901/start/1 (pid 13039)] [1a261446-4a1b-46b6-a632-129a0db30abe] Task is starting (status STARTING)...
2020-06-25 11:49:52.043 [1593085694938901/start/1 (pid 13039)] [1a261446-4a1b-46b6-a632-129a0db30abe] Task is starting (status RUNNING)...
2020-06-25 11:49:58.404 [1593085694938901/start/1 (pid 13039)] [1a261446-4a1b-46b6-a632-129a0db30abe] Setting up task environment.
2020-06-25 11:50:11.631 [1593085694938901/start/1 (pid 13039)] [1a261446-4a1b-46b6-a632-129a0db30abe] Downloading code package.
2020-06-25 11:50:11.631 [1593085694938901/start/1 (pid 13039)] [1a261446-4a1b-46b6-a632-129a0db30abe] fatal error: An error occurred (403) when calling the HeadObject operation: Forbidden
2020-06-25 11:50:21.414 [1593085694938901/start/1 (pid 13039)] [1a261446-4a1b-46b6-a632-129a0db30abe] Downloading code package.
2020-06-25 11:50:21.414 [1593085694938901/start/1 (pid 13039)] [1a261446-4a1b-46b6-a632-129a0db30abe] fatal error: An error occurred (403) when calling the HeadObject operation: Forbidden
2020-06-25 11:50:31.419 [1593085694938901/start/1 (pid 13039)] [1a261446-4a1b-46b6-a632-129a0db30abe] Downloading code package.
2020-06-25 11:50:35.731 [1593085694938901/start/1 (pid 13039)] [1a261446-4a1b-46b6-a632-129a0db30abe] fatal error: An error occurred (403) when calling the HeadObject operation: Forbidden
2020-06-25 11:50:46.324 [1593085694938901/start/1 (pid 13039)] [1a261446-4a1b-46b6-a632-129a0db30abe] Downloading code package.
2020-06-25 11:50:46.324 [1593085694938901/start/1 (pid 13039)] [1a261446-4a1b-46b6-a632-129a0db30abe] fatal error: An error occurred (403) when calling the HeadObject operation: Forbidden
2020-06-25 11:50:56.977 [1593085694938901/start/1 (pid 13039)] [1a261446-4a1b-46b6-a632-129a0db30abe] Downloading code package.
2020-06-25 11:50:56.978 [1593085694938901/start/1 (pid 13039)] [1a261446-4a1b-46b6-a632-129a0db30abe] fatal error: An error occurred (403) when calling the HeadObject operation: Forbidden
2020-06-25 11:51:06.525 [1593085694938901/start/1 (pid 13039)] [1a261446-4a1b-46b6-a632-129a0db30abe] Downloading code package.
2020-06-25 11:51:06.525 [1593085694938901/start/1 (pid 13039)] [1a261446-4a1b-46b6-a632-129a0db30abe] fatal error: An error occurred (403) when calling the HeadObject operation: Forbidden
2020-06-25 11:51:12.959 [1593085694938901/start/1 (pid 13039)] [1a261446-4a1b-46b6-a632-129a0db30abe] tar: job.tar: Cannot open: No such file or directory
2020-06-25 11:51:12.959 [1593085694938901/start/1 (pid 13039)] [1a261446-4a1b-46b6-a632-129a0db30abe] tar: Error is not recoverable: exiting now
2020-06-25 11:51:15.041 [1593085694938901/start/1 (pid 13039)] Batch error:
2020-06-25 11:51:15.042 [1593085694938901/start/1 (pid 13039)] Essential container in task exited This could be a transient error. Use @retry to retry.
Metaflow should specify which operation got the 403 (and, if it's S3-related, which bucket and object), so that the error is easily actionable.
What I did as a workaround was to search the GitHub repo for "Downloading code package", which led me to:
metaflow/metaflow/environment.py (line 88 in 592515e)
metaflow/metaflow/plugins/aws/batch/batch.py (line 137 in 592515e)
Needless to say, I shouldn't have to dig through the source code to understand what the problem is; the error message should be clearer.
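To make it concrete, here is a rough sketch of what I mean (the function name and exact shell are made up, not Metaflow's actual code; I'm assuming the bootstrap uses something like `aws s3 cp`, which the HeadObject error in the log suggests). The idea is that the generated download loop would echo the full S3 URL plus a permissions hint, instead of only awscli's bare 403:

```python
# Hypothetical sketch, not Metaflow's real implementation: build the shell
# snippet that downloads the code package, naming the exact S3 URL in its
# output so a 403 immediately points at the bucket/object and the job
# role's permissions.
def download_code_package_cmd(code_package_url, retries=5):
    return (
        "for i in $(seq 1 {retries}); do "
        "echo 'Downloading code package from {url}'; "
        "aws s3 cp '{url}' job.tar >/dev/null && break; "
        "echo 'Attempt '$i': could not read {url} -- check the Batch job role permissions' >&2; "
        "sleep 10; "
        "done"
    ).format(retries=retries, url=code_package_url)
```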
BTW, this line:
2020-06-25 11:51:12.959 [1593085694938901/start/1 (pid 13039)] [1a261446-4a1b-46b6-a632-129a0db30abe] tar: job.tar: Cannot open: No such file or directory
seems to suggest that the code still tries to extract the package even after the download failed. I think it should abort instead.
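A similar sketch for the extraction step (again, the helper name is illustrative, not Metaflow's actual API): only run tar if the package is actually there, and fail the task with a clear message otherwise, rather than letting tar produce the confusing "Cannot open" error above:

```python
# Hypothetical sketch: abort the bootstrap as soon as the package is missing
# instead of falling through to tar. The helper name is made up.
def extract_code_package_cmd(package_file="job.tar"):
    return (
        "if [ -f {pkg} ]; then "
        "tar xf {pkg}; "
        "else "
        "echo 'Code package {pkg} was never downloaded, aborting task' >&2; "
        "exit 1; "
        "fi"
    ).format(pkg=package_file)
```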