Sagemaker training smdebug/core/state_store.py FileNotFoundError #2370
Replies: 2 comments
-
Hello, I encountered a similar issue using the TensorFlow estimator. I used a pre-built container image with framework version = 2.2 and py_version = py37. Training ran for 3 epochs and threw the error in the 4th epoch.
-
Hi
I1121 13:42:35.783539 140148447449344 basic_session_run_hooks.py:260] loss = 0.65470797 (0.405 sec)
-
Hello,
I am trying to train a model using the MXNet estimator. As soon as training starts, I see the following error when SageMaker tries to upload checkpoints:
A few details about the job: