It appears the destination for the training is being lost after the request is sent to the Replicate server, so I suspect this is what causes the failure to create the trained image.
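One way to test this suspicion is to compare the destination in the outgoing request body with whatever the server returns. The following is only a rough sketch: the helper name `destination_mismatch`, the `"destination"` key in both dicts, and the sample payloads are assumptions made for illustration, not the replicate-python API or a documented response shape.

```python
from typing import Any, Mapping, Optional


def destination_mismatch(
    request_body: Mapping[str, Any],
    response_body: Mapping[str, Any],
) -> Optional[str]:
    """Return a description of the mismatch, or None if the destination survived.

    Both arguments are plain dicts. The "destination" key is an assumption
    about the payload shape, not a documented guarantee.
    """
    sent = request_body.get("destination")
    echoed = response_body.get("destination")
    if not sent:
        return "request body has no destination"
    if echoed != sent:
        return f"sent {sent!r} but server returned {echoed!r}"
    return None


# Hypothetical payloads for illustration only; not real trace output.
request = {"destination": "my-user/my-model", "input": {}}
response = {"id": "abc123", "status": "starting"}
print(destination_mismatch(request, response))
```

If the helper reports a mismatch for a real request and response pair, that would support the idea that the destination is dropped somewhere between the client and the server.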
I am using a local clone of the replicate-python library for debugging.
In "training.py" I have a trace to see what is sent to the server as the training request body. Output is:
I'm fine-tuning sakemin/musicgen-fine-tuner:bc57274e.
This is the same error as #308, which was closed without resolution.
Error message after otherwise successful execution: "Training failed. Failed to create trained image after successful training run"
Last line of logs is "Executor: All workers completed successfully"
This has completed successfully many times in the past. It may be related to later versions of sakemin/musicgen-fine-tuner.
Would appreciate any suggestions or assistance.
Thanks!