How to speed up development workflow with local SageMaker? #4797
Unanswered
richardkmichael asked this question in Help
I'm working with a large model framework (NVIDIA NeMo) and a relatively complex inference script. There is a lot of experimentation, so I thought it would be faster to use SageMaker in "local" mode, and I have it working.

However:

1. The model data bundle has a `requirements.txt` which installs ~1 GB of dependencies (NeMo and its deps), downloaded each time. SageMaker's `docker compose` rebuilds the container every time I re-run my local `model.deploy(instance_type='local', ...)`, which takes at least 5 minutes (installing dependencies). So it is a very slow development loop: change code, re-deploy, find a bug, and so on.
2. Since the inference code is not "live" within the running torch server, I need to save my changes, recreate the model tar.gz bundle, and re-deploy for any change to the inference code. This is also painful, even if the Docker container were re-used.

Any suggestions to speed up my local workflow?
I've considered --

Re 1: Is it possible to specify a custom image with `instance_type='local'`? The `model.deploy()` function doesn't seem to accept an image name argument. But if so, I could build a custom Docker image with the framework already installed, so that pip would quickly report `Requirement already satisfied` when it processes my `requirements.txt` file.

Re 2: I could automate the model bundle rebuild with a watcher on the inference code.
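The Re 2 watcher can be sketched with the standard library alone (the `code/` arcname, the one-second poll, and the file names are my assumptions about the bundle layout; this only rebuilds the tar.gz -- re-running `deploy()` afterwards is still manual):

```python
import tarfile
import time
from pathlib import Path


def build_bundle(code_dir: str, out_path: str) -> None:
    """Re-create model.tar.gz with the code/ directory layout."""
    with tarfile.open(out_path, "w:gz") as tar:
        tar.add(code_dir, arcname="code")


def latest_mtime(code_dir: str) -> float:
    """Most recent modification time of any file under code_dir."""
    return max(
        (p.stat().st_mtime for p in Path(code_dir).rglob("*") if p.is_file()),
        default=0.0,
    )


def watch(code_dir: str, out_path: str, poll_s: float = 1.0) -> None:
    """Poll for changes and rebuild the bundle whenever a file is touched."""
    last = 0.0
    while True:
        mtime = latest_mtime(code_dir)
        if mtime > last:
            build_bundle(code_dir, out_path)
            last = mtime
            print(f"rebuilt {out_path}")
        time.sleep(poll_s)
```

A filesystem-event library would avoid the polling, but polling keeps the script dependency-free.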
Thanks!
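Edit, regarding Re 1: it looks like the image can be set on the model object itself rather than in `deploy()` -- in SDK v2 the framework `Model` classes (e.g. `PyTorchModel`) accept an `image_uri` constructor argument. I'm assuming local mode honors it and runs that image instead of the stock framework container; the `nemo-inference:dev` tag, entry-point name, and role ARN below are placeholders:

```python
# Assumption: SageMaker Python SDK v2, where PyTorchModel (and the other
# framework Model classes) accept `image_uri`; local mode would then run
# that image instead of pulling the stock framework container.
CUSTOM_IMAGE = "nemo-inference:dev"  # hypothetical image with NeMo baked in


def local_model_kwargs(image_uri: str = CUSTOM_IMAGE) -> dict:
    """Constructor kwargs for a framework Model pointed at a prebuilt image.

    With the ~1 GB of dependencies already in the image, pip should only
    report "Requirement already satisfied" for requirements.txt at startup.
    """
    return dict(
        model_data="file://model.tar.gz",  # local bundle, no S3 round-trip
        entry_point="inference.py",        # hypothetical script name
        image_uri=image_uri,
        role="arn:aws:iam::111122223333:role/dummy",  # placeholder ARN
    )


# Usage (not executed here; needs the `sagemaker` package and Docker):
#   from sagemaker.pytorch import PyTorchModel
#   model = PyTorchModel(**local_model_kwargs())
#   predictor = model.deploy(initial_instance_count=1, instance_type="local")
```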