Hi @UmarRamzan. I hear you — large models can take a while to set up from a cold boot. We do what we can to optimize network storage and caches, but at a certain point you're limited by the physical constraints of transferring and loading on the order of 100 GB of weights into GPU VRAM.
If your application is sensitive to long cold starts, you can try creating a deployment and configuring a certain number of instances to stay running at all times.
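A minimal sketch of that setup, assuming a Modal SDK version that supports the `keep_warm` argument on `@app.function` (the app name, GPU type, and function body here are placeholders, not from this thread):

```python
import modal

app = modal.App("whisper-transcriber")  # hypothetical app name

# keep_warm=2 asks Modal to keep two containers running at all times,
# so incoming requests skip the cold boot entirely. Note you are billed
# for the idle containers, which is the cost trade-off discussed above.
@app.function(gpu="A10G", keep_warm=2)
def transcribe(audio: bytes) -> str:
    ...  # load the model once per container and reuse it across calls
```

This is deployment configuration rather than application logic; the warm-pool size is the knob to tune against your budget.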
Hi, creating a deployment is not financially feasible for us. We're currently using Whisper Large v-3, which is around 8 GB and takes a minute or two to load. I had this in mind when originally asking the question: https://modal.com/docs/guide/checkpointing
Is there any way to store large models in some kind of network storage to avoid long cold boot times?
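Caching weights in persistent storage avoids re-downloading them on each cold boot, though the weights still have to be read into GPU memory. The download-once pattern itself, independent of any particular storage backend, can be sketched like this (the `ensure_weights` helper and its arguments are illustrative, not a Modal API — the cache directory would be a mounted network volume in practice):

```python
from pathlib import Path
from typing import Callable


def ensure_weights(cache_dir: str, name: str, download: Callable[[], bytes]) -> Path:
    """Return the cached weights path, downloading only on first use.

    On a cold boot with a warm cache, this skips the download entirely;
    only the (unavoidable) read into memory remains.
    """
    path = Path(cache_dir) / name
    if not path.exists():
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(download())  # one-time fetch, e.g. from a model hub
    return path
```

Subsequent containers mounting the same volume find the file already present and start serving faster.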