Replies: 9 comments 1 reply
-
Another trial today. A cold start:

```
Connected to 66.241.125.29:443 from 192.168.1.5:53615

HTTP/2 502
server: Fly/f9c163a6 (2024-01-16)
via: 2 fly.io
fly-request-id: 01HMKTKPQBGN3XRXEY95MKF2W5-bog
date: Sat, 20 Jan 2024 16:19:17 GMT

Body stored in: /var/folders/mz/91hbds1j23125yksdf67dcgm0000gn/T/tmp893dwse9

  DNS Lookup   TCP Connection   TLS Handshake   Server Processing   Content Transfer
[    45ms    |      25ms      |     203ms     |      98992ms      |       0ms      ]
             |                |               |                   |
    namelookup:45ms           |               |                   |
                        connect:70ms          |                   |
                                    pretransfer:273ms             |
                                                      starttransfer:99265ms
                                                                          total:99265ms
```

Seems that nothing is persisted on disk and that you have to download everything?

The next run is a "warm" start:

```
  DNS Lookup   TCP Connection   TLS Handshake   Server Processing   Content Transfer
[     1ms    |      27ms      |      29ms     |       485ms       |       1ms      ]
             |                |               |                   |
    namelookup:1ms            |               |                   |
                        connect:28ms          |                   |
                                     pretransfer:57ms             |
                                                      starttransfer:542ms
                                                                          total:543ms
```
-
@LuchoTurtle You load the image-captioning model in `Application.ex`, so I also need to load the Whisper model there. I tried to load the models in parallel, but for some reason this doesn't give any speedup.

```elixir
# Application.ex
@models_folder_path Application.compile_env!(:app, :models_cache_dir)

@captioning_prod_model %ModelInfo{
  name: "Salesforce/blip-image-captioning-base",
  cache_path: Path.join(@models_folder_path, "blip-image-captioning-base"),
  load_featurizer: true,
  load_tokenizer: true,
  load_generation_config: true
}

@whisper_model %ModelInfo{
  name: "openai/whisper-small",
  cache_path: Path.join(@models_folder_path, "whisper-small"),
  load_featurizer: true,
  load_tokenizer: true,
  load_generation_config: true
}

def start(_type, _args) do
  [
    @whisper_model,
    @captioning_prod_model,
    @captioning_test_model
  ]
  |> Enum.each(&App.Models.verify_and_download_models/1)

  # this "async download" isn't faster ???
  # |> Task.async_stream(&App.Models.verify_and_download_models/1, timeout: :infinity)
  # |> Enum.to_list()

  [...]
```
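For what it's worth, here is a minimal sketch of how the parallel variant would be written (same `App.Models.verify_and_download_models/1` from this repo; `max_concurrency: 3` is just an illustrative choice). Note that when all three models are cache misses, the downloads share one network link, so running them concurrently mostly overlaps latency rather than bandwidth, which would explain why it isn't noticeably faster:

```elixir
# Sketch: verify/download the three models concurrently instead of one by one.
# Task.async_stream returns a lazy stream, so it must be forced (Stream.run/1)
# for the downloads to actually happen before the supervision tree starts.
[
  @whisper_model,
  @captioning_prod_model,
  @captioning_test_model
]
|> Task.async_stream(&App.Models.verify_and_download_models/1,
  timeout: :infinity,
  max_concurrency: 3
)
|> Stream.run()
```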
-
I've documented everything regarding deployment to fly.io in https://github.com/dwyl/image-classifier/blob/main/deployment.md. I'm indeed using a volume to store the models and, the last time I deployed, everything seemed to be working: the models were downloaded after deploying, on first use, and then reused in subsequent runs. In fact, because of the way the models are being served with [...]

As you know, I've gone through this situation of persisting models quite a few times: first by changing the [...]

I can see your activity on the logs. Here's the volume being mounted:

[...]

As you know, when a model is downloaded, a message like [...] is logged.

Unless the volumes are actively being pruned when downscaling due to inactivity, I don't understand this behaviour :( Thank you for sharing.
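For readers landing here, the pattern under discussion is "check the volume first, download only on a miss". A minimal sketch of that idea, assuming Bumblebee and the `%ModelInfo{}` struct from the snippet above (an illustration, not the repo's exact code):

```elixir
# Illustrative sketch of "download only if not cached", assuming Bumblebee.
# `cache_path` points at a directory on the mounted Fly volume.
def verify_and_download_models(%ModelInfo{} = model_info) do
  if File.dir?(model_info.cache_path) and File.ls!(model_info.cache_path) != [] do
    # Cache hit: load from the volume without touching the network.
    Bumblebee.load_model(
      {:hf, model_info.name, cache_dir: model_info.cache_path, offline: true}
    )
  else
    # Cache miss: download into the volume so the next boot is a warm start.
    Bumblebee.load_model({:hf, model_info.name, cache_dir: model_info.cache_path})
  end
end
```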
-
Yes indeed, it seems that volumes are pruned when the machine is killed. Maybe we could save these 3 models into a Postgres blob field (a large object)? The DB is persisted, and a db_query/copy_if_not_exists should be a faster option? I may try this.
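To make that concrete, a rough sketch of what a `copy_if_not_exists` could look like. Everything here is hypothetical: an `App.Repo` Ecto repo and a `model_blobs` table (columns `name`, `data`) holding a gzipped tar of each model's cache directory; none of this exists in the repo today:

```elixir
# Hypothetical sketch: restore a model's cache dir from a Postgres blob
# when the (pruned) volume no longer has it. `App.Repo` and the
# `model_blobs` table are assumptions, not part of this repo.
def copy_if_not_exists(model_info) do
  unless File.dir?(model_info.cache_path) do
    %{rows: [[tar_gz]]} =
      Ecto.Adapters.SQL.query!(
        App.Repo,
        "SELECT data FROM model_blobs WHERE name = $1",
        [model_info.name]
      )

    File.mkdir_p!(model_info.cache_path)

    # Unpack the archived cache directory where Bumblebee expects it.
    :ok =
      :erl_tar.extract(
        {:binary, tar_gz},
        [:compressed, {:cwd, to_charlist(model_info.cache_path)}]
      )
  end
end
```

One caveat: Postgres caps a single `bytea` value at 1 GB, so ~1 GB models sit right at the limit; chunking the archive across rows (or using large objects, as suggested) avoids that.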
-
That seems like a plausible option (and, to be quite frank, probably the only option we have, given that we want the machines to scale down with inactivity). It sucks that we have to resort to a "hacky way" to get it to work :(

But, as much as I'd love to do that, I don't think it's pertinent (at least to my/this repo's scenario). Volumes shouldn't be pruned when downscaled :( The strategy that is documented should work fine in most cases, so I don't really feel the need to save models in a relational database; it just seems counter-intuitive and may lead beginners to think it's OK when it's not really suitable for this case.

Although I appreciate your feedback (I really, really, really do), you can try it for yourself if you want. But I don't see myself hacking my way around saving models into a database, with all the headaches that may come along with it. I'm really excited to actually get the audio-transcription PRs you've implemented and then work from there :D
-
I am curious, but you are wise, so this project does not need this.
-
@LuchoTurtle |
-
Unfortunately, I can't [...]. Since the models are usually 1GB, I can assume the volume is being cleaned up :/
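One generic way to check this from a remote IEx session on the machine would be to list what the cache directory actually holds. A sketch using only the standard library, with the `:models_cache_dir` key taken from the config shown earlier:

```elixir
# List what the models cache dir on the volume actually holds.
# If File.ls! raises :enoent, the directory (and the cached models) is gone.
# Note: File.stat! on a directory gives the entry's size, not a recursive total.
cache_dir = Application.fetch_env!(:app, :models_cache_dir)

cache_dir
|> File.ls!()
|> Enum.map(fn entry ->
  path = Path.join(cache_dir, entry)
  {entry, File.stat!(path).size}
end)
```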
-
I read about volume forks. Could a forked volume be permanent??
I understand Dwyl is a "real" customer, aren't you? Any chance of using a fork as a backup? Fly may be more responsive with "real" customers? 🤔
-
I used httpstat to get some stats on a cold start vs a warm start, to get an idea of the state of the current app (only the Image-To-Text models are loaded).
The first run: [...]
The next run is a "warm" start: [...]
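For a quick sanity check without httpstat, the total request time can also be measured from IEx. A minimal sketch, assuming the Req HTTP client is available and with the URL as a placeholder; this only gives the end-to-end total, not httpstat's per-phase breakdown:

```elixir
# Time one GET end-to-end; :timer.tc returns elapsed microseconds.
url = "https://your-app.fly.dev"  # placeholder URL

{micros, response} = :timer.tc(fn -> Req.get!(url) end)
IO.puts("status=#{response.status} total=#{div(micros, 1000)}ms")
```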