Model loading is very memory hungry #9
Comments
I've been unhappy with that for a long time. There are mainly two reasons why I haven't changed the model format so far: …
Of course, dealing with 2 is just a matter of making the tagger recognize the format and handle the model file appropriately. It's just that it hasn't been a top priority for me.
Sure. It has only become an issue for me because I'm working with a project that wants to expose a web service based on your tagger on a platform that uses Kubernetes. I need to apply memory limits to the pod definitions, but for this service I have to make the pod request 4GB even though it only needs 1.7GB after the startup phase. For this particular use case I've developed a workaround where I transform the model into a gzipped pickle file, which is quite a bit larger than the original gzipped JSON but loads faster and with virtually no additional memory overhead. However, it occurred to me today that it's actually possible to implement a more efficient streaming load of the current model format using …
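A minimal sketch of that kind of one-off conversion, assuming the model is a single gzipped JSON document; the file names are invented, and the actual workaround may pickle the fully initialised tagger state rather than the raw parsed JSON:

```python
import gzip
import json
import pickle

json_model = "italian.model"              # gzipped JSON model as shipped (invented name)
pickle_model = "italian.model.pickle.gz"  # converted model (invented name)

# Pay the JSON parsing cost once, at conversion time ...
with gzip.open(json_model, "rt", encoding="utf-8") as f:
    model = json.load(f)

# ... and write the parsed structure back out as a gzipped pickle, which can
# later be loaded in one step with pickle.load(), without the intermediate
# copies that JSON parsing produces.
with gzip.open(pickle_model, "wb") as f:
    pickle.dump(model, f, protocol=pickle.HIGHEST_PROTOCOL)
```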
Ah, the …
PR submitted - I've made it use the optimised algorithm on CPython 3.6+ or (any) Python 3.7+, which are the ones where dict iteration order is guaranteed, and fall back to the original algorithm on earlier versions.
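The version gate described above could look roughly like this; the function and loader names are invented, and the PR's actual check may differ:

```python
import platform
import sys

def dict_order_is_guaranteed() -> bool:
    """True if dicts preserve insertion order on this interpreter.

    Insertion order is part of the language specification from Python 3.7
    onwards and an implementation detail of CPython 3.6.
    """
    if sys.version_info >= (3, 7):
        return True
    return (platform.python_implementation() == "CPython"
            and sys.version_info >= (3, 6))

# load_optimised / load_original stand in for the two code paths:
# loader = load_optimised if dict_order_is_guaranteed() else load_original
```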
Thank you! I've updated the README and created a new release.
Taking the spoken Italian model as an example, the process of loading the model into memory (`ASPTagger.load`) causes memory usage of the Python process to briefly rise to nearly 4GB. Once the model is loaded, memory usage drops to a more reasonable 1.7GB and remains there in the steady state.

The format used to store models on disk is gzip-compressed JSON, with the weight numbers stored as base85-encoded strings. This format is rather inefficient to load, since we must, among other things, copy the `vocabulary` list to turn it into a set.

If the feature name/weight pairs were instead serialized together (either as a `{"feature": "base85-weight", ...}` object or as a transposed list-of-2-element-lists), then it would be possible to parse the model file in a single pass in a streaming fashion, eliminating the need to make multiple copies of potentially very large arrays in memory.
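For illustration, one way the proposed `{"feature": "base85-weight", ...}` layout could be consumed in a single pass is with an incremental JSON parser such as the third-party `ijson` package. This is only a sketch: it assumes the pairs live under a top-level "weights" key and that each weight is a single base85-encoded IEEE-754 double, and it is not necessarily how the change was eventually implemented:

```python
import base64
import gzip
import struct

import ijson  # third-party incremental JSON parser (pip install ijson)

weights = {}
with gzip.open("model.json.gz", "rb") as f:  # invented file name
    # kvitems() yields one (feature, value) pair at a time from the object at
    # the given prefix, so the base85 strings never accumulate in memory.
    for feature, b85 in ijson.kvitems(f, "weights"):
        # Assumption: each weight is one little-endian double encoded with
        # base64.b85encode; the real model might pack a per-class vector instead.
        weights[feature] = struct.unpack("<d", base64.b85decode(b85))[0]
```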