DeflatePerMessage VS memory consumption #1862
Comments
@M1ha-Shvn I still don't have a clear view on how to solve the issue with the …
Also, do you have an MRE for me to confirm the issue?
What I suggest is one of the following: …
Thanks!
I haven't tested this PR; I made it quickly in order to show the general idea.
@aaugustin I'm sorry to bother you, but when you have time, would you mind giving me your input here to understand if we are going in the right direction? 🙏
If you're using context takeover (i.e. you don't set …) … If you aren't using context takeover, then you don't have a memory usage problem.
Apart from that, is there a knob to configure max_window_bits? With a very high number of connections, you would probably benefit from lowering it. See https://websockets.readthedocs.io/en/stable/topics/compression.html#compression-settings for defaults.
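In the websockets library those knobs live on the extension factory. A minimal sketch, assuming the ServerPerMessageDeflateFactory keyword arguments from websockets' permessage_deflate module; the values below are illustrative, not recommendations:

```python
from websockets.extensions.permessage_deflate import ServerPerMessageDeflateFactory

# Illustrative values only: smaller windows and disabled context takeover
# trade compression ratio for lower per-connection memory.
factory = ServerPerMessageDeflateFactory(
    server_no_context_takeover=True,
    client_no_context_takeover=True,
    server_max_window_bits=11,
    compress_settings={"memLevel": 4},
)

# The factory would then be passed to the server through its extensions
# list, e.g. websockets.serve(handler, host, port, extensions=[factory]).
```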
Hi, thanks for your answer.
First of all, I understand that I've changed this behavior. But what I don't understand is: why is a deflate object created per connection? Why can't the deflate context be shared between all connections? From my point of view, a typical websocket server has lots of client connections and sends very similar (particularly JSON) messages to all clients => to all connections. So it would be worth using the same deflate object with a single context and increasing its capacity, so it could compress messages better. Of course, it depends on the connection settings set by request headers, but there is only a small number of combinations of these parameters => a limited number of objects would need to be created and reused.
I'm not so sure about that. Creating an object in Python has its memory cost. Even if I don't use context takeover, PerMessageDeflate objects would still be created and use memory for each connection.
Yes, adding deflate tuning settings to Uvicorn's settings was one of my proposals. Though it would reduce the problem for me, it would not solve it: I'd just have a higher connection limit while still consuming lots of memory per connection.
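As a sketch of that per-parameter-combination idea only (not a tested implementation; it assumes the PerMessageDeflate constructor arguments from websockets' permessage_deflate module, and it deliberately leaves open whether a compression context can safely be shared across connections):

```python
from functools import lru_cache

from websockets.extensions.permessage_deflate import PerMessageDeflate

@lru_cache(maxsize=None)
def shared_deflate_extension(
    remote_no_context_takeover: bool,
    local_no_context_takeover: bool,
    remote_max_window_bits: int,
    local_max_window_bits: int,
) -> PerMessageDeflate:
    # One extension instance per combination of negotiated parameters,
    # instead of one per connection. Whether the underlying zlib contexts
    # can actually be shared safely (especially with context takeover,
    # where compressor state is per-connection) is exactly the open
    # question in this thread.
    return PerMessageDeflate(
        remote_no_context_takeover,
        local_no_context_takeover,
        remote_max_window_bits,
        local_max_window_bits,
    )
```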
Kludex asked for my inputs; I gave them. If you don't trust me, run your own experiments and reach your own conclusions. In your experiments, don't stop at opening connections; exchange a significant number of different messages on each connection, in both directions, and make sure they make it through the compress / decompress cycle correctly.
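For what it's worth, a rough sketch of that kind of experiment with the websockets client, assuming the server under test echoes each message back and listens at a placeholder URL:

```python
import asyncio
import json

import websockets

WS_URL = "ws://localhost:8000/ws"  # placeholder: the server under test

async def exercise_connection(conn_id: int, messages: int = 100) -> None:
    async with websockets.connect(WS_URL) as ws:
        for seq in range(messages):
            payload = {"conn": conn_id, "seq": seq, "data": "x" * 512}
            await ws.send(json.dumps(payload))
            echoed = json.loads(await ws.recv())
            # Verify the message survived the compress / decompress cycle.
            assert echoed == payload, (conn_id, seq)

async def main(connections: int = 1000) -> None:
    await asyncio.gather(*(exercise_connection(i) for i in range(connections)))

if __name__ == "__main__":
    asyncio.run(main())
```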
In case it helps: …
Thanks @aaugustin! I really appreciate it. 🙏
Did you try this, @M1ha-Shvn?
This issue was moved to a discussion.
You can continue the conversation there.
Discussed in #1850
Originally posted by M1ha-Shvn January 27, 2023
Hi.
I'm developing a server serving WebSocket connections using FastAPI.
I've noticed that creating several thousand simultaneous websocket connections leads to high memory usage (4 GB per 5-8k websocket connections in my case). I started debugging it with tracemalloc and found out that the largest amount of this memory is consumed by the websockets deflate extension in this line.
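For reference, a minimal tracemalloc setup along these lines (the frame limit and snapshot point are only illustrative) is enough to surface the top allocation sites:

```python
import tracemalloc

tracemalloc.start(25)  # keep up to 25 frames per allocation traceback

# ... open several thousand WebSocket connections against the server ...

snapshot = tracemalloc.take_snapshot()
# Group allocations by source line and print the biggest consumers.
for stat in snapshot.statistics("lineno")[:10]:
    print(stat)
```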
After that I dug into the websockets deflate mechanics and found out that it can be tuned to achieve lower memory consumption using a custom ServerPerMessageDeflateFactory. I tried searching for it in the FastAPI => Starlette => Uvicorn code and it led me here.
What is the source of the memory leak:
1. A PerMessageDeflate instance is created for each websocket connection (using ServerPerMessageDeflateFactory). From my point of view, this is a disadvantageous behaviour: it would be much better if it were created not for each websocket connection, but for each combination of connection parameters (like the Singleton pattern, but with one instance per combination of parameters; something like lru_cache).
2. Uvicorn's only deflate-related setting is --ws-per-message-deflate. It is not flexible for different cases.