Memoized packing for pushing identical messages to large groups #276
Comments
Hey Matt, thanks for pinging about this! You've spotted a definite inefficiency (just something I hadn't considered), and a good fix (with `memoize`). There's one potential minor difficulty with using `memoize` here though, so in this case I think it might actually make sense to consider adding a new API fn for bulk broadcasting that'll just generate the payload once before looping. Just have my hands quite full atm, but will think about it a little and get back to you. Again, this was a good (and valuable) catch - and a nice clear report: much appreciated! Cheers :-)
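The bulk-broadcast idea boils down to hoisting the pack step out of the loop. A purely conceptual sketch (the `pack` and `send-packed!` fns below are hypothetical stand-ins for Sente's internal serialize and send steps, not its API):

```clojure
;; Conceptual sketch only - not Sente's API. `pack` and `send-packed!`
;; are hypothetical stand-ins for the internal serialize and send steps.
(defn broadcast-packed!
  [pack send-packed! uids event]
  (let [payload (pack event)]       ; serialize the event exactly once
    (doseq [uid uids]
      (send-packed! uid payload)))) ; reuse the same packed payload for every uid
```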
From the referenced commit:

Objectives:
- [#276] Support easy+efficient generic un/pack caching.
- In particular: support cache-friendly server->client event buffering.
- [#16] Support client->server buffering with arb cbs.
- Read+write performance.
- Bandwidth efficiency.
- Flexibility for possible future extensions/changes.
- Human-readable on the wire (handy for debugging, etc.).
- Lay groundwork for possible broadcast API fn.

Downsides:
- Would involve a breaking incompatibility with old clients.
Hey, thanks for the quick reply + commit! I figured this would touch on other concerns, like buffering. Thanks for the informative comments in the commit. A bit off topic, but I'm not sure a separate issue is warranted or if this is covered elsewhere - are there tests that aren't committed into the repo? I see a testing profile in profile.clj, but I don't seem to see anything in the test directory. While experimenting locally, I was wondering how you approached testing for performance w/ many connected clients.
I haven't looked at your changes, but I believe that Sente keeps the transit readers and writers for each client since #161. Is it still safe to memoize this? I think the memoized result would not be valid for any client other than the one its writer is used for, based on how transit replaces data with shorter tokens.
@theasp Thanks for pinging about this Andrew. The readers and writers are indeed cached, but I don't see how additionally caching particular reads or writes would be a problem? Last I checked the implementation, Transit can be stateful for the duration of a particular read/write, but shouldn't automatically (to my knowledge) be stateful between separate reads/writes. For example, it uses token replacement to help deduplicate content within a particular payload. Unless you're specifically requesting something like that though (and providing the necessary data), I don't see how/why it'd try to provide deduplication between payloads. Does that make sense? I don't use Transit myself though, so may be missing something obvious?
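To make the within-payload point concrete, here's a small sketch using cognitect's transit-clj directly (the output shown is approximate):

```clojure
;; Sketch: Transit's key caching deduplicates *within* a single payload.
;; Uses cognitect/transit-clj; output strings shown are approximate.
(require '[cognitect.transit :as transit])
(import '(java.io ByteArrayOutputStream))

(defn write-str [x]
  (let [out (ByteArrayOutputStream. 4096)
        w   (transit/writer out :json)]
    (transit/write w x)
    (.toString out)))

;; The repeated map key is replaced by a short cache code ("^0") the
;; second time it appears within the same payload:
(write-str [{:some-longish-key 1} {:some-longish-key 2}])
;; => "[[\"^ \",\"~:some-longish-key\",1],[\"^ \",\"^0\",2]]" (approx.)

;; Two independent writes of the same value yield the same string, which
;; is what makes caching/memoizing a packed payload safe to reuse:
(= (write-str {:some-longish-key 1})
   (write-str {:some-longish-key 1}))
;; => true
```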
Yes, it looks like you are correct.
@theasp Thanks for the confirmation :-) BTW I'll note that there could (conceivably) be other instances where serialized output is not identical between calls with the same input, but that would still be okay so long as the deserialization doesn't depend on any out-of-band state. If …
Closing in favour of #421
Is there any reason the transit packer doesn't include memoization of payloads?
The recommended way to broadcast a message to a group of subscribers is to loop through a list of users:
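A rough sketch of that pattern, using the conventional `chsk-send!` / `connected-uids` names from Sente's setup docs and assuming the http-kit adapter:

```clojure
;; Sketch of the conventional setup + broadcast loop (assumes http-kit).
(require '[taoensso.sente :as sente]
         '[taoensso.sente.server-adapters.http-kit :refer [get-sch-adapter]])

(let [{:keys [send-fn connected-uids]}
      (sente/make-channel-socket-server! (get-sch-adapter) {})]
  (def chsk-send!     send-fn)         ; (chsk-send! uid event)
  (def connected-uids connected-uids)) ; atom of {:ws #{..} :ajax #{..} :any #{..}}

;; Broadcasting: send the same event to every connected uid.
(doseq [uid (:any @connected-uids)]
  (chsk-send! uid [:example/broadcast {:msg "Hello, everyone!"}]))
```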
But if we have 1,000 subscribers, the message data is serialized for transport independently 1,000 times. As it's relatively easy to have a data payload that takes 1ms to pack, these milliseconds add up - at 1ms per pack, that's roughly a full second of redundant serialization work per broadcast.
In a quick test, I added `memoize` to this line, and that did the trick. To minimize memory usage, I imagine `encore/memoize*` with a small cache size might be a good choice? Ideas/feedback welcome, maybe I'm missing something.
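For illustration, here's one way the idea could look as a user-side wrapper rather than a patch to Sente itself. This is a sketch only: it assumes Sente's `IPacker` protocol in `taoensso.sente.interfaces` (with `pack`/`unpack` methods) and uses plain `clojure.core/memoize` for brevity; a bounded cache (e.g. `encore/memoize*` as mentioned above) would be the safer choice in production:

```clojure
;; Sketch: a wrapper packer that memoizes `pack`, so identical payloads
;; are serialized only once. Assumes Sente's IPacker protocol; the ns and
;; fn names here are hypothetical.
(ns example.memo-packer
  (:require [taoensso.sente.interfaces :as interfaces]))

(defn memo-packer
  "Wraps `inner-packer` so repeated packs of equal values reuse the first
  serialized result. Unpacking is passed through unchanged."
  [inner-packer]
  (let [pack* (memoize (fn [x] (interfaces/pack inner-packer x)))]
    (reify interfaces/IPacker
      (pack   [_ x] (pack* x))
      (unpack [_ s] (interfaces/unpack inner-packer s)))))

;; Usage sketch: pass it as the `:packer` option, e.g. wrapping the
;; standard transit packer from taoensso.sente.packers.transit:
;;   (sente/make-channel-socket-server! adapter
;;     {:packer (memo-packer (sente-transit/get-transit-packer))})
```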