
Memoized packing for pushing identical messages to large groups #276

Closed · mhuebert opened this issue Oct 25, 2016 · 7 comments
@mhuebert (Collaborator) commented Oct 25, 2016

Is there any reason the transit packer doesn't include memoization of payloads?

The recommended way to broadcast a message to a group of subscribers is to loop through a list of users:

  (doseq [uid (pubsub/subscribers :my-topic)]
    (chsk-send! uid [:my-app/action {:text "This Is Not A Tweet", :payload some-data}]))

But if we have 1,000 subscribers, the message data is independently serialized for transport 1,000 times. Since it's relatively easy to have a data payload that takes 1ms to pack, those milliseconds add up.

In a quick test, I added memoize to this line, and that did the trick. To minimize memory usage, I imagine encore/memoize* with a small cache size might be a good choice?
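
For concreteness, here's a minimal sketch of the idea (`pack` is a hypothetical stand-in for whatever fn Sente actually uses to serialize an event for the wire, not its real internals):

(require '[cognitect.transit :as transit])
(import '[java.io ByteArrayOutputStream])

(defn pack
  "Hypothetical packer: serializes a Clojure value to a Transit/JSON string."
  [clj]
  (let [out    (ByteArrayOutputStream. 4096)
        writer (transit/writer out :json)]
    (transit/write writer clj)
    (.toString out "UTF-8")))

;; A memoized packer serializes identical payloads only once, no matter how
;; many subscribers the broadcast loop visits. Note that clojure.core/memoize
;; caches unboundedly, hence the suggestion of encore/memoize* with a bounded
;; cache.
(def pack-memoized (memoize pack))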

Ideas/feedback welcome, maybe I'm missing something.

@ptaoussanis (Member) commented:

Hey Matt, thanks for pinging about this!

You've spotted a definite inefficiency (just something I hadn't considered), and a good fix (with encore/memoize*).

One potential minor difficulty with using memoize* is that its caching/gc parameters aren't easily tuneable when used in a library.

In this case, I think it might actually make sense to consider adding a new API fn for bulk broadcasting that'll just generate the payload once before looping (roughly as sketched below).

I just have my hands quite full atm, but I'll think about it a little and get back to you.

Again, this was a good (and valuable) catch - and a nice clear report: much appreciated!

Cheers :-)
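
A conceptual sketch of what such a bulk-broadcast fn might look like (this is not Sente's actual API; `pack` and `send-packed!` are hypothetical stand-ins for its internal serialization and send steps):

(defn broadcast!
  "Serialize `event` once, then push the same packed payload to every uid."
  [send-packed! pack uids event]
  (let [packed (pack event)] ; pay the serialization cost exactly once
    (doseq [uid uids]
      (send-packed! uid packed))))

Compared to the current pattern, which re-packs the event once per subscriber, the packing cost here is independent of the number of subscribers.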

ptaoussanis self-assigned this Oct 26, 2016
ptaoussanis added a commit that referenced this issue Oct 26, 2016
Objectives:
  - [#276] Support easy+efficient generic un/pack caching.
    - In particular: support cache-friendly server->client event buffering.
  - [#16] Support client->server buffering with arb cbs.
  - Read+write performance.
  - Bandwidth efficiency.
  - Flexibility for possible future extensions/changes.
  - Human-readable on the wire (handy for debugging, etc.).
  - Lay groundwork for possible broadcast API fn.

Downsides:
  - Would involve a breaking incompatibility with old clients.

@mhuebert (Collaborator, Author) commented:

Hey, thanks for the quick reply + commit!

I figured this would touch on other concerns, like buffering. Thanks for the informative comments in the commit.

A bit off topic (I'm not sure whether a separate issue is warranted, or if this is covered elsewhere): are there tests that aren't committed into the repo? I see a testing profile in profile.clj, but I don't see anything in the test directory. While experimenting locally, I was wondering how you approach testing for performance with many connected clients.

ptaoussanis added a commit that referenced this issue Oct 26, 2016
ptaoussanis added a commit that referenced this issue Oct 26, 2016
@theasp (Collaborator) commented Oct 26, 2016

I haven't looked at your changes, but I believe Sente has kept per-client Transit readers and writers since #161. Is it still safe to memoize this? I think the memoized result would not be valid for any client other than the one whose writer produced it, given how Transit replaces data with shorter tokens.

@ptaoussanis (Member) commented:

@theasp Thanks for pinging about this, Andrew.

The readers and writers are indeed cached, but I don't see how additionally caching particular reads or writes would be a problem?

Last I checked the implementation, Transit can be stateful for the duration of a particular read/write, but shouldn't automatically (to my knowledge) be stateful between separate reads/writes.

For example, it uses token replacement to help deduplicate content within a particular payload. Unless you're specifically requesting something like that (and providing the necessary data), I don't see how or why it'd try to provide deduplication between payloads.

Does that make sense?

I don't use Transit myself though, so may be missing something obvious?

@theasp (Collaborator) commented Oct 28, 2016

Yes, it looks like you are correct.

cljs.user=> (require '[cognitect.transit :as t])
nil
cljs.user=> (def writer (t/writer :json))
#'cljs.user/writer
cljs.user=> (t/write writer [:msg {:a 1 :b 2}])
"[\"~:msg\",[\"^ \",\"~:a\",1,\"~:b\",2]]"
cljs.user=> (t/write writer [:msg {:a 1 :b 2}])
"[\"~:msg\",[\"^ \",\"~:a\",1,\"~:b\",2]]"

@ptaoussanis (Member) commented:

@theasp Thanks for the confirmation :-)

BTW I'll note that there could (conceivably) be other instances where serialized output is not identical between calls with the same input, but that would still be okay so long as the deserialization doesn't depend on any out-of-band state.

If freeze(x)->y, then we only need thaw(y)->x; we don't otherwise care about the interim y value in Sente's case (or for read/write caching).
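
To put that property in code (reusing the hypothetical `pack` from the earlier sketch, with a matching hypothetical `unpack`): only the round trip has to hold; the interim packed string is never compared directly.

(require '[cognitect.transit :as transit])
(import '[java.io ByteArrayInputStream])

(defn unpack
  "Hypothetical unpacker matching `pack`: Transit/JSON string -> Clojure value."
  [^String s]
  (let [in (ByteArrayInputStream. (.getBytes s "UTF-8"))]
    (transit/read (transit/reader in :json))))

(let [x [:my-app/action {:text "This Is Not A Tweet"}]]
  (= x (unpack (pack x)))) ;=> true, and that's all the caching relies on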

@ptaoussanis (Member) commented:

Closing in favour of #421
