Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Store state.narrows as an Immutable.Map (continued) #4201

Merged
merged 10 commits into from
Dec 11, 2020

Conversation

chrisbobbe
Copy link
Contributor

@chrisbobbe chrisbobbe commented Jul 21, 2020

@gnprice said:

Please go ahead and open a new PR with (a rebased version of) 1a80243.

Here you go!

An instance of #3949 and #3950.

@chrisbobbe
Copy link
Contributor Author

Huh, redux-logger has a way to make it easier to look at logs of Immutable things: https://github.com/LogRocket/redux-logger#transform-immutable-with-combinereducers

@chrisbobbe
Copy link
Contributor Author

OK, just fixed some conflicts.

Huh, redux-logger has a way to make it easier to look at logs of Immutable things: https://github.com/LogRocket/redux-logger#transform-immutable-with-combinereducers

And added something based on this.

@chrisbobbe
Copy link
Contributor Author

Just fixed a recent conflict. This would set a good example for, e.g., #4252.

Copy link
Member

@gnprice gnprice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @chrisbobbe ! Sorry for delay -- this looks great. See comments below.

The other major comment is: I'm pretty confident that this refactor is correct, thanks to Flow. But then the other important question about it (and future further changes to our major data structures) is to have some idea of what its effect is on the app's performance. This doesn't have to be super precise, but I'd like to (a) be confident we aren't accidentally making things a lot slower, and (b) ideally be able to confirm that as we start using good immutable-data-structure algorithms, we're making things faster.

I'll go make a chat thread to discuss.

Comment on lines 46 to 90
const eventNewMessage = (state, action) => {
let stateChange = false;
const newState: NarrowsState = {};
Object.keys(state).forEach(key => {
const newState = state.map((value, key) => {
const isInNarrow = isMessageInNarrow(action.message, JSON.parse(key), action.ownEmail);
const isCaughtUp = action.caughtUp[key] && action.caughtUp[key].newer;
const messageDoesNotExist = state[key].find(id => action.message.id === id) === undefined;
const messageDoesNotExist = value.find(id => action.message.id === id) === undefined;

if (isInNarrow && isCaughtUp && messageDoesNotExist) {
stateChange = true;
newState[key] = [...state[key], action.message.id];
return [...value, action.message.id];
} else {
newState[key] = state[key];
return value;
}
});

return stateChange ? newState : state;
};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good translation of the old code, and I think it should be comparably efficient. So this is good for the initial conversion commit.

I think with Immutable it should be possible to do a better algorithm, though. The thing that's wasteful here -- and is unavoidable with objects-as-maps, or for that matter with Map, when combined with the Redux expectation of immutable values, but becomes avoidable with something like Immutable -- is that if there are N different narrows in the state, there'll only be a handful (say K) that match this message, but we still end up constructing a new narrows state from scratch with N entries.

Instead, it should be possible to only create an amount of new data that scales more like K than like N. (Probably O(K log N).) Looking back at the Immutable docs, I think their main API for this is withMutations. Inside that, we'd use set -- or perhaps cleaner yet, updateIn.

Thinking a bit about the workflow here, maybe the most efficient thing is:

  • We merge a version of this PR, doing a direct translation of the old code;
  • I experiment with these APIs, and send a PR to demonstrate an algorithm that takes advantage of Immutable's fancier abilities;
  • You try that out, we discuss, we merge a version of it;
  • You do the next piece of our state, like Use Immutable for state.flags. #4252, with both the direct translation and then on top of it the fancier efficient algorithm.

Sound good?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, sounds good!

src/chat/__tests__/narrowsReducer-test.js Outdated Show resolved Hide resolved
Comment on lines 242 to 243
// Make things like `state.narrows` appear as plain
// JS objects in the logs (much easier to read)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Neat.

Have you tried using this implementation? https://github.com/andrewdavey/immutable-devtools It's a little more code, and specifically aims at the Chrome Dev Tools API... and the result seems somewhat cleaner than is possible to get by translating to plain objects-as-maps and arrays. (One difference that seems fundamental is that if we start using some map keys that aren't strings, those can't translate properly to an object-as-map.)

Ran across that in looking at the Immutable.js wiki: https://github.com/immutable-js/immutable-js/wiki/Modules-that-work-with-Immutable.js

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ooh, that looks helpful. Another reason to prefer something like that (as noted in a code comment): the current implementation will silently drop actions if there's an error thrown from the code that makes plain objects out of Immutable objects.

@chrisbobbe
Copy link
Contributor Author

Thanks for the review! I just pushed a new revision. I'll put together some performance observations soon.

Copy link
Member

@gnprice gnprice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the revisions! All LGTM pending those performance results, except a nit below.

That custom Chrome DevTools formatter looks great. There are some UI choices that aren't the best -- I'd rather that the expando for an element appeared at the left of the key, like it does for objects in the built-in behavior, rather than at the right of the key (which is potentially long, and potentially varies in length between different keys of a map) -- but definitely better than trying to interpret the Immutable.Map internals. Perhaps at some point I'll feel moved to try to tweak the formatter.

Comment on lines 186 to 189
issue in `zulip-mobile`.

<div id="webview" />
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: double blank line between sections (helps with visually parsing the source)

@chrisbobbe
Copy link
Contributor Author

chrisbobbe commented Dec 10, 2020

Thanks for the review! And, in particular, the performance measurements, analysis, and a solution to a regression, as discussed in chat. I've just pushed my latest revision.

Copy link
Member

@gnprice gnprice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks again @chrisbobbe ! Just small comments below.

unread_message_ids: [1, 2],
stream_id: message1.stream_id,
topic: message1.subject,
unread_message_ids: [message1.id, 2],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These [message1.id, 2] feel kind of incongruous. And there's probably an invariant here that the message IDs should come in order, right? We might not be relying on that in this spot now, but it'd be good for the test not to break if we did.

Probably eg.streamMessage({ id: 1 }) would be a good solution.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These [message1.id, 2] feel kind of incongruous.

I think the 2 here can be removed in this spot without losing anything important. And in other cases, where it couldn't be removed, I think you're saying that we should probably represent the 2 as something like message2.id, so it's like [message1.id, message2.id], right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably eg.streamMessage({ id: 1 }) would be a good solution.

Ah, right—in the definition of message1, right, so

- const message1 = eg.streamMessage();
+ const message1 = eg.streamMessage({ id: 1 });

Comment on lines 89 to 85
type: EVENT_NEW_MESSAGE,
message: {
id: 1,
type: 'stream',
stream_id: 1,
subject: 'some topic',
},
...eg.eventNewMessageActionBase,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: type is redundant (at this spot and others)

@@ -115,20 +120,16 @@ describe('unreadHuddlesReducer', () => {
test('if message is sent by self, do not mutate state', () => {
const initialState = deepFreeze([]);

const message2 = eg.pmMessage({
sender: eg.selfUser,
recipients: [eg.makeUser({ name: 'john' }), eg.makeUser({ name: 'mark' }), eg.selfUser],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are fine, but I usually just turn them into eg.otherUser and eg.thirdUser, if there don't appear to be other places that depend on the specific names.

(If there are such places, usually it's ideal to replace those too -- it just makes it more work and so I don't always.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would that be OK to do in the same commit?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, or can break them up if the commit is getting too unwieldy.

Comment on lines 78 to 79
const messageDoesNotExist = value.find(id => action.message.id === id) === undefined;
if (isCaughtUp && messageDoesNotExist) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One subtle but potentially performance-significant wrinkle that's in my fastest version:

I tried another optimization: [graph]
This is one I spotted because of staring at this code, but it's actually orthogonal to Immutable vs. objects-as-maps, and we should just do it in any case:

     for (const [key, value] of state.entries()) {
-      const isInNarrow = isMessageInNarrow(action.message, flags, JSON.parse(key), action.ownEmail);
-      const isCaughtUp = action.caughtUp[key] && action.caughtUp[key].newer;
-      const messageDoesNotExist = value.find(id => action.message.id === id) === undefined;
-
-      if (isInNarrow && isCaughtUp && messageDoesNotExist) {
+      if (
+        isMessageInNarrow(action.message, flags, JSON.parse(key), action.ownEmail)
+        && (action.caughtUp[key] && action.caughtUp[key].newer)
+        && value.find(id => action.message.id === id) === undefined
+      ) {
         stateMut.set(key, [...value, action.message.id]);
       }
     }

I.e., no need to go looking through the whole narrow to see if the message is already there, if it's a narrow it doesn't even go in.

In general, the pattern this existing code is doing is a bad practice: computing several different complex or potentially-expensive conditions, and only at the end looking to see that the first one already gave us our answer and the rest were irrelevant. Instead, it's better to check each condition after computing it and act on it immediately. This probably isn't as big a deal for performance as it was in my measurements, because those were mostly reflecting the isMessageInNarrow condition, but best to not have to worry about it.

Probably the cleanest way to format that, in the presence of your helpful comments in the next commit, is to use the early-return pattern:

      if (!(action.caughtUp[key] && action.caughtUp[key].newer)) {
        // Don't add a message to the end of the list unless we know
        // it's the most recent message, i.e., unless we know we're
        // currently looking at (caught up with) the newest messages in
        // the narrow.
        return;
      }

though a long chain of && in one if statement (like in my demo version) is another time-honored alternative.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, right. I had read and considered that optimization, but I skipped it for readability with those new comments in mind (not having considered the quite readable early-return pattern, oops) since the number of iterations has decreased to a constant ~5, as you say:

This probably isn't as big a deal for performance as it was in my measurements, because those were mostly reflecting the isMessageInNarrow condition

But you're right; EVENT_NEW_MESSAGE handling is a place where we should take any easy performance gains enthusiastically. 🙂

chrisbobbe and others added 10 commits December 11, 2020 11:58
And tweak a few tests to be a bit more straightforward.
And tweak a few tests to be a bit more straightforward.
And tweak a few tests to be a bit more straightforward.
Unsurprisingly, these tests are pretty similar to the tests for
`unreadStreamsReducer` and `unreadHuddlesReducer`. But these two
don't match the order in the other two. Might as well fix this small
inconsistency now that we've spotted it.
An instance of zulip#3949 and zulip#3950.

When I wrote the commit message for e834f33, I was most interested
in the exciting new ability to serialize/deserialize custom,
hand-made data types: in particular, ZulipVersion instances.

An important thing that *also* happened in that commit was the new
ability to store data structures from Immutable in Redux, without
any additional setup! `remotedev-serialize` comes with off-the-shelf
support for this. I gave this fact a brief shout-out near the end of
the commit message for a4c29e9.

So, do this, for the first time, paying careful attention to the
instructions in a4c29e9 for writing a migration (copied here, as
that commit message is quite long):

"""
    For making a value "live", where it wasn't before, the migration
    needs to:

    1) As input, take the previous shape of the data. Don't confuse this
    with the *current* way of storing the "dead" shape. Just like any
    other migration, the previous shape is the input.

    2) As output, give the "live" form of the data. Once it's in Redux,
    the replacer will take care of persisting it in the correct "dead"
    form.
"""

The previous shape of the data was an object-as-map, which
`Immutable.Map` takes as input quite cheerfully, and gives us our
"live" Immutable.Map instance.
In a recent commit, we converted `state.narrows` from a plain
object-as-map into an `Immutable.Map`. This will make it convenient
to inspect its data in Chrome.
Not for storing anywhere, just for giving us what we need to handle
EVENT_NEW_MESSAGE a bit differently in `narrowsReducer` in an
upcoming commit. We'll be calling the recently added
`getNarrowsForMessage`, which takes `ownUser`.
Thanks to Greg for the analysis, which I'm summarizing here, and the
suggested solution. For more background, see discussion [1].

When we started using `Immutable.Map` for `state.narrows` in a
recent commit, there was a performance regression on handling
EVENT_NEW_MESSAGE when `state.narrows` has very many keys (we tested
with 1000). We left the regression in, until now, so that commit
could remain a direct, easy-to-follow translation from plain
object-as-map to `Immutable.Map`.

Although using `Immutable.Map` the way we did had the advantage
(over an object-as-map, or over an ES6 `Map`) of not having to
allocate a whole new copy of `state.narrows` just to change one
thing, it still meant *iterating* through all the narrows to check
whether we needed to update each one. So the work being done was
still linear with the number of narrows, even supposing (which is
very reasonable) that the constant factor for read-only iterations
was much smaller than for iterations that had to allocate.

Greg then tried the approach in this commit, which isn't linear in
the number of narrows, and he found that it was significantly faster
for 1000 narrows. The key fact is that you can look at a message and
tell what narrows it belongs in, and that's always a very low number
(like five) [2].

It also makes an improvement that's orthogonal to `Immutable.Map`
vs. objects-as-map vs. `Map`: on an iteration, we should skip the
`messageDoesNotExist` computation (which goes and checks all items
in the array) if `isCaughtUp` has already indicated we're not
interested in that narrow [3].

See his observations for 1000 narrows [4] and for (the much more
usual) 30 narrows [5], where the approach in this commit still shows
up as faster than the object-as-map approach before this series.

[1] https://chat.zulip.org/#narrow/stream/243-mobile-team/topic/.60Immutable.2EMap.60.20performance.20for.20.60state.2Enarrows.60/near/1068889

[2] As long as you exclude search narrows, which there's a pretty
    much unbounded number of -- but we've never bothered
    live-updating them with message events anyway.

[3] https://chat.zulip.org/#narrow/stream/243-mobile-team/topic/.60Immutable.2EMap.60.20performance.20for.20.60state.2Enarrows.60/near/1067739

[4] https://chat.zulip.org/#narrow/stream/243-mobile-team/topic/.60Immutable.2EMap.60.20performance.20for.20.60state.2Enarrows.60/near/1068895

[5] https://chat.zulip.org/#narrow/stream/243-mobile-team/topic/.60Immutable.2EMap.60.20performance.20for.20.60state.2Enarrows.60/near/1068917
These comments have been true for a while; I just noticed that they
were missing while working on the adjacent code.
Having stared at these conditions for a bit (particularly the
latter), I think they're harder to read than they need to be. So,
simplify.
@chrisbobbe
Copy link
Contributor Author

Thanks for the review! I've just pushed another revision.

@gnprice gnprice merged commit 07b163e into zulip:master Dec 11, 2020
@gnprice
Copy link
Member

gnprice commented Dec 11, 2020

LGTM, thanks -- merged!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants