unreads: Prepare for converting to efficient data structures #4400

gnprice · 2021-01-12T21:40:43Z

Currently the app can be very sluggish in responding to new data: I measured it a few weeks ago at several hundred ms on any of a new message; or marking messages as read; or handling a batch of fetched messages. And that was on a Pixel 2 XL, which is a fairly recent (2017) flagship device, so it'd be slower on many others. I was viewing the app's home screen (so there wasn't even a message list involved), with a test user on czo who -- like many real users -- has a lot of unreads.

Then breaking it down, a large chunk of that time (for some of those events) is going into updating our data structures, and a larger chunk (or almost all, for some events) is going into updating React components. A root cause of both of these (including the React work) is that many of our data structures are inefficient. We've recently started moving our central data structures to Immutable.js, as in #4201 for state.narrows, #4392 for state.pmConversations, and #4390 (open) for state.messages.

Another of our inefficient data structures, one particularly implicated in those slow React updates, is state.unreads. I have a draft of a branch, begun Friday, to convert state.unreads.streams to something efficient, as a start and as the part of state.unreads that's most likely to be large for many users. The actual conversion still needs a bit more work, but here's a series of prep commits.

The largest part of the change is that we take the tests which had been meddling in the details of the internal representation of state.unreads.streams, and convert them to a form where they instead (a) build up the test states by using the actual reducer, and (b) inspect the resulting states by using selectors, or a helper which would be a reasonable selector. This continues in the direction of organizing reducers and selectors into encapsulated "models", discussed last week in 7c697fb (#4392): the tests test the model as a whole, not its internal pieces.

Converting the tests to this whole-model style makes them a lot shorter and clearer. It also saves us from going and fiddling with all the test cases each time we change the internals of that data structure -- the draft of the actual conversion commit doesn't touch the tests at all, except updating one selector-like helper function. And it keeps the tests in place so they can check all the same facts about the new implementation as they do now about the old one... which is good because yesterday they already caught several bugs in my initial draft 😉 .

Giving this reducer a name will help us start writing some tests for this whole reducer combined with the selectors.

Instead of inspecting the internals of the state, the tests now look at the state in a way that the app code in general, via selectors, might look at it. In particular, no selector cares (or should care) about the order in which streams and topics appear. Similarly, instead of constructing starting states by writing out the internal representation, these tests now do so the same way the app in general does: by building states up through the reducer. I say "whole-model style" because I see this as a step toward organizing our state-managing code in terms of a "model" that combines the role of a reducer and selectors, and in which the representation of the data is an internal implementation detail that the rest of the app doesn't need to care about and can't even see. See also previous discussion in 7c697fb. In particular, I'd like to change the data structures we use for tracking unreads, and moving that to be encapsulated inside a model will make it much cleaner to change it. As a useful bonus, the tests sure are a lot shorter this way! I think they're easier to read and understand.

Often it becomes easier to notice these things when the test cases are all much shorter, so that one can take in more of them at once.

chrisbobbe

Thanks! I continue to be excited for this new way of organizing our selectors and reducers. Just a few comments, below; otherwise, this LGTM.

chrisbobbe · 2021-01-13T01:59:13Z

src/unread/__tests__/unreadModel-test.js

+    };
+
+    const baseState = (() => {
+      const streamAction = args => mkMessageAction(eg.streamMessage(args));


nit: streamAction could be renamed streamMessageAction. I don't think there's a huge chance of confusion with, e.g., event actions with event type EventTypes.stream (something streamsReducer deals with) but I suppose it's not impossible? 🤔

It could, but I'm inclined to keep the name short: this helper is local to a quite short stretch of code, so it's always easy to go look at its definition. And I like that the state = reducer(state, …); statements fit on one line each, and those lines are already kind of long.

If this were a more global name (like if it were exported, or even exposed to this whole test file), then I'd definitely want to give it a more explicit name like streamMessageAction.

chrisbobbe · 2021-01-13T02:18:05Z

src/unread/__tests__/unreadSelectors-test.js

  },
  {
-    user_ids_string: '4,5',
-    unread_message_ids: [3, 4, 5],
+    user_ids_string: '1,4,5',


unread tests: Fix some impossible data.

Hmm, I don't yet follow why '4,5' (changed to '1,4,5') is impossible for user_ids_string here.

Ah -- it's because in this data structure, the key for a group PM thread includes all users including self. So in particular there's some user ID (namely self's) that is in every key in the map. So there can't be both 1,2,3 and 4,5 as keys.

I'll add a remark about this.

(I am glad to see that we have documented that fact about this data structure in general, though! In particular if you hover on getUnreadByHuddles you'll see it in its jsdoc.)

Ah OK, got it, thanks! I thought it might have something to do with the self-user, but I couldn't find anything labelling 1 as the self-user's user ID and I guess I stopped going down that path.

It makes sense though now you put it that way; it's impossible to say that 1,2,3 and 4,5 are both valid keys under the constraint that each key must include the self-user, but it is possible for 1,2,3 and 1,4,5 to both be valid keys given that it's possible for 1 to be the self-user's ID.

chrisbobbe · 2021-01-13T02:26:08Z

src/unread/__tests__/unreadSelectors-test.js

-      unread: {
-        streams: unreadStreamData,
-      },
+      unread: unreadState,


Separate from this refactor, I think there's something a bit wrong with the description for this test: 'when there are unread stream messages, returns a list with counts per stream_id '.

The thing returned isn't a list, it's an object-as-map.

Hmm indeed. I'd guess that the description was true when originally written, and then the data structure was improved 🙂 ... huh, nope, git log -S on that description finds 81a481e where it was already wrong in just this way.

Anyway, I'll fix it.

chrisbobbe · 2021-01-13T02:29:40Z

src/unread/__tests__/unreadSelectors-test.js

+import * as eg from '../../__tests__/lib/exampleData';
+import {
+  initialState,
+  initialState as initialUnreadState,


Hmm, what's the reason for importing initialState twice (once retaining the initialState name, once giving it the initialUnreadState name)?

Hmm, good catch. That is probably a consequence of letting the IDE add the import 😉. I'll reduce to just initialUnreadState.

... Or maybe just initialState. This is after all the unread selector tests... plus that's the name used in unreadModel-test.js, and using the same name in both will simplify unifying those in the future.

(The main work to do to unify them is to make all these unreadSelectors tests well-typed. I started on that right after converting the reducer tests, then realized it wasn't actually needed for what this branch is doing, and paused that in favor of pursuing the rest of the branch.)

chrisbobbe · 2021-01-13T02:37:00Z

src/topics/__tests__/topicsSelectors-test.js

+        eg.streamMessage({ stream_id: 1, subject: 'topic 4', id: 8 }),
+      ].reduce(
+        (st, message) => unreadReducer(st, mkMessageAction(message)),
+        eg.baseReduxState.unread,


Would the initialState export from unread-testlib.js do just as well as eg.baseReduxState.unread, or should we stick with the latter?

Ah, yes. I think it was as I was fixing up this test file (after writing most of the other changes) that I realized that eg.baseReduxState.unread should be exactly equivalent to that initialState export -- after all, the point of the way we construct that initialState, with { type: eg.randString() }, is that it exercises exactly the same path through the reducer that Redux uses to initialize the state. (And eg.baseReduxState comes via Redux's createStore.)

I'm not entirely sure what the most helpful style is for referring to that state.

Here, eg.baseReduxState.unread feels natural because it fits in with the rest of this global state we're assembling.

In the unread-model tests (including unreadSelectors-test.js), initialState or initialUnreadState feels natural because it makes it clear that it's local -- these are the tests of this very code, so they shouldn't need to appeal to some central library to get an instance of this state.

I'm inclined to leave the style different between these places at least for now, and perhaps in the future we'll settle on one of them.

I'm inclined to leave the style different between these places at least for now, and perhaps in the future we'll settle on one of them.

Sounds good; I don't have strong feelings either way.

One consequence of building test data manually through the internal representation is that sometimes it doesn't make any sense! These message IDs collide with each other; and then in the "huddles" (i.e. group-PMs) part of the data, the keys are all supposed to contain the self user, so they must have a user ID in common. If we say user 1 is self, that's consistent with the 1:1 PMs side of this data, so we pick that. Discovered these while converting this data to a more structured form where these inconsistencies can't happen. Fix them first, to simplify the conversion.

Now these tests don't know or care about the internal representation of the unreads data. The new "testlib" file enables these two test files to share these helpers.

This description has been untrue since it was added in 81a481e. No need to be so specific about the shape of the data, anyway: the test's code speaks for itself. And unlike the description, the code will naturally get updated as the answer changes :-)

This follows doing the same thing for the unreads code's own tests, a few commits ago.

This is a step toward encapsulating these as an internal implementation detail of the "unreads" model, and then changing them to a data structure we can use more efficiently. Ideally these would go inside the `unreadModel` module/file itself, but that doesn't work because it entangles all the unread*Reducer modules in the `types.js` import cycle-blob. The eventual right solution there is probably that the all-in-one `types` module shouldn't exist. But this works as an immediate solution.

This is another step toward encapsulating the details of this part of our state from the rest of the app.

Flow infers these just fine. They don't add much for the reader, as the selectors that feed these parameters are right there in the lines above and are generally pretty clear. (The reader might be surprised to learn that `getMute` returns a list of pairs... but this annotation, as `MuteState`, wasn't helping with that anyway.) And by repeating the expected types for these selectors to return, they add friction for changing them. So take them out. (A lot of our other createSelector calls could usefully get the same treatment.)

gnprice · 2021-01-13T20:57:17Z

New revision pushed! Take a look.

chrisbobbe · 2021-01-13T21:40:53Z

Great, thanks! Merged.

gnprice · 2021-01-27T04:13:43Z

I've just filed #4438 to give a name to the issue that this work was in preparation for fixing.

chrisbobbe · 2021-01-27T21:06:51Z

I've just filed #4438 to give a name to the issue that this work was in preparation for fixing.

Thanks!

gnprice requested a review from chrisbobbe January 12, 2021 21:40

gnprice added 3 commits January 12, 2021 17:44

unread [nfc]: Start a model file; move combined reducer there.

faabe2e

Giving this reducer a name will help us start writing some tests for this whole reducer combined with the selectors.

unread tests: Add a missing case that the conversion highlighted.

335da49

Often it becomes easier to notice these things when the test cases are all much shorter, so that one can take in more of them at once.

chrisbobbe reviewed Jan 13, 2021

View reviewed changes

gnprice force-pushed the pr-unread branch from 007ab91 to 0ac971b Compare January 13, 2021 20:16

gnprice added 7 commits January 13, 2021 12:50

unread tests: Convert selector tests to black-box style.

fbb8316

Now these tests don't know or care about the internal representation of the unreads data. The new "testlib" file enables these two test files to share these helpers.

topics tests: Convert unreads data to black-box form.

6f88818

This follows doing the same thing for the unreads code's own tests, a few commits ago.

unread [nfc]: Move the "direct" selectors into model file.

444d0e5

This is another step toward encapsulating the details of this part of our state from the rest of the app.

gnprice force-pushed the pr-unread branch from 0ac971b to 47496ee Compare January 13, 2021 20:57

chrisbobbe merged commit 47496ee into zulip:master Jan 13, 2021

gnprice deleted the pr-unread branch January 13, 2021 21:48

gnprice mentioned this pull request Jan 27, 2021

Make "unread" model efficient at marking a message as read #4438

Closed

gnprice added the a-redux label Jan 27, 2021

gnprice mentioned this pull request Jan 27, 2021

Consolidate internal details of each of our Redux data structures #3934

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

unreads: Prepare for converting to efficient data structures #4400

unreads: Prepare for converting to efficient data structures #4400

gnprice commented Jan 12, 2021 •

edited

Loading

chrisbobbe left a comment

chrisbobbe Jan 13, 2021

gnprice Jan 13, 2021

chrisbobbe Jan 13, 2021

gnprice Jan 13, 2021 •

edited

Loading

chrisbobbe Jan 13, 2021 •

edited

Loading

chrisbobbe Jan 13, 2021

gnprice Jan 13, 2021

chrisbobbe Jan 13, 2021

gnprice Jan 13, 2021 •

edited

Loading

gnprice Jan 13, 2021

chrisbobbe Jan 13, 2021

gnprice Jan 13, 2021

chrisbobbe Jan 13, 2021

gnprice commented Jan 13, 2021

chrisbobbe commented Jan 13, 2021

gnprice commented Jan 27, 2021

chrisbobbe commented Jan 27, 2021

unreads: Prepare for converting to efficient data structures #4400

unreads: Prepare for converting to efficient data structures #4400

Conversation

gnprice commented Jan 12, 2021 • edited Loading

chrisbobbe left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

gnprice Jan 13, 2021 • edited Loading

Choose a reason for hiding this comment

chrisbobbe Jan 13, 2021 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

gnprice Jan 13, 2021 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

gnprice commented Jan 13, 2021

chrisbobbe commented Jan 13, 2021

gnprice commented Jan 27, 2021

chrisbobbe commented Jan 27, 2021

gnprice commented Jan 12, 2021 •

edited

Loading

gnprice Jan 13, 2021 •

edited

Loading

chrisbobbe Jan 13, 2021 •

edited

Loading

gnprice Jan 13, 2021 •

edited

Loading