Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

unreads: Prepare for converting to efficient data structures #4400

Merged
merged 10 commits into from
Jan 13, 2021

Conversation

gnprice
Copy link
Member

@gnprice gnprice commented Jan 12, 2021

Currently the app can be very sluggish in responding to new data: I measured it a few weeks ago at several hundred ms on any of a new message; or marking messages as read; or handling a batch of fetched messages. And that was on a Pixel 2 XL, which is a fairly recent (2017) flagship device, so it'd be slower on many others. I was viewing the app's home screen (so there wasn't even a message list involved), with a test user on czo who -- like many real users -- has a lot of unreads.

Then breaking it down, a large chunk of that time (for some of those events) is going into updating our data structures, and a larger chunk (or almost all, for some events) is going into updating React components. A root cause of both of these (including the React work) is that many of our data structures are inefficient. We've recently started moving our central data structures to Immutable.js, as in #4201 for state.narrows, #4392 for state.pmConversations, and #4390 (open) for state.messages.

Another of our inefficient data structures, one particularly implicated in those slow React updates, is state.unreads. I have a draft of a branch, begun Friday, to convert state.unreads.streams to something efficient, as a start and as the part of state.unreads that's most likely to be large for many users. The actual conversion still needs a bit more work, but here's a series of prep commits.

The largest part of the change is that we take the tests which had been meddling in the details of the internal representation of state.unreads.streams, and convert them to a form where they instead (a) build up the test states by using the actual reducer, and (b) inspect the resulting states by using selectors, or a helper which would be a reasonable selector. This continues in the direction of organizing reducers and selectors into encapsulated "models", discussed last week in 7c697fb (#4392): the tests test the model as a whole, not its internal pieces.

Converting the tests to this whole-model style makes them a lot shorter and clearer. It also saves us from going and fiddling with all the test cases each time we change the internals of that data structure -- the draft of the actual conversion commit doesn't touch the tests at all, except updating one selector-like helper function. And it keeps the tests in place so they can check all the same facts about the new implementation as they do now about the old one... which is good because yesterday they already caught several bugs in my initial draft 😉 .

Giving this reducer a name will help us start writing some tests
for this whole reducer combined with the selectors.
Instead of inspecting the internals of the state, the tests now look
at the state in a way that the app code in general, via selectors,
might look at it.  In particular, no selector cares (or should care)
about the order in which streams and topics appear.

Similarly, instead of constructing starting states by writing out the
internal representation, these tests now do so the same way the app
in general does: by building states up through the reducer.

I say "whole-model style" because I see this as a step toward
organizing our state-managing code in terms of a "model" that combines
the role of a reducer and selectors, and in which the representation
of the data is an internal implementation detail that the rest of the
app doesn't need to care about and can't even see.  See also previous
discussion in 7c697fb.

In particular, I'd like to change the data structures we use for
tracking unreads, and moving that to be encapsulated inside a model
will make it much cleaner to change it.

As a useful bonus, the tests sure are a lot shorter this way!
I think they're easier to read and understand.
Often it becomes easier to notice these things when the test cases
are all much shorter, so that one can take in more of them at once.
Copy link
Contributor

@chrisbobbe chrisbobbe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I continue to be excited for this new way of organizing our selectors and reducers. Just a few comments, below; otherwise, this LGTM.

};

const baseState = (() => {
const streamAction = args => mkMessageAction(eg.streamMessage(args));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: streamAction could be renamed streamMessageAction. I don't think there's a huge chance of confusion with, e.g., event actions with event type EventTypes.stream (something streamsReducer deals with) but I suppose it's not impossible? 🤔

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could, but I'm inclined to keep the name short: this helper is local to a quite short stretch of code, so it's always easy to go look at its definition. And I like that the state = reducer(state, …); statements fit on one line each, and those lines are already kind of long.

If this were a more global name (like if it were exported, or even exposed to this whole test file), then I'd definitely want to give it a more explicit name like streamMessageAction.

},
{
user_ids_string: '4,5',
unread_message_ids: [3, 4, 5],
user_ids_string: '1,4,5',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unread tests: Fix some impossible data.

Hmm, I don't yet follow why '4,5' (changed to '1,4,5') is impossible for user_ids_string here.

Copy link
Member Author

@gnprice gnprice Jan 13, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah -- it's because in this data structure, the key for a group PM thread includes all users including self. So in particular there's some user ID (namely self's) that is in every key in the map. So there can't be both 1,2,3 and 4,5 as keys.

I'll add a remark about this.

(I am glad to see that we have documented that fact about this data structure in general, though! In particular if you hover on getUnreadByHuddles you'll see it in its jsdoc.)

Copy link
Contributor

@chrisbobbe chrisbobbe Jan 13, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah OK, got it, thanks! I thought it might have something to do with the self-user, but I couldn't find anything labelling 1 as the self-user's user ID and I guess I stopped going down that path.

It makes sense though now you put it that way; it's impossible to say that 1,2,3 and 4,5 are both valid keys under the constraint that each key must include the self-user, but it is possible for 1,2,3 and 1,4,5 to both be valid keys given that it's possible for 1 to be the self-user's ID.

unread: {
streams: unreadStreamData,
},
unread: unreadState,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Separate from this refactor, I think there's something a bit wrong with the description for this test: 'when there are unread stream messages, returns a list with counts per stream_id '.

The thing returned isn't a list, it's an object-as-map.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm indeed. I'd guess that the description was true when originally written, and then the data structure was improved 🙂 ... huh, nope, git log -S on that description finds 81a481e where it was already wrong in just this way.

Anyway, I'll fix it.

import * as eg from '../../__tests__/lib/exampleData';
import {
initialState,
initialState as initialUnreadState,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, what's the reason for importing initialState twice (once retaining the initialState name, once giving it the initialUnreadState name)?

Copy link
Member Author

@gnprice gnprice Jan 13, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, good catch. That is probably a consequence of letting the IDE add the import 😉. I'll reduce to just initialUnreadState.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... Or maybe just initialState. This is after all the unread selector tests... plus that's the name used in unreadModel-test.js, and using the same name in both will simplify unifying those in the future.

(The main work to do to unify them is to make all these unreadSelectors tests well-typed. I started on that right after converting the reducer tests, then realized it wasn't actually needed for what this branch is doing, and paused that in favor of pursuing the rest of the branch.)

eg.streamMessage({ stream_id: 1, subject: 'topic 4', id: 8 }),
].reduce(
(st, message) => unreadReducer(st, mkMessageAction(message)),
eg.baseReduxState.unread,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would the initialState export from unread-testlib.js do just as well as eg.baseReduxState.unread, or should we stick with the latter?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, yes. I think it was as I was fixing up this test file (after writing most of the other changes) that I realized that eg.baseReduxState.unread should be exactly equivalent to that initialState export -- after all, the point of the way we construct that initialState, with { type: eg.randString() }, is that it exercises exactly the same path through the reducer that Redux uses to initialize the state. (And eg.baseReduxState comes via Redux's createStore.)

I'm not entirely sure what the most helpful style is for referring to that state.

  • Here, eg.baseReduxState.unread feels natural because it fits in with the rest of this global state we're assembling.
  • In the unread-model tests (including unreadSelectors-test.js), initialState or initialUnreadState feels natural because it makes it clear that it's local -- these are the tests of this very code, so they shouldn't need to appeal to some central library to get an instance of this state.

I'm inclined to leave the style different between these places at least for now, and perhaps in the future we'll settle on one of them.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm inclined to leave the style different between these places at least for now, and perhaps in the future we'll settle on one of them.

Sounds good; I don't have strong feelings either way.

One consequence of building test data manually through the internal
representation is that sometimes it doesn't make any sense!

These message IDs collide with each other; and then in the "huddles"
(i.e. group-PMs) part of the data, the keys are all supposed to
contain the self user, so they must have a user ID in common.  If we
say user 1 is self, that's consistent with the 1:1 PMs side of this
data, so we pick that.

Discovered these while converting this data to a more structured
form where these inconsistencies can't happen.  Fix them first,
to simplify the conversion.
Now these tests don't know or care about the internal representation
of the unreads data.

The new "testlib" file enables these two test files to share these
helpers.
This description has been untrue since it was added in 81a481e.
No need to be so specific about the shape of the data, anyway: the
test's code speaks for itself.  And unlike the description, the code
will naturally get updated as the answer changes :-)
This follows doing the same thing for the unreads code's own tests,
a few commits ago.
This is a step toward encapsulating these as an internal
implementation detail of the "unreads" model, and then
changing them to a data structure we can use more efficiently.

Ideally these would go inside the `unreadModel` module/file itself,
but that doesn't work because it entangles all the unread*Reducer
modules in the `types.js` import cycle-blob.  The eventual right
solution there is probably that the all-in-one `types` module
shouldn't exist.  But this works as an immediate solution.
This is another step toward encapsulating the details of this
part of our state from the rest of the app.
Flow infers these just fine.

They don't add much for the reader, as the selectors that feed
these parameters are right there in the lines above and are
generally pretty clear.  (The reader might be surprised to learn
that `getMute` returns a list of pairs... but this annotation, as
`MuteState`, wasn't helping with that anyway.)

And by repeating the expected types for these selectors to return,
they add friction for changing them.  So take them out.

(A lot of our other createSelector calls could usefully get the
same treatment.)
@gnprice
Copy link
Member Author

gnprice commented Jan 13, 2021

New revision pushed! Take a look.

@chrisbobbe
Copy link
Contributor

Great, thanks! Merged.

@gnprice
Copy link
Member Author

gnprice commented Jan 27, 2021

I've just filed #4438 to give a name to the issue that this work was in preparation for fixing.

@chrisbobbe
Copy link
Contributor

I've just filed #4438 to give a name to the issue that this work was in preparation for fixing.

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants