Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Search/offload filtering 2 #1941

Merged
merged 1 commit into from
Mar 13, 2020
Merged

Search/offload filtering 2 #1941

merged 1 commit into from
Mar 13, 2020

Conversation

dmsnell
Copy link
Member

@dmsnell dmsnell commented Feb 28, 2020

Second version #1835

Highlights

  • offloads search into a WebWorker which means search doesn't block the main thread
  • on my heaviest accounts and worst searches it's still finishing within 1ms and completely responsive. the effect is huge when comparing delayed results to instant results

Previously we have had to deal with the inconvenient way in which search
operations occur in the same UI thread as the rest of the app. The inconvenient
part has been that while typing and when you want the app to be most responsive -
that is when it was the slowest due to the filtering happening in the note
list.

For many years this was such a problem and would make typing in the search
field too cumbersome so we created a debounce value to wait until you stopped
typing in order to update the note list. This debounce, carried out as a
separate data stream from the main app state, caused problems in that the
debounced search value and the app state could get out of sync with each other.
This led to even more confusion creating separate values for the text in the
search field and a debounced search field value.

With recent refactorings to the app state we have been able to eliminate some
of the spurious data flows and get back to a single state value representing
the currently-typed value in the search field. We have separated the act of
updating the note list from the act of typing in a search query. With that
final necessary separate we are able to get to this patch.

In this patch we're recreating the search system and moving it into a
WebWorker, outside of the main UI thread. What this means is that we can now
update the filtering of the note list as often as we want and it won't
interfere with the search field or with typing. By moving the data into its own
asynchronous "search indexer" too we are able to make several trivial
optimizations which have resulted in improving the performance of searching by
several magnitudes. These mostly revolve around doing less work by being
reactive instead of repeating calculations on every keypress:

  • Instead of building a case-insensitive search RegExp, store the lower-cased
    version of each note in the search index and lower-case the search field and
    use more basic string search functions.
  • Instead of parsing the search query every time we execute a search/filter
    operation, only parse it when it changes.
  • Instead of scanning the list of note tags on every search, build a Set()
    when we index the note which provides O(1) membership tests instead of O(N)
    with the number of tags in a note.
  • Provide a guard against long-running searches by performing a "quick" search
    up until 10ms have passed and then queuing a longer full search if we hit
    that threshold. (In all my testing I could not find a search that took more
    than about 1ms).

In hind sight a few of the optimizations in this PR could have been merged into
develop directly without the WebWorker and because of the performance impact
we could probably get by without making search asynchronous. However, there is
still a clear benefit to pushing this into a WebWorker and out of the main
thread. As we continue to iterate on the quality of the search results this
will become more important.

Props to @belcherj and @codebykat for their hard work refactoring the app
state which has enabled this kind of work to be done.

@dmsnell dmsnell requested a review from a team February 28, 2020 02:56
@dmsnell dmsnell marked this pull request as ready for review March 1, 2020 23:31
Copy link
Member

@codebykat codebykat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Holy cannolis this is BLAZING FAST I love it 😍

};

const updateFilter = (scope = 'quickSearch') => {
const tic = performance.now();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"tick" probably?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Common practice I picked up in my MATLAB days: tic and toc - l like their symmetry

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's just that "tic" has another meaning and it's not a good one..

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really feel strongly about it though

// receive inbound messages from the main thread
// in testing this was rare and may only happen in unexpected
// circumstances such as when performing a garbage-collection
const toc = performance.now();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and "tock" idk but "tic" is not the best :)

@@ -0,0 +1,106 @@
import SearchWorker from 'worker-loader!./worker';
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just curious, what is this syntax?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is a way to get a webpack loader working without having to configure webpack to handle the file type. In this case it runs the import through the webworker loader so that it turns it into something we can load in the worker frame

@codebykat codebykat self-requested a review March 5, 2020 16:06
Copy link
Member

@codebykat codebykat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh whoops it looks like it makes search case-sensitive though

Copy link
Contributor

@belcherj belcherj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This tests really well! I do agree with @codebykat that tick/tock is preferable but I don't feel strongly either way.

@dmsnell
Copy link
Member Author

dmsnell commented Mar 13, 2020

Thanks both of you for the review. If you don't have strong feeling about tic/toc I'd like to keep them; maybe it's just my way of putting a mark on the world. Maybe there's something about the computational world that led them this way, but MATLAB, R, and octave all use tic/toc, and some other languages adopted that same nomenclature as well.

now you have me wondering what the history of its origin is 🤔

@dmsnell dmsnell force-pushed the search/offload-filtering-2 branch from 627118a to 451046c Compare March 13, 2020 00:38
@dmsnell dmsnell added this to the 1.16 milestone Mar 13, 2020
@dmsnell dmsnell force-pushed the search/offload-filtering-2 branch from ad6c930 to a18ddf0 Compare March 13, 2020 01:23
Previously we have had to deal with the inconvenient way in which search
operations occur in the same UI thread as the rest of the app. The inconvenient
part has been that while typing and when you want the app to be most responsive
- that is when it was the slowest due to the filtering happening in the note
list.

For many years this was such a problem and would make typing in the search
field too cumbersome so we created a debounce value to wait until you stopped
typing in order to update the note list. This debounce, carried out as a
separate data stream from the main app state, caused problems in that the
debounced search value and the app state could get out of sync with each other.
This led to even more confusion creating separate values for the text in the
search field _and_ a debounced search field value.

With recent refactorings to the app state we have been able to eliminate some
of the spurious data flows and get back to a single state value representing
the currently-typed value in the search field. We have separated the act of
updating the note list from the act of typing in a search query. With that
final necessary separate we are able to get to this patch.

In this patch we're recreating the search system and moving it into a
WebWorker, outside of the main UI thread. What this means is that we can now
update the filtering of the note list as often as we want and it won't
interfere with the search field or with typing. By moving the data into its own
asynchronous "search indexer" too we are able to make several trivial
optimizations which have resulted in improving the performance of searching by
several magnitudes. These mostly revolve around doing less work by being
reactive instead of repeating calculations on every keypress:

 - Instead of building a case-insensitive search RegExp, store the lower-cased
   version of each note in the search index and lower-case the search field and
use more basic string search functions.
 - Instead of parsing the search query every time we execute a search/filter
   operation, only parse it when it changes.
 - Instead of scanning the list of note tags on every search, build a `Set()`
   when we index the note which provides O(1) membership tests instead of O(N)
with the number of tags in a note.
 - Provide a guard against long-running searches by performing a "quick" search
   up until 10ms have passed and then queuing a longer full search if we hit
that threshold. (In all my testing I could not find a search that took more
than about 1ms).

In hind sight a few of the optimizations in this PR could have been merged into
`develop` directly without the WebWorker and because of the performance impact
we could probably get by without making search asynchronous. However, there is
still a clear benefit to pushing this into a WebWorker and out of the main
thread. As we continue to iterate on the quality of the search results this
will become more important.

Props to @belcherj and @codebykat for their hard work refactoring the app
state which has enabled this kind of work to be done.
@dmsnell dmsnell force-pushed the search/offload-filtering-2 branch from a18ddf0 to 07cac3b Compare March 13, 2020 01:55
@dmsnell dmsnell merged commit b9efa1f into develop Mar 13, 2020
@dmsnell dmsnell deleted the search/offload-filtering-2 branch March 13, 2020 01:56
dmsnell added a commit that referenced this pull request Mar 13, 2020
…#1956)

When I deployed #1941 to staging.simplenote.com I discovered that the WebWorker wasn't loading. The reason was because it was being served from another domain than the website itself. This triggered content security policies in the browser.

With this change we're no longer loading an external script. Instead, webpack builds a BLOB containing the source of the worker and ships it inline with the rest of the application code, eliminating the external script load.

After re-deploying to staging I confirmed that this fixes the loading problem.
dmsnell added a commit that referenced this pull request Mar 17, 2020
In #1941 we introduced two regressions and this patch fixes them.

 - search needs to remember the opened tag and limit results to those
   matching within that tag. this was overlooked when updating the
   search query and updating the `filterTags` _from_ that search query

 - when passing the `filterNotes` action back to the application we
   have to make sure to pass the previous index. this was lost during
   the merge of the PR because because the additions from #1919, which
   fixed the previous regression for it, were deleted as part of
   moving `ui/middleware` into `search/index`

With these changes in place the two regressions should be fixed again.
dmsnell added a commit that referenced this pull request Mar 17, 2020
In #1941 we introduced two regressions and this patch fixes them.

 - search needs to remember the opened tag and limit results to those
   matching within that tag. this was overlooked when updating the
   search query and updating the `filterTags` _from_ that search query

 - when passing the `filterNotes` action back to the application we
   have to make sure to pass the previous index. this was lost during
   the merge of the PR because because the additions from #1919, which
   fixed the previous regression for it, were deleted as part of
   moving `ui/middleware` into `search/index`

With these changes in place the two regressions should be fixed again.
dmsnell added a commit that referenced this pull request Mar 26, 2020
In #1941 we introduced instant search results.
In #1919 and in #1966 we refactored the `previousIndex` into Redux

The result was that we weren't resetting the `previousIndex` properly
and so on every search we would select _some_ note, often the first
one in the list. If this note were large or slow to render then the
performance gains we achieved with search were destroyed by the cost
of rendering the notes.

In this patch we're properly resetting the `previousNote` whenever
we actually filter the notes. This means that searches will clear
the selected note if it's not in the search results. This also brings
back the performance gains we got by refactoring search itself.
dmsnell added a commit that referenced this pull request Mar 30, 2020
In #1941 we introduced instant search results.
In #1919 and in #1966 we refactored the `previousIndex` into Redux

The result was that we weren't resetting the `previousIndex` properly
and so on every search we would select _some_ note, often the first
one in the list. If this note were large or slow to render then the
performance gains we achieved with search were destroyed by the cost
of rendering the notes.

In this patch we're properly resetting the `previousNote` whenever
we actually filter the notes. This means that searches will clear
the selected note if it's not in the search results. This also brings
back the performance gains we got by refactoring search itself.
dmsnell added a commit that referenced this pull request Apr 2, 2020
Fixes #1941
Supplants #1966
Supplants #1979

It turns out that having search inside a WebWorker isn't as necessary
as we thought and at the same time presents one major obstacle: search
must be asynchronous. This causes a few problems with the app is
currently designed, namely that the interaction between `previousIndex`,
trash/delete/restore operations, and the note list.

The primary goal of the larger state-refactor project has been to
eliminate non-atomic state updates and the WebWorker's asyncronous
mandate means that there is no way to synchronously update state, which
means that there's no way to run the trashing actions at the same time
that we update the search filter. This leaves an awkward rendered state
where the note in the note list is trashed and the toolbar above it
shows the trash "Delete forever" and "Restore" buttons but we're still
looking at the note list.

In order to resolve these bugs and eliminate further issues I have
brought the search back into the main thread. Why? Won't this destroy
all that we gained in terms of performance?" you might ask. No, actually
most of the performance gain came from changes I made to the search
mechanism _while_ moving it into a WebWorker. I had wanted it to be in
a WebWorker because it was slow, but now that we can see that it's
very very fast and shouldn't be a UI blocker we don't have the same
motivation to get it out of the main thread.

In summary, this patch moves the search back into the main thread and
exposes the `updateFilter()` function so that it can be called synchronously
for actions which demand immediate UI updates.
dmsnell added a commit that referenced this pull request Apr 3, 2020
Fixes #1941
Supplants #1966
Supplants #1979

It turns out that having search inside a WebWorker isn't as necessary
as we thought and at the same time presents one major obstacle: search
must be asynchronous. This causes a few problems with the app is
currently designed, namely that the interaction between `previousIndex`,
trash/delete/restore operations, and the note list.

The primary goal of the larger state-refactor project has been to
eliminate non-atomic state updates and the WebWorker's asyncronous
mandate means that there is no way to synchronously update state, which
means that there's no way to run the trashing actions at the same time
that we update the search filter. This leaves an awkward rendered state
where the note in the note list is trashed and the toolbar above it
shows the trash "Delete forever" and "Restore" buttons but we're still
looking at the note list.

In order to resolve these bugs and eliminate further issues I have
brought the search back into the main thread. Why? Won't this destroy
all that we gained in terms of performance?" you might ask. No, actually
most of the performance gain came from changes I made to the search
mechanism _while_ moving it into a WebWorker. I had wanted it to be in
a WebWorker because it was slow, but now that we can see that it's
very very fast and shouldn't be a UI blocker we don't have the same
motivation to get it out of the main thread.

In summary, this patch moves the search back into the main thread and
exposes the `updateFilter()` function so that it can be called synchronously
for actions which demand immediate UI updates.
dmsnell added a commit that referenced this pull request May 11, 2020
See #1941
See #1982

When we rebuilt the search to return instant results and removed the debounce
on the search filter we exposed an issue with note rendering performance that
ironically made the new instant search slower than the old one in certain
circumstances, namely when a note in the search results takes a very long time
to render.

The leading reason for the performance issue is that `draft-js` was applying
a new decorator to its note on every change to the search field. With the
search field updating instantly that left no time for the decorators to redraw
(and they were very inefficient to make it worse).

In this patch we're delaying the re-decoration until the search field settles.
This doesn't eliminate the problem but it should bring it roughly on par with
the behavior from before the search updates. Further we have eliminated the
`MultiDecorator` dependency since that functionality is provided by `draft-js`
itself. Since there's no need to create a composite decorator when the search
query doesn't contain any text terms we can futher skip it and only decorate
with the checkbox decorator.

These changes should make searching tolerant to slow notes and should
additionally cut the time it takes to decorate notes approximately in half.
belcherj pushed a commit that referenced this pull request May 12, 2020
See #1941
See #1982

When we rebuilt the search to return instant results and removed the debounce
on the search filter we exposed an issue with note rendering performance that
ironically made the new instant search slower than the old one in certain
circumstances, namely when a note in the search results takes a very long time
to render.

The leading reason for the performance issue is that `draft-js` was applying
a new decorator to its note on every change to the search field. With the
search field updating instantly that left no time for the decorators to redraw
(and they were very inefficient to make it worse).

In this patch we're delaying the re-decoration until the search field settles.
This doesn't eliminate the problem but it should bring it roughly on par with
the behavior from before the search updates. Further we have eliminated the
`MultiDecorator` dependency since that functionality is provided by `draft-js`
itself. Since there's no need to create a composite decorator when the search
query doesn't contain any text terms we can futher skip it and only decorate
with the checkbox decorator.

These changes should make searching tolerant to slow notes and should
additionally cut the time it takes to decorate notes approximately in half.
belcherj pushed a commit that referenced this pull request May 12, 2020
* Fix: Defer re-decorating note when changing search

See #1941
See #1982

When we rebuilt the search to return instant results and removed the debounce
on the search filter we exposed an issue with note rendering performance that
ironically made the new instant search slower than the old one in certain
circumstances, namely when a note in the search results takes a very long time
to render.

The leading reason for the performance issue is that `draft-js` was applying
a new decorator to its note on every change to the search field. With the
search field updating instantly that left no time for the decorators to redraw
(and they were very inefficient to make it worse).

In this patch we're delaying the re-decoration until the search field settles.
This doesn't eliminate the problem but it should bring it roughly on par with
the behavior from before the search updates. Further we have eliminated the
`MultiDecorator` dependency since that functionality is provided by `draft-js`
itself. Since there's no need to create a composite decorator when the search
query doesn't contain any text terms we can futher skip it and only decorate
with the checkbox decorator.

These changes should make searching tolerant to slow notes and should
additionally cut the time it takes to decorate notes approximately in half.

* Don't remove multidecorator - leave that for later

* Also don't update decorators when switching notes
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants