Improvements to `Onyx.get` #2762

kidroca · 2021-05-10T18:42:31Z

If you haven’t already, check out our contributing guidelines for onboarding and email contributors@expensify.com to request to join our Slack channel!

Problem:

Each time a component “connects” to Onyx we read that data from disk before returning it to withOnyx.
When multiple components ask for the same key at the same time (e.g. during init or when switching chats), the key is requested multiple times from storage.
This causes a lot of unnecessary traffic through the native bridge and adds unnecessary CPU / memory usage.
Confirmed with the following manual benchmarks: Improvements for Onyx connect react-native-onyx#63 (comment)

Solution:

This was widely discussed in the following places and it was decided to go with the cache solution:

Improvements for Onyx: Improvements for Onyx connect react-native-onyx#63
Benchmark Onyx: Benchmark Onyx react-native-onyx#65
Latest slack thread: https://expensify.slack.com/archives/C01GTK53T8Q/p1620065578311900

Prevent unnecessary reads from disk

When a request is made to retrieve something from disk, capture the task that would resolve with the read data. As more calls for the same key arrive, instead of making a separate call for the same thing, redirect the reads to be resolved from the already pending task. This way only one round trip happens for a given key while a read is pending. As soon as the data for the first call is available, all calls are resolved at once.
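
A minimal TypeScript sketch of the pending-task idea described above. The names `createDedupedGet` and `StorageRead` are hypothetical and not taken from the Onyx source; `readFromStorage` stands in for whatever function actually reads from disk.

```typescript
// Hypothetical stand-in for the function that actually reads from disk.
type StorageRead = (key: string) => Promise<string | null>;

// Wraps a storage read so that concurrent requests for one key share a task.
function createDedupedGet(readFromStorage: StorageRead): StorageRead {
    // Reads currently in flight, keyed by storage key.
    const pendingReads = new Map<string, Promise<string | null>>();

    return (key: string): Promise<string | null> => {
        const pending = pendingReads.get(key);
        if (pending !== undefined) {
            // A read for this key is already on its way: reuse it.
            return pending;
        }

        // First request for this key: start the read and remember the task
        // until it settles, so that later callers can attach to it.
        const task = readFromStorage(key).then(
            (value) => {
                pendingReads.delete(key);
                return value;
            },
            (error: unknown) => {
                pendingReads.delete(key);
                throw error;
            },
        );
        pendingReads.set(key, task);
        return task;
    };
}
```

Two calls for the same key made before the first read settles receive the same promise, so the underlying read runs once. Once the task settles, the entry is dropped, and avoiding later reads is left to the cache described next.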

Cache and lazy loading

As data is read from file, keep a reference/pointer to the data in memory in a cache dictionary
Fill the cache lazily - only after a key was requested and read from file
Remove cache entries after the last connection for the given key is disconnected
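
The three points above could be sketched roughly as follows. `KeyCache` and its methods are hypothetical names for illustration only, and the sketch assumes the caller stores a value with `set` only after it has been read from storage.

```typescript
// Hypothetical in-memory cache that tracks how many connections use each key.
class KeyCache {
    private values = new Map<string, unknown>();

    private connectionCounts = new Map<string, number>();

    // Register a new connection (subscriber) for a key.
    connect(key: string): void {
        this.connectionCounts.set(key, (this.connectionCounts.get(key) || 0) + 1);
    }

    // Drop one connection; when the last one goes, evict the cached value.
    disconnect(key: string): void {
        const remaining = (this.connectionCounts.get(key) || 0) - 1;
        if (remaining > 0) {
            this.connectionCounts.set(key, remaining);
            return;
        }
        this.connectionCounts.delete(key);
        this.values.delete(key);
    }

    has(key: string): boolean {
        return this.values.has(key);
    }

    // Returns the stored reference itself, not a copy.
    get(key: string): unknown {
        return this.values.get(key);
    }

    // Lazy fill: called only after a key was requested and read from storage.
    set(key: string, value: unknown): void {
        this.values.set(key, value);
    }
}
```

Nothing is cached up front: a key appears in the map only after something asked for it, and it disappears again when the last connection for it disconnects.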

Additional Work

✔️ Should we implement any benchmarks / metering?

The general principle I like to apply for a benchmark is method decoration, like the sample here: Expensify/react-native-onyx#65 (comment)

Create a metrics capturing function that tracks call and response information
- call start/end time
- call execution time
- arguments for the call
- tag returned objects so we can see how many of them are still in memory (or just tag the cache map, since it should hold pretty much the same information, though this is only possible after the update that adds the cache...)
Use an ENV variable (let's say ONYX_BENCHMARK) to apply the metering function through decoration to a list of Onyx methods
- in order to work, the methods in the list need to return a promise or have a callback that is triggered when the call is over
- e.g. the base methods like get, set, merge would pretty much work out of the box
Use the same ENV var from above to expose readCollectMetrics and resetMetrics as Onyx methods
- you will be able to further aggregate data, e.g. how many calls there were for get with a specific key
- this can also include information about how much data is currently in cache, after the update is made
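
As a rough illustration of the decoration approach outlined above, here is a minimal TypeScript sketch. It is not the code from the linked PR: `decorateWithMetrics`, `CallMetric` and `summarize` are hypothetical names, only promise-returning methods are handled, and the ONYX_BENCHMARK gating is left out. `resetMetrics` borrows the name from the plan above.

```typescript
// One record per call, matching the fields listed above.
interface CallMetric {
    methodName: string;
    args: unknown[];
    startTime: number;
    endTime: number;
    duration: number;
}

interface MethodSummary {
    calls: number;
    total: number;
    max: number;
    min: number;
    avg: number;
}

const collectedMetrics: CallMetric[] = [];

// Wraps a promise-returning function and records timing for every call,
// without touching the original function's source.
function decorateWithMetrics<A extends unknown[], R>(
    fn: (...args: A) => Promise<R>,
    methodName: string,
): (...args: A) => Promise<R> {
    return (...args: A): Promise<R> => {
        const startTime = Date.now();
        const record = (): void => {
            const endTime = Date.now();
            collectedMetrics.push({methodName, args, startTime, endTime, duration: endTime - startTime});
        };
        return fn(...args).then(
            (result) => {
                record();
                return result;
            },
            (error: unknown) => {
                record();
                throw error;
            },
        );
    };
}

function resetMetrics(): void {
    collectedMetrics.length = 0;
}

// Aggregates raw records into per-method totals (durations in milliseconds).
function summarize(metrics: CallMetric[]): Map<string, MethodSummary> {
    const summary = new Map<string, MethodSummary>();
    metrics.forEach((metric) => {
        const d = metric.duration;
        const existing = summary.get(metric.methodName);
        if (existing === undefined) {
            summary.set(metric.methodName, {calls: 1, total: d, max: d, min: d, avg: d});
            return;
        }
        existing.calls += 1;
        existing.total += d;
        existing.max = Math.max(existing.max, d);
        existing.min = Math.min(existing.min, d);
        existing.avg = existing.total / existing.calls;
    });
    return summary;
}
```

A caller would wrap a method once, for example `const timedGet = decorateWithMetrics(get, 'Onyx:get')` with a hypothetical `get`, and later pass `collectedMetrics` to `summarize` to print or export per-method totals.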

Expected Result:

Onyx does not try to retrieve data that is already available in memory

Actual Result:

Onyx will always ask data from AsyncStorage

Action Performed:

N/A

Workaround:

Can the user still use Expensify without this being fixed? Have you informed them of the workaround?

Affected Platforms:

-[x] Web
-[x] iOS
-[x] Android
-[x] Desktop App
-[x] Mobile Web

Version Number:
Logs: https://stackoverflow.com/c/expensify/questions/4856
Expensify/Expensify Issue URL:

View all open jobs on Upwork


MelvinBot · 2021-05-10T19:09:51Z

Triggered auto assignment to @mallenexpensify (AutoAssignerTriage), see https://stackoverflow.com/c/expensify/questions/4749 for more details.

marcaaron · 2021-05-10T21:19:41Z

Should we implement any benchmarks / metering?
Benchmarks can help track future degradations/improvements in Onyx

I think the answer to this is that, yes, we should.

Can you briefly include the plan we will be moving forward with here for benchmarking? I think the plan is to create a single job to cover both so if we can add a bit more context in this issue that would be ideal.

I think we should start with the benchmarks in one PR then move to the Onyx improvements. Lmk if that works for you @kidroca .

mallenexpensify · 2021-05-10T21:21:00Z

Thanks @kidroca, I created the Upwork job, set the price at $2,000, made it private and invited you. Please continue all discussions here, @marcaaron will be assigned for feedback and PR review
https://www.upwork.com/jobs/~016ba245423bf6bb85

kidroca · 2021-05-11T14:06:20Z

Can you briefly include the plan we will be moving forward with here for benchmarking? I think the plan is to create a single job to cover both so if we can add a bit more context in this issue that would be ideal.

I've added it to the "Additional Work" section. Here is a bit more on that:

Through decoration, the original source doesn't have to be modified and the benchmarking logic is maintained separately
In order to achieve this, some minor tweaks might be necessary. For example, Onyx.get is a private method
Private methods might be moved to a separate file (only used internally) so that they can be decorated using the above pattern before they are imported. Another approach is to add the methods to an object or a class so that they can be swapped on the instance.
This would be easier to explain/discuss on a Draft PR, so I'll open one and we can finalize the idea there

These modifications would allow the user of Onyx to display these metrics however they like

print a summary on the console
save/export a json object
send the data to a remote

One simple example would be:

if ONYX_BENCHMARK is true
include code in E.cash that will
print/save/export or send collectedMetrics each time the navigation route changes

Since ONYX_BENCHMARK is an ENV variable that is available right from build time, when it evaluates to false none of the benchmarking code will be bundled (tree shaking / dead code elimination)

I think we should start with the benchmarks in one PR then move to the Onyx improvements. Lmk if that works for you @kidroca .

Yes, I'll open a draft PR and start with that

mallenexpensify · 2021-05-11T15:30:59Z

Thanks @kidroca , I hired you in Upwork, I think you still need to accept. Assigning this issue to you now.

MelvinBot · 2021-05-11T15:31:39Z

Triggered auto assignment to @AndrewGable (Exported), see https://stackoverflow.com/c/expensify/questions/7972 for more details.

kidroca · 2021-05-25T22:47:15Z

Prepared a little teaser with some before / after benchmark data

Note that total time is the sum of all individual calls, but many happen in parallel. The total time is more of an indicator of CPU usage

Device Info

Physical Samsung Galaxy S7 edge
Android 8.0

Before

Scenario: Startup and loading the last viewed chat (with small chat)

Method	Total	Max	Min	Average	Calls
Onyx:get	9.1min	3.31sec	38.410ms	2.03sec	270
Onyx:getAllKeys	10.1min	3.29sec	57.505ms	2.00sec	304
Onyx:set	45.26sec	2.32sec	785.505ms	1.68sec	27
Onyx:merge	1.2min	5.45sec	0.010ms	2.23sec	32
Onyx:mergeCollection	10.67sec	3.94sec	3.28sec	3.56sec	3

Android.Before.mp4

After

Scenario: Startup and loading the last viewed chat (with small chat)

Method	Total	Max	Min	Average	Calls
Onyx:get	30.54sec	2.93sec	0.005ms	126.727ms	241
Onyx:getAllKeys	3.95sec	2.85sec	0.010ms	15.540ms	254
Onyx:set	23.63sec	1.42sec	458.880ms	945.169ms	25
Onyx:merge	32.31sec	3.41sec	11.255ms	1.35sec	24
Onyx:mergeCollection	3.69sec	1.42sec	1.04sec	1.23sec	3

Android.After.mp4

cc @quinthar @marcaaron @tgolen

quinthar · 2021-05-25T22:50:10Z

Woohoo!!


tgolen · 2021-05-25T22:57:40Z

Wow, very awesome. In the "Before" table, the mergeCollection row, there is a value of "10.67". What unit of time is that?


kidroca · 2021-05-25T23:40:50Z

@tgolen
It's seconds, updated the table

tgolen · 2021-05-26T16:28:24Z

These numbers look fantastic. I think the hardest thing for me to wrap my mind around is the average Onyx:get call takes 2.03sec currently? Is that really true?

kidroca · 2021-05-26T16:32:17Z

Not on iOS
This is on my physical Android device. It's a bit old but still...
Also note that a lot of reads happen in parallel, so you won't be waiting 2sec for each prop change

marcaaron · 2021-05-26T16:38:45Z

the average Onyx:get call takes 2.03sec currently

I could see how something like retrieving and parsing a very large chat could take this long. Most other things I'd expect to be very fast.

tgolen · 2021-05-26T16:52:50Z

I'd be interested to see the before/after for iOS too. I assume the numbers won't be drastic, but it would be nice to confirm that.

kidroca · 2021-05-27T21:13:29Z

Setup Info

iPhone 11 Pro
iOS: 14.4.2

Scenario: Startup and load last viewed chat (large chat 250 + messages)

Before

methodName	total	max	min	avg	calls
Onyx:get	1.8min	606.960ms	5.700ms	281.176ms	383
Onyx:getAllKeys	3.1min	593.185ms	4.460ms	363.627ms	518
Onyx:set	4.64sec	606.755ms	8.820ms	160.123ms	29
Onyx:merge	7.45sec	1.05sec	0.015ms	257.053ms	29
Onyx:mergeCollection	738.845ms	304.540ms	152.520ms	246.282ms	3

After

methodName	total	max	min	avg	calls
Onyx:get	8.59sec	472.830ms	0.010ms	24.205ms	355
Onyx:getAllKeys	2.69sec	73.615ms	0.015ms	5.692ms	473
Onyx:set	3.65sec	481.905ms	11.890ms	152.048ms	24
Onyx:merge	2.83sec	500.155ms	0.020ms	134.578ms	21
Onyx:mergeCollection	194.480ms	94.590ms	49.655ms	64.827ms	3

Note: I've run the "After" tests first, so any native cache benefits are in favor of the "Before" tests

cc @quinthar @marcaaron @tgolen

tgolen · 2021-05-27T21:29:31Z

Thanks, that looks about like what I was expecting. This change is definitely going to get the averages down, but then we might want to start looking into some of those max values and see if we can do something about improving those specifically.

kidroca · 2021-05-27T23:18:25Z

What caught my eye is that a lot of the get calls are for session and personalData, e.g. each chat comment seems to want them, but does it really have to get them from Onyx? The data would be the same for each comment. It would be more performant to have the parent component subscribe to session and personalData and pass them down to the comments: 1 subscription vs 200+ individual subscriptions, which means 200+ withOnyx instances and the associated orchestration and memory.

tgolen · 2021-05-28T18:00:45Z

Nice observation. We ran into a similar problem with the `session.authToken` because each IMG needs the authToken as part of its URL to display properly. While the parent component idea would work, another alternative is what we did with the authToken. We just made the Sessions.js file connect to Onyx once and then export a `getAuthToken()` method that returned the single Onyx value. Now, it looks like a lot of this implementation has been removed (for whatever reason), but the idea is still sound. I think I like the parent component idea better because there is no race condition possibility (like if something called getAuthToken before the authToken was set in Onyx).
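
For illustration, the "connect once, export a getter" alternative could look roughly like the sketch below. This is not the removed Sessions.js implementation: `createSessionReader`, `Session` and `ConnectFn` are hypothetical, and `connect` is a stand-in with an assumed `{key, callback}` shape rather than a confirmed Onyx signature.

```typescript
// Hypothetical shapes; the real session object and connect API may differ.
interface Session {
    authToken?: string;
}

type ConnectFn = (options: {key: string; callback: (value: Session | null) => void}) => void;

// Connects to the session key once and exposes the latest token via a getter.
function createSessionReader(connect: ConnectFn): {getAuthToken: () => string | undefined} {
    let authToken: string | undefined;

    connect({
        key: 'session',
        callback: (session) => {
            authToken = session ? session.authToken : undefined;
        },
    });

    return {
        // Returns undefined until the first callback fires, which is the
        // race condition mentioned above.
        getAuthToken: () => authToken,
    };
}
```

Every consumer calls `getAuthToken()` instead of opening its own subscription, so there is a single connection for the key. The cost is that a caller running before the first callback sees `undefined`.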


kidroca · 2021-05-28T22:04:42Z

I want to bring something up here so that it's clear
It would not be possible to always provide a direct reference to an item in cache to Onyx subscribers
For example subscriptions to collections are shallow cloning data anyway and the direct reference is lost

This does not mean that there is a problem. It's not like we'll be making copies; we're not
This actually prevents someone from modifying the cache by accident

A high level of how retrieving data from Onyx and cache works:

someone subscribes for a value (not available in cache)
- the value is first read from storage
- added to cache
- and provided to the subscriber
another subscriber asks for the same key
- the value is returned from cache
at some point the value changes
- the new value is added to cache replacing the old value
- the new value is sent to subscribers
- the old value is no longer referenced and is removed during garbage collection
a subscriber disconnects from this Onyx key
- if this is the last subscriber for the key we remove data from cache

I hope you can see that even if direct reference is lost at some point:

Data is not duplicated - shallow cloned (already works that way before the changes from this ticket)
For an overwritten key in cache - the old value gets GCed
We still remove the value manually when all connections to the key disconnect - the removed value gets GCed
when a subscriber is done with the data - his reference is GCed as well

Without cache: each read from disk retrieves a string value. A new object is then created from the string
With cache: we don't have to retrieve and parse the string again, we already have an object in memory - it might get shallow cloned or not but it's still disposable when we're done with it

cc @quinthar @marcaaron @tgolen

Edit: I'm just saying the above to address memory usage concerns - I still expect memory usage to decrease
From my memory snapshots Onyx data never grew above 1MB and I didn't even have the cache cleanup logic implemented at that point

kidroca · 2021-06-01T18:38:06Z

This PR is ready for a final review: Expensify/react-native-onyx#76

I'm not sure who to tag as a reviewer as per @marcaaron

@kidroca Just a heads up I'm going out of office for a bit. If you would prefer to move forward without my review feel free to tag in a random reviewer with PullerBear and request for another reviewer in the Slack channel. Thanks!

cc @quinthar @marcaaron @tgolen

tgolen · 2021-06-01T22:01:51Z

I've been really heads-down in some highly urgent stuff, plus taking over some of Marc's tasks while he is out, so I won't be able to review this PR at all (I've only been lightly following along). I think some good reviewers would be @AndrewGable @Jag96 and @Julesssss

kidroca · 2021-06-18T07:44:54Z

Can we post any updates regarding payment

When should I expect to receive payment for this task?

cc @mallenexpensify (Tagging You as I see your name in the Upwork job)

mallenexpensify · 2021-06-18T23:47:41Z

Thanks for the ping @kidroca it was my responsibility and I missed it (we previously didn't leave Contributor Managers assigned to issues, we now do).
I paid in Upwork and added the bonus for writing the OP. Thanks!

kidroca changed the title ~~Onyx~~ Improvements to Onyx.get May 10, 2021

arielgreen added the AutoAssignerTriage Auto assign issues for triage to an available triage team member label May 10, 2021

MelvinBot assigned mallenexpensify May 10, 2021

MelvinBot removed the AutoAssignerTriage Auto assign issues for triage to an available triage team member label May 10, 2021

arielgreen added the Daily KSv2 label May 10, 2021

mallenexpensify assigned kidroca and unassigned mallenexpensify May 11, 2021

mallenexpensify added Exported and removed Daily KSv2 labels May 11, 2021

MelvinBot assigned AndrewGable May 11, 2021

mallenexpensify unassigned AndrewGable May 11, 2021

kidroca mentioned this issue May 13, 2021

Feature: Benchmark Onyx Expensify/react-native-onyx#70

Merged

arielgreen mentioned this issue May 25, 2021

[Hold for payment 2021-08-17] [Performance] Optimize AsyncStorage on Android #2667

Closed

kidroca mentioned this issue May 25, 2021

Feature: Onyx Cache Expensify/react-native-onyx#76

Merged

marcaaron added the Reviewing Has a PR in review label May 26, 2021

marcaaron mentioned this issue May 26, 2021

Improve Onyx.merge performance by informing listeners that a key has changed before persisting it in local storage #2397

Closed

parasharrajat mentioned this issue Jun 1, 2021

iOS - App is extremely slow to load conversations #3167

Closed

This was referenced Jun 3, 2021

More improvements for Onyx.connect (After cache) Expensify/react-native-onyx#78

Closed

Update react-native-onyx with the latest version #3423

Merged

Kidroca/onyx cache cleanup Expensify/react-native-onyx#79

Merged

Jag96 closed this as completed in #3423 Jun 8, 2021
