Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improvements to Onyx.get #2762

Closed
kidroca opened this issue May 10, 2021 · 23 comments · Fixed by #3423
Closed

Improvements to Onyx.get #2762

kidroca opened this issue May 10, 2021 · 23 comments · Fixed by #3423
Assignees
Labels
Reviewing Has a PR in review

Comments

@kidroca
Copy link
Contributor

kidroca commented May 10, 2021

If you haven’t already, check out our contributing guidelines for onboarding and email contributors@expensify.com to request to join our Slack channel!


Problem:

  • Each time a component “connects” to Onyx we read that data from disk before returning it to withOnyx.
  • When multiple components ask for the same key at the same time (e.g. during init | switching chats) the key is requested multiple times from storage
  • This causes a lot of unnecessary traffic through the native bridge and adds unnecessary CPU / memory usage
  • Confirmed with the following manual benchmarks: Improvements for Onyx connect react-native-onyx#63 (comment)

Solution:

This was widely discussed in the following places and it was decided to go with the cache solution:

Prevent unnecessary reads from disk

When a request is made to retrieve something from disk capture the task that would resolve with the read data, as more calls for the same key arrive, instead of making a separate call for the same thing redirect the reads to be resolved from the already pending task. This way only one round trip would ever happen for a given key. As soon as the data for the first call is available - all calls are resolved at once.

Cache and lazy loading

As data is read from file, keep a reference/pointer to the data in memory in a cache dictionary
Fill the cache lazily - only after a key was requested and read from file
Remove cache entries after the last connection for the given key is disconnected


Additional Work

✔️ Should we implement any benchmarks / metering?

The general principle I like to apply for a benchmark is through method decoration like the same here: Expensify/react-native-onyx#65 (comment)

  1. Create a metrics capturing function that tracks call and response information
    • call start/end time
    • call execution time
    • arguments for the call
    • tag returned objects so we can see how much of them are still in memory (Or just tag the cache map since it should pretty much hold the same information - thought this is only possible after the update that adds cache...)
  2. Use a ENV variable (let's say ONYX_BENCHMARK) to apply the metring function through decoration to a list of Onyx methods
    • in order to work the methods in the list need to return a promise or have a callback that is triggered when the call is over
    • e.g. the base methods like get, set, merge would pretty much work out of the box
  3. Use the same ENV var from above to expose a readCollectMetrics and resetMetrics as an Onyx method
    • you will be able to further aggregate data like: how much calls were there for get with the a specific key
    • this can also include information about how much data is currently in cache, after the update is made

Expected Result:

Onyx does not try to retrieve data that is already available in memory

Actual Result:

Onyx will always ask data from AsyncStorage

Action Performed:

N/A

Workaround:

Can the user still use Expensify without this being fixed? Have you informed them of the workaround?

Affected Platforms:

-[x] Web
-[x] iOS
-[x] Android
-[x] Desktop App
-[x] Mobile Web

Version Number:
Logs: https://stackoverflow.com/c/expensify/questions/4856
Expensify/Expensify Issue URL:

View all open jobs on Upwork

@kidroca kidroca changed the title Onyx Improvements to Onyx.get May 10, 2021
@arielgreen arielgreen added the AutoAssignerTriage Auto assign issues for triage to an available triage team member label May 10, 2021
@MelvinBot
Copy link

Triggered auto assignment to @mallenexpensify (AutoAssignerTriage), see https://stackoverflow.com/c/expensify/questions/4749 for more details.

@MelvinBot MelvinBot removed the AutoAssignerTriage Auto assign issues for triage to an available triage team member label May 10, 2021
@arielgreen arielgreen added the Daily KSv2 label May 10, 2021
@marcaaron
Copy link
Contributor

Should we implement any benchmarks / metering?
Benchmarks can help track future degradations/improvements in Onyx

I think the answer to this is that, yes, we should.

Can you briefly include the plan we will be moving forward with here for benchmarking? I think the plan is to create a single job to cover both so if we can add a bit more context in this issue that would be ideal.

I think we should start with the benchmarks in one PR then move to the Onyx improvements. Lmk if that works for you @kidroca .

@mallenexpensify
Copy link
Contributor

mallenexpensify commented May 10, 2021

Thanks @kidroca, I created the Upwork job, set the price at $2,000, made it private and invited you. Please continue all discussions here, @marcaaron will be assigned for feedback and PR review
https://www.upwork.com/jobs/~016ba245423bf6bb85

@kidroca
Copy link
Contributor Author

kidroca commented May 11, 2021

Can you briefly include the plan we will be moving forward with here for benchmarking? I think the plan is to create a single job to cover both so if we can add a bit more context in this issue that would be ideal.

I've included it to the "Additional Work" section. Here is a bit more on that:

Through decoration original source don't have to be modified and the benchmarking logic is maintained separately
In order to achieve this some minor tweaks might be necessary. For example Onyx.get is a private method
Private methods might be moved to separate file (only used internally) so that it can be decorated by the above pattern before it is imported, another approach is to add methods to an object or a class so that they can be swapped on the instance.
This would be easier to explain/discuss on a Draft PR, so I'll open one and we can finalize the idea there

These modifications would allow the user of Onyx to display these metrics however they like

  • print a summary on the console
  • save/export a json object
  • send the data to a remote

One simple example would be:

  • if ONYX_BENCHMARK is true
  • include a code in E.cash that will
  • print/save/export or send collectedMetrics each time when navigation route changes

since ONYX_BENCHMARK is an ENV variable that is available right from build time, when it evaluates to false none of the benchmarking code will be bundled (tree shaking / dead code elimination)

I think we should start with the benchmarks in one PR then move to the Onyx improvements. Lmk if that works for you @kidroca .

Yes, I'll open a draft PR and start with that

@mallenexpensify
Copy link
Contributor

Thanks @kidroca , I hired you in Upwork, I think you still need to accept. Assigning this issue to you now.

@MelvinBot
Copy link

Triggered auto assignment to @AndrewGable (Exported), see https://stackoverflow.com/c/expensify/questions/7972 for more details.

@kidroca
Copy link
Contributor Author

kidroca commented May 25, 2021

Prepared a little teaser with some before / after benchmark data

Note that total time is the sum of all individual calls, but many happen in parallel. The total time is more of an indicator of CPU usage

Device Info

Physical Samsung Galaxy S7 edge
Android 8.0

Before

Scenario: Startup and loading the last viewed chat (with small chat)

Method Total Max Min Average Calls
Onyx:get 9.1min 3.31sec 38.410ms 2.03sec 270
Onyx:getAllKeys 10.1min 3.29sec 57.505ms 2.00sec 304
Onyx:set 45.26sec 2.32sec 785.505ms 1.68sec 27
Onyx:merge 1.2min 5.45sec 0.010ms 2.23sec 32
Onyx:mergeCollection 10.67sec 3.94sec 3.28sec 3.56sec 3
Android.Before.mp4

After

Scenario: Startup and loading the last viewed chat (with small chat)

Method Total Max Min Average Calls
Onyx:get 30.54sec 2.93sec 0.005ms 126.727ms 241
Onyx:getAllKeys 3.95sec 2.85sec 0.010ms 15.540ms 254
Onyx:set 23.63sec 1.42sec 458.880ms 945.169ms 25
Onyx:merge 32.31sec 3.41sec 11.255ms 1.35sec 24
Onyx:mergeCollection 3.69sec 1.42sec 1.04sec 1.23sec 3
Android.After.mp4

cc @quinthar @marcaaron @tgolen

@quinthar
Copy link
Contributor

quinthar commented May 25, 2021 via email

@tgolen
Copy link
Contributor

tgolen commented May 25, 2021 via email

@kidroca
Copy link
Contributor Author

kidroca commented May 25, 2021

@tgolen
It's seconds, updated the table

@tgolen
Copy link
Contributor

tgolen commented May 26, 2021

These numbers look fantastic. I think the hardest thing for me to wrap my mind around is the average Onyx:get call takes 2.03sec currently? Is that really true?

@kidroca
Copy link
Contributor Author

kidroca commented May 26, 2021

Not on ios
This on my physical android device. It's a bit old but still...
Also note that a lot of reads happen in parallel so you won't be waiting 2sec per each prop change

@marcaaron
Copy link
Contributor

the average Onyx:get call takes 2.03sec currently

I could see how something like retrieving and parsing a very large chat could take this long. Most other things I'd expect to be very fast.

@tgolen
Copy link
Contributor

tgolen commented May 26, 2021

I'd be interested to see the before/after for iOS too. I assume the numbers won't be drastic, but it would be nice to confirm that.

@kidroca
Copy link
Contributor Author

kidroca commented May 27, 2021

Setup Info

iPhone 11 Pro
iOS: 14.4.2

Scenario: Startup and load last viewed chat (large chat 250 + messages)

Before

methodName total max min avg calls
Onyx:get 1.8min 606.960ms 5.700ms 281.176ms 383
Onyx:getAllKeys 3.1min 593.185ms 4.460ms 363.627ms 518
Onyx:set 4.64sec 606.755ms 8.820ms 160.123ms 29
Onyx:merge 7.45sec 1.05sec 0.015ms 257.053ms 29
Onyx:mergeCollection 738.845ms 304.540ms 152.520ms 246.282ms 3

After

methodName total max min avg calls
Onyx:get 8.59sec 472.830ms 0.010ms 24.205ms 355
Onyx:getAllKeys 2.69sec 73.615ms 0.015ms 5.692ms 473
Onyx:set 3.65sec 481.905ms 11.890ms 152.048ms 24
Onyx:merge 2.83sec 500.155ms 0.020ms 134.578ms 21
Onyx:mergeCollection 194.480ms 94.590ms 49.655ms 64.827ms 3

Note: I've run the "After" tests first so any native cache benefits are in favor for the "Before" tests

cc @quinthar @marcaaron @tgolen

@tgolen
Copy link
Contributor

tgolen commented May 27, 2021

Thanks, that looks about like what I was expecting. This change is definitely going to get the averages down, but then we might want to start looking into some of those max values and see if we can do something about improving those specifically.

@kidroca
Copy link
Contributor Author

kidroca commented May 27, 2021

What caught my eye is that a lot of the get calls are for session and personalData e.g. each chat comment seems to want them, but does it really have to get it from Onyx. The data would be the same for each comment. It would be more performant to have the parent component subscribe for session and personalData and pass this down to the comments - 1 subscription vs +200 individual subscriptions -> +200 withOnyx instances and the involved orchestration and memory

@tgolen
Copy link
Contributor

tgolen commented May 28, 2021 via email

@kidroca
Copy link
Contributor Author

kidroca commented May 28, 2021

I want to bring something up here so that it's clear
It would not be possible to always provide a direct reference to an item in cache to Onyx subscribers
For example subscriptions to collections are shallow cloning data anyway and the direct reference is lost

This is does not mean that there is a problem, it's not like we'll be making copies - we're not
This actually prevents someone to modify cache by accident

A high level of how retrieving data from Onyx and cache works:

  • someone subscribes for a value (not available in cache)
    • the value is first read from storage
    • added to cache
    • and provided to the subscriber
  • another subscriber asks for the same key
    • the value is returned from cache
  • at some point the value changes
    • the new value is added to cache replacing the old value
    • the new value is send to subscribers
    • the old value is no longer referenced and is removed during garbage collection
  • a subscriber disconnects from this Onyx key
    • if this is the last subscriber for the key we remove data from cache

I hope you can see that even if direct reference is lost at some point:

  1. Data is not duplicated - shallow cloned (already works that way before the changes from this ticket)
  2. For an overwritten key in cache - the old value gets GCed
  3. We still remove the value manually when all connections to the key disconnect - the removed value gets GCed
  4. when a subscriber is done with the data - his reference is GCed as well

Without cache: each read from disk retrieves a string value. A new object is then created from the string
With cache: we don't have to retrieve and parse the string again, we already have an object in memory - it might get shallow cloned or not but it's still disposable when we're done with it

cc @quinthar @marcaaron @tgolen

Edit: I'm just saying the above to address memory usage concerns - I still expect memory usage to decrease
From my memory snapshots Onyx data never grew above 1MB and I didn't even have the cache cleanup logic implemented at that point

@kidroca
Copy link
Contributor Author

kidroca commented Jun 1, 2021

This PR is ready for a final review: Expensify/react-native-onyx#76

I'm not sure who to tag as a reviewer as per @marcaaron

@kidroca Just a heads up I'm going out of office for a bit. If you would prefer to move forward without my review feel free to tag in a random reviewer with PullerBear and request for another reviewer in the Slack channel. Thanks!

cc @quinthar @marcaaron @tgolen

@tgolen
Copy link
Contributor

tgolen commented Jun 1, 2021

I've been really heads-down in some highly urgent stuff, plus taking over some of Marc's tasks while he is out so I won't be able to review this PR at all (I've only been lightly following along). I think some good reviewers with be @AndrewGable @Jag96 and @Julesssss

@kidroca
Copy link
Contributor Author

kidroca commented Jun 18, 2021

Can we post any updates regarding payment

  • When should I expect to receive payment for this task?

cc @mallenexpensify (Tagging You as I see your name in the Upwork job)

@mallenexpensify
Copy link
Contributor

Thanks for the ping @kidroca it was my responsibility and I missed it (we previously didn't leave Contributor Managers assigned to issues, we now do).
I paid in Upwork and added the bonus for writing the OP. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Reviewing Has a PR in review
Projects
None yet
Development

Successfully merging a pull request may close this issue.

8 participants