Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Evaluate performance of PouchDB indexeddb adapter #9208

Open
garethbowen opened this issue Jun 25, 2024 · 12 comments
Open

Evaluate performance of PouchDB indexeddb adapter #9208

garethbowen opened this issue Jun 25, 2024 · 12 comments
Labels
Type: Performance Make something faster

Comments

@garethbowen
Copy link
Contributor

Describe the performance issue

PouchDB 9.0 has just been released with significant improvements to the indexeddb adapter which should make it perform better than the idb adapter we currently use.

Describe the improvement you'd like

Upgrade to Pouch 9.0 and run a suite of client side performance tests with representative data on the old adapter and the new adapter and publish the results. There is currently no migration from one to the other so you will have to completely reset the browser data when you switch adapter.

Measurements

  • Most important will be client side apdex or similar to measure user satisfaction with the new adapter
  • But we should also measure cpu, memory, and disk usage to make sure we don't incur unintended costs

Additional context

Once we've determined if and how much this improves performance we can decide which of several options to take next (in a separate issue):

  1. Do nothing
  2. Use the indexeddb adapter for new clients, but keep the idb adapter around for existing clients. I like this option because it's not a breaking change, and allows for progressive rollout. This could also be used as a workaround for heavy users to do a sync, wipe, initial replication to get the improved performance.
  3. Write code to migrate between adapters, and drop support for the idb adapter. This is difficult because we have to be careful about not exhausting disk space during the migration, and may take a while to execute which would block CHWs from doing work until it's finished. For this reason it's probably a breaking change and would be delayed until the next major release.
@garethbowen garethbowen added the Type: Performance Make something faster label Jun 25, 2024
@latin-panda
Copy link
Contributor

@ralfudx, we will likely run the automation Apdex tests to compare it with our previous numbers. cc: @michaelkohn

@ralfudx
Copy link
Contributor

ralfudx commented Jun 25, 2024

Yeah this is noted

@latin-panda latin-panda self-assigned this Jul 5, 2024
@latin-panda
Copy link
Contributor

I am assigning this to me as I want to run the Apdex Automation now that we get consistent results, and compare before and after. First trying with ugly code

@latin-panda
Copy link
Contributor

Apdex score

This test suite focuses on queries during navigation and searching by free text.

The biggest performance improvements are:

  • tasks:refresh with 70% improvement
  • report_list:query with 11.6%
  • report_list:load with 25.5%

No slowdowns were detected.

scores
Telemetry files

baseline-default-config-jon-july-11.json

pouch9-default-config-kimberly-july-12.json

Telemetry before vs after

Telemetry before vs after

PouchDB 9 integration

The pouch-adapter-indexeddb was integrated into the pouchdb-9 branch. However, this is not a final implementation; we still need to resolve the migrations before sync. and determining how to transition existing users from the old adapter to the new one. The focus of this work was to make it work for performance evaluation.

During the integration, I encounter these errors:

  • onblocked, this should never happen
    • That's telemetry recording too much too fast and also closing the DB when the pouch-adapter-indexeddb already has closed it.
  • Database has a global failure DOMException: Unable to add key to index 'seq': at least one key does not satisfy the uniqueness requirements.
  • invalid ref format

After I fixed the telemetry, the 2 first stopped appearing.

Other helpful Apdex suites

These are other test suites that can be done for this work, but we need to configure the suite for the default config:

  • Run initial replication multiple times
  • Form submits for contacts, reports, and tasks.
  • Perform action by reopening the app in every iteration

@garethbowen
Copy link
Contributor Author

70% is almost too good to believe. Are you confident in this number? Is it possible some of the other changes you made to this branch are causing it? Is there any other explanation you can think of that might explain this?

determining how to transition existing users from the old adapter to the new one

The easiest way I can think of is to have new users use the new adapter and existing users continue to use the old adapter. I like this because...

  1. It means if there are any undetected issues we can pick them up as it gradually rolls out, rather than disrupting thousands of users in one go. This was useful in transitioning from crosswalk to webview in android.
  2. It isn't disruptive to existing users at all. The migration for a user with loads of docs may take some time when they are just trying to do their job. Furthermore the naive implementation would involve duplicating the data, which risks running of out disk space.
  3. Users are already being told to sync and wipe and reload their app if they find it slow, so migration of existing users will slowly happen.
  4. We can provide either manual migration via a button, or automatic migration at a later date.

There is a complexity cost to this but the way PouchDB is architected it's not a big cost.

@latin-panda
Copy link
Contributor

70% is almost too good to believe. Are you confident in this number? Is it possible some of the other changes you made to this branch are causing it? Is there any other explanation you can think of that might explain this?

I've added this to this week's tasks; I'll dig more into tasks and rules engine

@latin-panda
Copy link
Contributor

70% is almost too good to believe. Are you confident in this number? Is it possible some of the other changes you made to this branch are causing it? Is there any other explanation you can think of that might explain this?

The e2e for tasks is working. I debugged tasks, and they are creating and resolving as expected. I can see the local docs about tasks being updated (screenshots below) and changing the status accordingly. There are no errors in the browser's console. So far, I haven't detected weird behaviors.

Screenshot of task docs in the device

task-docs-2
tasks-doc

I opened this PR to visualize the changes better: #9296. This PR is not for merging, it's intended to be a playground for testing. Changes:

  • Fixed the telemetry service to solve the errors described here (Error: onblocked, this should never happen)
  • Commented migrations before sync. because it was failing with invalid ref format. Unsure yet about the actual side effects of this, I need to see it again.

And that's it, the rest is the same.

@latin-panda
Copy link
Contributor

latin-panda commented Aug 7, 2024

But we should also measure cpu, memory, and disk usage to make sure we don't incur unintended costs

I ran tests (3 times) using an offline CHW and an offline Supervisor in 2 instances, one for the baseline and the other for the PouchDB Adapter v9. Both instances have the same configuration and data (I used the test-data-generator for this).
These tests don't include saving forms yet. They are about the same scenarios as covered in the previous Apdex Tests, so we still need to do the tests with forms, however, I didn't experience errors or slow down when saving forms. I replicated the scenarios as similar as possible, in order of execution and timing, in both instances.

  • The storage, CPU and memory are not getting worse when using PouchDB Adapter v9
  • The storage is reduced between 4MB to 6MB.
    • Since we're commenting the migrations before sync., I wonder if that has an effect in the storage.
    • The telemetry, reports, tasks, targets, contact, read reports, and feedback docs are recorded fine and synced, as well as the basic meta is there; the app is functioning fine.
  • There is no significant improvement in CPU and Memory for this low-end device.

Baseline

Detail Value
CHT Android 1.4
Branch master (4.10 as July 9)
Config Default
Phone Safaricom Neon Ultra - 2GB RAM - Android 13

Storage

Detail CHW Supervisor
Docs 1073 1077
App (APK v1.4) 1.27MB 1.27MB
Cache 3.76MB 2.83MB
User Data 32 to 38MB 33.61 to 38.2 MB

The "User Data" varies depending on whether meta and docs were cleaned when opening the app.

Memory (MB)

Scenario  chw chw chw supervisor supervisor supervisor
  avg min max avg min max
Initial replication and first load of contact tab 75 68 80 64 58 71
Load report list, search, load 1 report details 71 69 76 73 67 77
Load contact list, search, load 1 contact, load chw area, back to contact list 74 69 79 75 69 78
Open app and load message list, load targets, load tasks, load task form, load message list, load a conversation 67 55 76 71 51 78

CPU (%)

Scenario  chw chw chw supervisor supervisor supervisor
  avg min max avg min max
Initial replication and first load of contact tab 36% 17% 57% 30% 15% 53%
Load report list, search, load 1 report details 33% 16% 52% 34% 16% 56%
Load contact list, search, load 1 contact, load chw area, back to contact list 36% 14% 57% 42% 20% 62%
Open app and load message list, load targets, load tasks, load task form, load message list, load a conversation 20% 11% 42% 22% 7% 48%
Expand to see timeline graphs for CHW baseline - chw -initial-rep-firs-contact-list baseline  - chw  -reports baseline  - chw  -reports-2 baseline - chw - contacts baseline - chw  -contacts-2 baseline  - chw - nav to other pages baseline - chw - nav to other pages - 2
Expand to see timeline graphs for Supervisor baseline - supervisor- initial rep baseline  - supervisor  -reports baseline - supervisor - contacts baseline  - supervisor - nav to other pages

PouchDB Adapter v9

Detail Value
CHT Android 1.4
Branch master (4.10 as July 9) with PouchDB v9 and pouchdb-adapter-indexeddb
Config Default
Phone Safaricom Neon Ultra - 2GB RAM - Android 13

Storage

Detail CHW Supervisor
Docs 1081 1079
App (APK v1.4) 1.27MB 1.27MB
Cache 2.59MB 2.58MB
User Data 28 to 32.26MB 27.7 to 32.32MB

The "User Data" varies depending on whether meta and docs were cleaned when opening the app.

Memory (MB)

Scenario  chw chw chw supervisor supervisor supervisor
  avg min max avg min max
Initial replication and first load of contact tab 73 64 83 72 64 83
Load report list, search, load 1 report details 71 63 75 70 61 76
Load contact list, search, load 1 contact, load chw area, back to contact list 76 65 81 77 64 82
Open app and load message list, load targets, load tasks, load task form, load message list, load a conversation 66 39 75 64 36 75

CPU (%)

Scenario  chw chw chw supervisor supervisor supervisor
  avg min max avg min max
Initial replication and first load of contact tab 26% 16% 53% 31% 14% 51%
Load report list, search, load 1 report details 33% 15% 52% 32% 16% 56%
Load contact list, search, load 1 contact, load chw area, back to contact list 40% 18% 54% 41% 15% 54%
Open app and load message list, load targets, load tasks, load task form, load message list, load a conversation 22% 11% 44% 21% 11% 40%
Expand to see timeline graphs for CHW pouch-chw-initial-rep pouch  - chw  -reports pouch  - chw  -reports-2 pouch - chw - contacts pouch - chw - contacts -2 pouch  - chw - nav to other pages pouch  - chw - nav to other pages - 2
Expand to see timeline graphs for Supervisor pouchdb - supervisor- initial rep pouchdb - supervisor- initial rep-2 pouch  - supervisor  -reports pouch - supervisor - contacts pouch  - supervisor - nav to other pages

@garethbowen
Copy link
Contributor Author

@latin-panda If I may be so bold as to summarise your findings...

  1. Apdex metrics have improved overall, some by a lot (up to 70%), some by none, but not getting worse in any metric.
  2. Disk usage is improved significantly - user data by ~10% and cache ~30%
  3. CPU and Memory results are within the margin of error.

Is that an accurate summary?

@latin-panda
Copy link
Contributor

  1. Apdex metrics have improved overall, some by a lot (up to 70%), some by none, but not getting worse in any metric.
  2. Disk usage is improved significantly - user data by ~10% and cache ~30%
  3. CPU and Memory results are within the margin of error.

@garethbowen, that's correct based on my test results, which focus on queries to the DB.
The performance and resources (storage, CPU, memory) are the same or better for low end devices (2GB RAM) and not getting worse.

@jkuester
Copy link
Contributor

@dianabarsan you mentioned in a different issue:

I believe #9208 is a long way away

Could we capture here what you think is missing before we can ship this issue?

@dianabarsan
Copy link
Member

@jkuester
This issue is "evaluating performance", and not actually implementing.
It's a debate now whether we will implement a migration between the old adapter and the new, my opinion is to not implement a migration, due to the obvious complexities and high risk.
So even if we will use the new adapter, it will be only used for new "logins", and the old adapter will be used for existent "logins".

So to benefit of the new adapter, users would need to clear cache and reinstall.

This interferes with your idea of matching the two updates, because:

  1. existent users will still need to re-index - and this was your main argument for waiting
  2. we have to wait until we make this decision, which has been postponed already for along time (half a year!)
  3. we need TCO reduction very soon!

@latin-panda latin-panda removed their assignment Nov 29, 2024
@latin-panda latin-panda removed this from Care Teams Nov 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Type: Performance Make something faster
Projects
None yet
Development

No branches or pull requests

5 participants