-
Notifications
You must be signed in to change notification settings - Fork 213
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Listing stake pools time grows when there are wallets syncing in background #2082
Comments
I wonder if this is simply due to the fact that wallet syncing occupies CPU and disk resources, and having several wallets syncing in parallel might cause so much resource contention that querying the pool database slows down? One possible factor is that we currently perform at least (2n + 1) database queries to construct the list of pools, where n is the set of known pool identifiers. (See here.) Normally, this wouldn't be an issue, as individual queries are executed very quickly. But if the query latency increases, executing (2n + 1) queries becomes a problem. We have a technical debt task here to examine the use of more efficient queries: https://jira.iohk.io/browse/ADP-383 |
2111: Faster listing of stake pools. r=jonathanknowles a=jonathanknowles ## Issue Number #2082 (_Listing stake pools time grows when there are wallets syncing in background_.) ## Overview This PR: * changes the implementation of `ListStakePools` to execute a small number of handwritten SQL queries with native joins, rather than issuing multiple queries and joining together the results in memory. * adds a new operation [`listPoolLifeCycleData`](https://github.com/input-output-hk/cardano-wallet/blob/d00c608e47d2d189a282a02946cc8e46b843375b/lib/core/src/Cardano/Pool/DB.hs#L180). The [SQLite implementation](https://github.com/input-output-hk/cardano-wallet/blob/d00c608e47d2d189a282a02946cc8e46b843375b/lib/core/src/Cardano/Pool/DB/Sqlite.hs#L384) of this operation uses a [single hand-optimized query](https://github.com/input-output-hk/cardano-wallet/blob/d00c608e47d2d189a282a02946cc8e46b843375b/lib/core/src/Cardano/Pool/DB/Sqlite.hs#L595) to return lifecycle data for all pools. ## Benefits ### ⭐ **Faster execution time** The time taken to execute `ListStakePools` is greatly reduced, even when there are multiple wallets syncing in the background. <details><summary>Click to view data</summary><br> | No. of <br>Wallets<br>Syncing | Time Required<br>(seconds)<br>`master` | Time Required<br>(seconds)<br>`more-efficient-list-pools` | | ---: | ---: | ---: | | 0 | 8 | 3 | | 5 | 30 | 7 | | 10 | 60 | 8 | | 15 | 72 | 9 | | 20 | 95 | 10 | </details> ### ⭐ **Fewer database queries** The number of `select` queries required for a single call to `ListStakePools` is greatly reduced. <details><summary>Click to view data</summary><br> | Branch | Network | No. of Pools | No. of Queries | | :--- | :--- | ---: | ---: | | `master` | `mainnet` | 1,124 | 12,365 | | `more-efficient-list-pools`| `mainnet` | 1,124 | 4 | </details> ## Testing ### ✔️ Properties This PR adds a [property test](https://github.com/input-output-hk/cardano-wallet/blob/d00c608e47d2d189a282a02946cc8e46b843375b/lib/core/test/unit/Cardano/Pool/DB/Properties.hs#L987) for `listPoolLifeCycleData`, consisting of the following steps: 1. Add an arbitrary sequence of pool registration and retirement certificates to the database, for multiple pools. 2. Call `listPoolLifeCycleData` once to fetch lifecycle data for all active pools. 3. Call `readPoolLifeCycleStatus` multiple times, once for each known pool, and coalesce the results. 4. Test that the results returned in steps 2 and 3 are identical. ### ✔️ Live Data Run the following command to get a list of all stake pools sorted by ID: ```sh cardano-wallet stake-pool list --stake 1000 | jq 'sort_by(.id)' > stake-pool-list ``` The results should be identical for both `master` and this branch. ## Notes ### System used for Benchmarking The system used for benchmarking (4 cores): ``` vendor_id : GenuineIntel cpu family : 6 model : 85 model name : Intel(R) Xeon(R) CPU stepping : 7 microcode : 0x1 cpu MHz : 2800.176 cache size : 33792 KB ``` Co-authored-by: Jonathan Knowles <jonathan.knowles@iohk.io>
2111: Faster listing of stake pools. r=jonathanknowles a=jonathanknowles ## Issue Number #2082 (_Listing stake pools time grows when there are wallets syncing in background_.) ## Overview This PR: * changes the implementation of `ListStakePools` to execute a small number of handwritten SQL queries with native joins, rather than issuing multiple queries and joining together the results in memory. * adds a new operation [`listPoolLifeCycleData`](https://github.com/input-output-hk/cardano-wallet/blob/d00c608e47d2d189a282a02946cc8e46b843375b/lib/core/src/Cardano/Pool/DB.hs#L180). The [SQLite implementation](https://github.com/input-output-hk/cardano-wallet/blob/d00c608e47d2d189a282a02946cc8e46b843375b/lib/core/src/Cardano/Pool/DB/Sqlite.hs#L384) of this operation uses a [single hand-optimized query](https://github.com/input-output-hk/cardano-wallet/blob/d00c608e47d2d189a282a02946cc8e46b843375b/lib/core/src/Cardano/Pool/DB/Sqlite.hs#L595) to return lifecycle data for all pools. ## Benefits ### ⭐ **Faster execution time** The time taken to execute `ListStakePools` is greatly reduced, even when there are multiple wallets syncing in the background. <details><summary>Click to view data</summary><br> | No. of <br>Wallets<br>Syncing | Time Required<br>(seconds)<br>`master` | Time Required<br>(seconds)<br>`more-efficient-list-pools` | | ---: | ---: | ---: | | 0 | 8 | 3 | | 5 | 30 | 7 | | 10 | 60 | 8 | | 15 | 72 | 9 | | 20 | 95 | 10 | </details> ### ⭐ **Fewer database queries** The number of `select` queries required for a single call to `ListStakePools` is greatly reduced. <details><summary>Click to view data</summary><br> | Branch | Network | No. of Pools | No. of Queries | | :--- | :--- | ---: | ---: | | `master` | `mainnet` | 1,124 | 12,365 | | `more-efficient-list-pools`| `mainnet` | 1,124 | 4 | </details> ## Testing ### ✔️ Properties This PR adds a [property test](https://github.com/input-output-hk/cardano-wallet/blob/d00c608e47d2d189a282a02946cc8e46b843375b/lib/core/test/unit/Cardano/Pool/DB/Properties.hs#L987) for `listPoolLifeCycleData`, consisting of the following steps: 1. Add an arbitrary sequence of pool registration and retirement certificates to the database, for multiple pools. 2. Call `listPoolLifeCycleData` once to fetch lifecycle data for all active pools. 3. Call `readPoolLifeCycleStatus` multiple times, once for each known pool, and coalesce the results. 4. Test that the results returned in steps 2 and 3 are identical. ### ✔️ Live Data Run the following command to get a list of all stake pools sorted by ID: ```sh cardano-wallet stake-pool list --stake 1000 | jq 'sort_by(.id)' > stake-pool-list ``` The results should be identical for both `master` and this branch. ## Notes ### System used for Benchmarking The system used for benchmarking (4 cores): ``` vendor_id : GenuineIntel cpu family : 6 model : 85 model name : Intel(R) Xeon(R) CPU stepping : 7 microcode : 0x1 cpu MHz : 2800.176 cache size : 33792 KB ``` Co-authored-by: Jonathan Knowles <jonathan.knowles@iohk.io>
LGTM! Results from manual verification on mainnet -> #2111 (comment). |
Context
Not sure how is the relation but the time of listing pools seems to grow when there are more wallets being synced.
I don't see any obvious outstanding query time that would explain 50s... the longest query is
GetNonMyopicMemberRewards
which is a bit longer the more wallets are syncing. (Which applies to other querries too actually).0 Wallets syncing:
List sp = 3s
Query GetNonMyopicMemberRewards took 2.028777786s
1 wallet syncing:
List sp = 13s
Query GetNonMyopicMemberRewards took 2.39806037s
2 wallets syncing
List sp = 26s
Query GetNonMyopicMemberRewards took 2.452248284s
3 wallets syncing
List sp = 31s
Query GetNonMyopicMemberRewards took 2.402175446s
5 wallets syncing
List sp = 40s
Query GetNonMyopicMemberRewards took 2.821580222s
10 wallets syncing
List sp = 47s
Query GetNonMyopicMemberRewards took 3.001514181s
15 wallets syncing
List sp = 1m 03s
Query GetNonMyopicMemberRewards took 3.012593506s
20 wallets syncing
List sp = 1m 11s
Query GetNonMyopicMemberRewards took 3.519106839s
Times of other querries is also growing a bit too and also the number of queries is growing when there are more wallets:
With 0 wallets (lp exec time = 4s):
With 20 wallets (lp exec time = 1:10m):
NOTE 1:
Please note that when 20 wallets are synced listing stake pools takes ~5s, so the performance problem is visible only while some wallets are syncing.
NOTE 2:
Also, when we have n wallets syncing on cardano-wallet 1 and experiencing perf problems with listing stake-pools on that wallet, at the same time if we have cardano-wallet 2 on the same machine, we are able to list stake pools there just fine within 3-4 seconds (considering no wallets are syncing on cardano-wallet 2).
Steps to Reproduce
Expected behavior
Ideally listing stake pools time is consistent no matter how many wallets are syncing.
Actual behavior
Listing stake pool takes longer while the more wallets are syncing.
Resolution
#2111
The above PR:
changes the implementation of
ListStakePools
to execute a small number of handwritten SQL queries with native joins, rather than issuing multiple queries and joining together the results in memory.adds a new operation
listPoolLifeCycleData
. The SQLite implementation of this operation uses a single hand-optimized query to return lifecycle data for all pools.QA
✔️ Properties
The above PR adds a property test for
listPoolLifeCycleData
, consisting of the following steps:listPoolLifeCycleData
once to fetch lifecycle data for all active pools.readPoolLifeCycleStatus
multiple times, once for each known pool, and coalesce the results.✔️ Live Data
Run the following command to get a list of all stake pools sorted by ID:
The results should be identical for both
master
and the branch from PR #2111.The text was updated successfully, but these errors were encountered: