-
Notifications
You must be signed in to change notification settings - Fork 213
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Faster listing of stake pools. #2111
Merged
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
jonathanknowles
force-pushed
the
jonathanknowles/more-efficient-list-pools
branch
from
September 4, 2020 07:48
a4ba88c
to
7106d54
Compare
Awesome 🎉 My results of listing stake-pools with n wallets syncing on this branch (Intel i7 - 6 cores, 32GB RAM, SSD):
|
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
jonathanknowles
force-pushed
the
jonathanknowles/more-efficient-list-pools
branch
3 times, most recently
from
September 7, 2020 08:09
fac8cce
to
d00c608
Compare
This comment has been minimized.
This comment has been minimized.
jonathanknowles
force-pushed
the
jonathanknowles/more-efficient-list-pools
branch
2 times, most recently
from
September 7, 2020 12:52
cea2818
to
83de0c4
Compare
jonathanknowles
changed the title
WIP: Use more efficient database queries in
WIP: Faster listing of stake pools.
Sep 7, 2020
ListStakePools
.
jonathanknowles
added
the
IMPROVEMENT
Mark a PR as an improvement, for auto-generated CHANGELOG
label
Sep 7, 2020
hasufell
reviewed
Sep 7, 2020
jonathanknowles
added a commit
that referenced
this pull request
Sep 8, 2020
In response to review feedback: #2111 (comment)
jonathanknowles
force-pushed
the
jonathanknowles/more-efficient-list-pools
branch
from
September 8, 2020 02:59
86ee15a
to
6fbc807
Compare
jonathanknowles
added a commit
that referenced
this pull request
Sep 8, 2020
For rows that fail to parse: we create a log entry for each error. For rows that parse successfully: we return these as part of the result. In response to review feedback: #2111 (comment)
jonathanknowles
added a commit
that referenced
this pull request
Sep 8, 2020
In response to review feedback: #2111 (comment)
jonathanknowles
added a commit
that referenced
this pull request
Sep 8, 2020
For rows that fail to parse: we create a log entry for each error. For rows that parse successfully: we return these as part of the result. In response to review feedback: #2111 (comment)
jonathanknowles
force-pushed
the
jonathanknowles/more-efficient-list-pools
branch
from
September 8, 2020 03:33
832ef1b
to
3155742
Compare
jonathanknowles
added a commit
that referenced
this pull request
Sep 8, 2020
For rows that fail to parse: we create a log entry for each error. For rows that parse successfully: we return these as part of the result. In response to review feedback: #2111 (comment)
jonathanknowles
force-pushed
the
jonathanknowles/more-efficient-list-pools
branch
from
September 8, 2020 03:34
3155742
to
79f505c
Compare
jonathanknowles
added a commit
that referenced
this pull request
Sep 8, 2020
In response to review feedback: #2111 (comment)
jonathanknowles
added a commit
that referenced
this pull request
Sep 8, 2020
For rows that fail to parse: we create a log entry for each error. For rows that parse successfully: we return these as part of the result. In response to review feedback: #2111 (comment)
jonathanknowles
force-pushed
the
jonathanknowles/more-efficient-list-pools
branch
from
September 8, 2020 05:29
79f505c
to
68aaf9e
Compare
This replaces a sequence of n database operations with a single database operation that performs just a single query.
This makes it possible to filter out retired pools at the SQL query level.
This is no longer necessary, as retired pools are now filtered out at the SQL query level.
This type will be used by later commits, to generate sequences of registration and retirement certificates affecting multiple pools.
Use `MultiPoolCertificateSequence` to simplify the `prop_listRetiredPools_multiplePools_multipleCerts` property.
This makes it possible to provide the following function, which produces a single, flattened list of pool certificates affecting multiple pools. ``` getMultiPoolCertificateSequence :: MultiPoolCertificateSequence -> [PoolCertificate] ```
Every valid registration certificate must have at least one owner. Ideally, this should be encoded in the type system, but we make do with not generating invalid data for now.
This more accurately reflects the function's purpose. We don't actually mind if the list of certificates is empty.
Log parse failures in the following pool DB operations: - `readTotalProduction`. - `listPoolLifeCycleData`. - `listRetiredPools`. In response to review feedback: #2111 (comment)
There's no point in fetching the entire set of all pool-related data when all we really need is the set of valid pool IDs.
This makes it possible to change the definitions of views as required.
jonathanknowles
force-pushed
the
jonathanknowles/more-efficient-list-pools
branch
from
September 10, 2020 02:29
b504448
to
7aabf90
Compare
bors r+ |
iohk-bors bot
added a commit
that referenced
this pull request
Sep 10, 2020
2111: Faster listing of stake pools. r=jonathanknowles a=jonathanknowles ## Issue Number #2082 (_Listing stake pools time grows when there are wallets syncing in background_.) ## Overview This PR: * changes the implementation of `ListStakePools` to execute a small number of handwritten SQL queries with native joins, rather than issuing multiple queries and joining together the results in memory. * adds a new operation [`listPoolLifeCycleData`](https://github.com/input-output-hk/cardano-wallet/blob/d00c608e47d2d189a282a02946cc8e46b843375b/lib/core/src/Cardano/Pool/DB.hs#L180). The [SQLite implementation](https://github.com/input-output-hk/cardano-wallet/blob/d00c608e47d2d189a282a02946cc8e46b843375b/lib/core/src/Cardano/Pool/DB/Sqlite.hs#L384) of this operation uses a [single hand-optimized query](https://github.com/input-output-hk/cardano-wallet/blob/d00c608e47d2d189a282a02946cc8e46b843375b/lib/core/src/Cardano/Pool/DB/Sqlite.hs#L595) to return lifecycle data for all pools. ## Benefits ### ⭐ **Faster execution time** The time taken to execute `ListStakePools` is greatly reduced, even when there are multiple wallets syncing in the background. <details><summary>Click to view data</summary><br> | No. of <br>Wallets<br>Syncing | Time Required<br>(seconds)<br>`master` | Time Required<br>(seconds)<br>`more-efficient-list-pools` | | ---: | ---: | ---: | | 0 | 8 | 3 | | 5 | 30 | 7 | | 10 | 60 | 8 | | 15 | 72 | 9 | | 20 | 95 | 10 | </details> ### ⭐ **Fewer database queries** The number of `select` queries required for a single call to `ListStakePools` is greatly reduced. <details><summary>Click to view data</summary><br> | Branch | Network | No. of Pools | No. of Queries | | :--- | :--- | ---: | ---: | | `master` | `mainnet` | 1,124 | 12,365 | | `more-efficient-list-pools`| `mainnet` | 1,124 | 4 | </details> ## Testing ### ✔️ Properties This PR adds a [property test](https://github.com/input-output-hk/cardano-wallet/blob/d00c608e47d2d189a282a02946cc8e46b843375b/lib/core/test/unit/Cardano/Pool/DB/Properties.hs#L987) for `listPoolLifeCycleData`, consisting of the following steps: 1. Add an arbitrary sequence of pool registration and retirement certificates to the database, for multiple pools. 2. Call `listPoolLifeCycleData` once to fetch lifecycle data for all active pools. 3. Call `readPoolLifeCycleStatus` multiple times, once for each known pool, and coalesce the results. 4. Test that the results returned in steps 2 and 3 are identical. ### ✔️ Live Data Run the following command to get a list of all stake pools sorted by ID: ```sh cardano-wallet stake-pool list --stake 1000 | jq 'sort_by(.id)' > stake-pool-list ``` The results should be identical for both `master` and this branch. ## Notes ### System used for Benchmarking The system used for benchmarking (4 cores): ``` vendor_id : GenuineIntel cpu family : 6 model : 85 model name : Intel(R) Xeon(R) CPU stepping : 7 microcode : 0x1 cpu MHz : 2800.176 cache size : 33792 KB ``` Co-authored-by: Jonathan Knowles <jonathan.knowles@iohk.io>
Build FailureSuspected CauseParse failure triggered by wallet DB state machine tests. Log Extract
Related Issues |
bors r+ |
Build succeeded: |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Related Issues
#2082 (Listing stake pools time grows when there are wallets syncing in background.)
https://jira.iohk.io/browse/ADP-383 (Use more efficient DB queries when listing pools.)
Overview
This PR:
changes the implementation of
ListStakePools
to execute a small number of handwritten SQL queries with native joins, rather than issuing multiple queries and joining together the results in memory.adds a new operation
listPoolLifeCycleData
. The SQLite implementation of this operation uses a single hand-optimized query to return lifecycle data for all pools.Benefits
⭐ Faster execution time
The time taken to execute
ListStakePools
is greatly reduced, even when there are multiple wallets syncing in the background.Click to view data
Wallets
Syncing
(seconds)
master
(seconds)
more-efficient-list-pools
⭐ Fewer database queries
The number of
select
queries required for a single call toListStakePools
is greatly reduced.Click to view data
master
mainnet
more-efficient-list-pools
mainnet
Testing
✔️ Properties
This PR adds a property test for
listPoolLifeCycleData
, consisting of the following steps:listPoolLifeCycleData
once to fetch lifecycle data for all active pools.readPoolLifeCycleStatus
multiple times, once for each known pool, and coalesce the results.✔️ Live Data
Run the following command to get a list of all stake pools sorted by ID:
The results should be identical for both
master
and this branch.Notes
System used for Benchmarking
The system used for benchmarking (4 cores):