Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Listing stake pools time grows when there are wallets syncing in background #2082

Closed
piotr-iohk opened this issue Aug 27, 2020 · 2 comments
Closed
Assignees
Labels
Bug SEVERITY:LOW Small defects which does not prevent any crucial functionality to work.

Comments

@piotr-iohk
Copy link
Contributor

piotr-iohk commented Aug 27, 2020

Context

Not sure how is the relation but the time of listing pools seems to grow when there are more wallets being synced.
I don't see any obvious outstanding query time that would explain 50s... the longest query is GetNonMyopicMemberRewards which is a bit longer the more wallets are syncing. (Which applies to other querries too actually).

0 Wallets syncing:
List sp = 3s
Query GetNonMyopicMemberRewards took 2.028777786s

1 wallet syncing:
List sp = 13s
Query GetNonMyopicMemberRewards took 2.39806037s

2 wallets syncing
List sp = 26s
Query GetNonMyopicMemberRewards took 2.452248284s

3 wallets syncing
List sp = 31s
Query GetNonMyopicMemberRewards took 2.402175446s

5 wallets syncing
List sp = 40s
Query GetNonMyopicMemberRewards took 2.821580222s

10 wallets syncing
List sp = 47s
Query GetNonMyopicMemberRewards took 3.001514181s

15 wallets syncing
List sp = 1m 03s
Query GetNonMyopicMemberRewards took 3.012593506s

20 wallets syncing
List sp = 1m 11s
Query GetNonMyopicMemberRewards took 3.519106839s

Times of other querries is also growing a bit too and also the number of queries is growing when there are more wallets:

With 0 wallets (lp exec time = 4s):

Query GetNonMyopicMemberRewards took 2.386650034s
Query GetCurrentPParams took 0.000716248s
Query GetEraStart took 0.00137057s
Query getAccountBalance took 0.001327425s
Query GetCurrentPParams took 0.000513555s
Query GetUpdateInterfaceState took 0.000478

With 20 wallets (lp exec time = 1:10m):

Query GetEraStart took 0.637137392s
Query getAccountBalance took 0.648792572s
Query GetCurrentPParams took 0.626415965s
Query GetUpdateInterfaceState took 0.753813193s
Query GetStakeDistribution took 0.570119185s
Query GetEraStart took 0.771473205s
Query GetCurrentPParams took 0.650308479s
Query GetUpdateInterfaceState took 0.784389s
Query GetNonMyopicMemberRewards took 3.200318526s
Query getAccountBalance took 3.245384946s
Query GetCurrentPParams took 1.100185536s
Query GetEraStart took 0.842396347s
Query getAccountBalance took 0.854336837s
Query GetCurrentPParams took 0.759815955s
Query GetUpdateInterfaceState took 0.683372427s
Query GetEraStart took 0.841750692s
Query getAccountBalance took 0.886189268s
Query GetCurrentPParams took 0.806262374s
Query GetUpdateInterfaceState took 0.796393044s
Query GetEraStart took 0.898862386s
Query getAccountBalance took 0.962203422s
Query GetCurrentPParams took 0.736154508s
Query GetUpdateInterfaceState took 0.811650964s
Query GetEraStart took 0.695541092s
Query getAccountBalance took 0.730742968s
Query GetCurrentPParams took 0.901877006s
Query GetUpdateInterfaceState took 0.893728

NOTE 1:
Please note that when 20 wallets are synced listing stake pools takes ~5s, so the performance problem is visible only while some wallets are syncing.

NOTE 2:
Also, when we have n wallets syncing on cardano-wallet 1 and experiencing perf problems with listing stake-pools on that wallet, at the same time if we have cardano-wallet 2 on the same machine, we are able to list stake pools there just fine within 3-4 seconds (considering no wallets are syncing on cardano-wallet 2).

Information -
Version 2020.8.3 (git revision: 9f93d0a) / cardano-node 1.19.0
Platform Linux
Installation From source

Steps to Reproduce

  1. Have n wallets syncing in parallel
  2. List stake pools while wallets are syncing

Expected behavior

Ideally listing stake pools time is consistent no matter how many wallets are syncing.

Actual behavior

Listing stake pool takes longer while the more wallets are syncing.

Resolution

#2111

The above PR:

  • changes the implementation of ListStakePools to execute a small number of handwritten SQL queries with native joins, rather than issuing multiple queries and joining together the results in memory.

  • adds a new operation listPoolLifeCycleData. The SQLite implementation of this operation uses a single hand-optimized query to return lifecycle data for all pools.

QA

✔️ Properties

The above PR adds a property test for listPoolLifeCycleData, consisting of the following steps:

  1. Add an arbitrary sequence of pool registration and retirement certificates to the database, for multiple pools.
  2. Call listPoolLifeCycleData once to fetch lifecycle data for all active pools.
  3. Call readPoolLifeCycleStatus multiple times, once for each known pool, and coalesce the results.
  4. Test that the results returned in steps 2 and 3 are identical.

✔️ Live Data

Run the following command to get a list of all stake pools sorted by ID:

cardano-wallet stake-pool list --stake 1000 | jq 'sort_by(.id)' > stake-pool-list

The results should be identical for both master and the branch from PR #2111.

@piotr-iohk piotr-iohk added Bug SEVERITY:LOW Small defects which does not prevent any crucial functionality to work. labels Aug 27, 2020
@jonathanknowles
Copy link
Member

jonathanknowles commented Sep 1, 2020

I wonder if this is simply due to the fact that wallet syncing occupies CPU and disk resources, and having several wallets syncing in parallel might cause so much resource contention that querying the pool database slows down?

One possible factor is that we currently perform at least (2n + 1) database queries to construct the list of pools, where n is the set of known pool identifiers. (See here.)

Normally, this wouldn't be an issue, as individual queries are executed very quickly. But if the query latency increases, executing (2n + 1) queries becomes a problem.

We have a technical debt task here to examine the use of more efficient queries: https://jira.iohk.io/browse/ADP-383

@jonathanknowles jonathanknowles self-assigned this Sep 1, 2020
iohk-bors bot added a commit that referenced this issue Sep 10, 2020
2111: Faster listing of stake pools. r=jonathanknowles a=jonathanknowles

## Issue Number

#2082 (_Listing stake pools time grows when there are wallets syncing in background_.)

## Overview

This PR:

* changes the implementation of `ListStakePools` to execute a small number of handwritten SQL queries with native joins, rather than issuing multiple queries and joining together the results in memory.

* adds a new operation [`listPoolLifeCycleData`](https://github.com/input-output-hk/cardano-wallet/blob/d00c608e47d2d189a282a02946cc8e46b843375b/lib/core/src/Cardano/Pool/DB.hs#L180). The [SQLite implementation](https://github.com/input-output-hk/cardano-wallet/blob/d00c608e47d2d189a282a02946cc8e46b843375b/lib/core/src/Cardano/Pool/DB/Sqlite.hs#L384) of this operation uses a [single hand-optimized query](https://github.com/input-output-hk/cardano-wallet/blob/d00c608e47d2d189a282a02946cc8e46b843375b/lib/core/src/Cardano/Pool/DB/Sqlite.hs#L595) to return lifecycle data for all pools.

## Benefits

### ⭐ **Faster execution time**

  The time taken to execute `ListStakePools` is greatly reduced, even when there are multiple wallets syncing in the background.
  <details><summary>Click to view data</summary><br>

  | No. of <br>Wallets<br>Syncing | Time Required<br>(seconds)<br>`master` | Time Required<br>(seconds)<br>`more-efficient-list-pools` |
  | ---: | ---: | ---: |
  | 0 | 8 | 3 |
  | 5 | 30 | 7 |
  | 10 | 60 | 8 |
  | 15 | 72 | 9 |
  | 20 | 95 | 10 |
  </details>

### ⭐ **Fewer database queries**

  The number of `select` queries required for a single call to `ListStakePools` is greatly reduced.
  <details><summary>Click to view data</summary><br>
  
  | Branch | Network | No. of Pools | No. of Queries |
  | :--- | :--- | ---: | ---: | 
  | `master` | `mainnet` |  1,124 |  12,365 |
  | `more-efficient-list-pools`| `mainnet` | 1,124  | 4 |
  </details>

## Testing

### ✔️ Properties

This PR adds a [property test](https://github.com/input-output-hk/cardano-wallet/blob/d00c608e47d2d189a282a02946cc8e46b843375b/lib/core/test/unit/Cardano/Pool/DB/Properties.hs#L987) for `listPoolLifeCycleData`, consisting of the following steps:
1. Add an arbitrary sequence of pool registration and retirement certificates to the database, for multiple pools.
2. Call `listPoolLifeCycleData` once to fetch lifecycle data for all active pools.
3. Call `readPoolLifeCycleStatus` multiple times, once for each known pool, and coalesce the results. 
4. Test that the results returned in steps 2 and 3 are identical.

### ✔️ Live Data

Run the following command to get a list of all stake pools sorted by ID:
```sh
cardano-wallet stake-pool list --stake 1000 | jq 'sort_by(.id)' > stake-pool-list
```
The results should be identical for both `master` and this branch.

## Notes

### System used for Benchmarking

The system used for benchmarking (4 cores):

```
vendor_id       : GenuineIntel
cpu family      : 6
model           : 85
model name      : Intel(R) Xeon(R) CPU
stepping        : 7
microcode       : 0x1
cpu MHz         : 2800.176
cache size      : 33792 KB
```

Co-authored-by: Jonathan Knowles <jonathan.knowles@iohk.io>
iohk-bors bot added a commit that referenced this issue Sep 10, 2020
2111: Faster listing of stake pools. r=jonathanknowles a=jonathanknowles

## Issue Number

#2082 (_Listing stake pools time grows when there are wallets syncing in background_.)

## Overview

This PR:

* changes the implementation of `ListStakePools` to execute a small number of handwritten SQL queries with native joins, rather than issuing multiple queries and joining together the results in memory.

* adds a new operation [`listPoolLifeCycleData`](https://github.com/input-output-hk/cardano-wallet/blob/d00c608e47d2d189a282a02946cc8e46b843375b/lib/core/src/Cardano/Pool/DB.hs#L180). The [SQLite implementation](https://github.com/input-output-hk/cardano-wallet/blob/d00c608e47d2d189a282a02946cc8e46b843375b/lib/core/src/Cardano/Pool/DB/Sqlite.hs#L384) of this operation uses a [single hand-optimized query](https://github.com/input-output-hk/cardano-wallet/blob/d00c608e47d2d189a282a02946cc8e46b843375b/lib/core/src/Cardano/Pool/DB/Sqlite.hs#L595) to return lifecycle data for all pools.

## Benefits

### ⭐ **Faster execution time**

  The time taken to execute `ListStakePools` is greatly reduced, even when there are multiple wallets syncing in the background.
  <details><summary>Click to view data</summary><br>

  | No. of <br>Wallets<br>Syncing | Time Required<br>(seconds)<br>`master` | Time Required<br>(seconds)<br>`more-efficient-list-pools` |
  | ---: | ---: | ---: |
  | 0 | 8 | 3 |
  | 5 | 30 | 7 |
  | 10 | 60 | 8 |
  | 15 | 72 | 9 |
  | 20 | 95 | 10 |
  </details>

### ⭐ **Fewer database queries**

  The number of `select` queries required for a single call to `ListStakePools` is greatly reduced.
  <details><summary>Click to view data</summary><br>
  
  | Branch | Network | No. of Pools | No. of Queries |
  | :--- | :--- | ---: | ---: | 
  | `master` | `mainnet` |  1,124 |  12,365 |
  | `more-efficient-list-pools`| `mainnet` | 1,124  | 4 |
  </details>

## Testing

### ✔️ Properties

This PR adds a [property test](https://github.com/input-output-hk/cardano-wallet/blob/d00c608e47d2d189a282a02946cc8e46b843375b/lib/core/test/unit/Cardano/Pool/DB/Properties.hs#L987) for `listPoolLifeCycleData`, consisting of the following steps:
1. Add an arbitrary sequence of pool registration and retirement certificates to the database, for multiple pools.
2. Call `listPoolLifeCycleData` once to fetch lifecycle data for all active pools.
3. Call `readPoolLifeCycleStatus` multiple times, once for each known pool, and coalesce the results. 
4. Test that the results returned in steps 2 and 3 are identical.

### ✔️ Live Data

Run the following command to get a list of all stake pools sorted by ID:
```sh
cardano-wallet stake-pool list --stake 1000 | jq 'sort_by(.id)' > stake-pool-list
```
The results should be identical for both `master` and this branch.

## Notes

### System used for Benchmarking

The system used for benchmarking (4 cores):

```
vendor_id       : GenuineIntel
cpu family      : 6
model           : 85
model name      : Intel(R) Xeon(R) CPU
stepping        : 7
microcode       : 0x1
cpu MHz         : 2800.176
cache size      : 33792 KB
```

Co-authored-by: Jonathan Knowles <jonathan.knowles@iohk.io>
@piotr-iohk
Copy link
Contributor Author

LGTM!

Results from manual verification on mainnet -> #2111 (comment).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug SEVERITY:LOW Small defects which does not prevent any crucial functionality to work.
Projects
None yet
Development

No branches or pull requests

2 participants