Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

services/horizon: Add (asset, account_id) index to trust_lines table #4635

Merged
merged 6 commits into from
Oct 12, 2022

Conversation

bartekn
Copy link
Contributor

@bartekn bartekn commented Oct 5, 2022

PR Checklist

PR Structure

  • This PR has reasonably narrow scope (if not, break it down into smaller PRs).
  • This PR avoids mixing refactoring changes with feature changes (split into two PRs
    otherwise).
  • This PR's title starts with name of package that is most changed in the PR, ex.
    services/friendbot, or all or doc if the changes are broad or impact many
    packages.

Thoroughness

  • This PR adds tests for the most critical parts of the new functionality or fixes.
  • I've updated any docs (developer docs, .md
    files, etc... affected by this change). Take a look in the docs folder for a given service,
    like this one.

Release planning

  • I've updated the relevant CHANGELOG (here for Horizon) if
    needed with deprecations, added features, breaking changes, and DB schema changes.
  • I've decided if this PR requires a new major/minor version according to
    semver, or if it's mainly a patch change. The PR is targeted at the next
    release branch if it's not a patch change.

What

Adds a new index to trust_lines table on (asset_type, asset_code, asset_issuer, account_id) fields.

Close #4632.

Why

When filtering by asset in /accounts endpoint the underlying query was using "trust_lines_by_type_code_issuer" btree (asset_type, asset_code, asset_issuer) index. This can be slow when querying for a popular asset which (like USDC) because when selected trust lines are found by a DB engine, they later need to be sorted by account id. A new index allow quick sorting of found trustlines.

Query plan without a new index
 Limit  (cost=2760.42..2787.18 rows=10 width=249) (actual time=725.657..734.633 rows=10 loops=1)
   ->  Nested Loop  (cost=2760.42..7229.44 rows=1670 width=249) (actual time=725.656..734.631 rows=10 loops=1)
         ->  Gather Merge  (cost=2759.86..2950.19 rows=1670 width=57) (actual time=725.622..734.541 rows=10 loops=1)
               Workers Planned: 1
               Workers Launched: 1
               ->  Sort  (cost=1759.85..1762.30 rows=982 width=57) (actual time=719.248..719.275 rows=203 loops=2)
                     Sort Key: trust_lines.account_id
                     Sort Method: external merge  Disk: 8104kB
                     Worker 0:  Sort Method: external merge  Disk: 7472kB
                     ->  Parallel Bitmap Heap Scan on trust_lines  (cost=24.86..1711.05 rows=982 width=57) (actual time=49.144..656.403 rows=118316 loops=2)
                           Recheck Cond: ((asset_type = 1) AND ((asset_code)::text = 'USDC'::text) AND ((asset_issuer)::text = 'GA5ZSEJYB37JRC5AVCIA5MOP4RHTM335X2KGX3IHOJAPP5RE34K4KZVN'::text))
                           Rows Removed by Index Recheck: 1905016
                           Heap Blocks: exact=18001 lossy=87732
                           ->  Bitmap Index Scan on trust_lines_by_type_code_issuer  (cost=0.00..24.44 rows=1670 width=0) (actual time=45.137..45.137 rows=236633 loops=1)
                                 Index Cond: ((asset_type = 1) AND ((asset_code)::text = 'USDC'::text) AND ((asset_issuer)::text = 'GA5ZSEJYB37JRC5AVCIA5MOP4RHTM335X2KGX3IHOJAPP5RE34K4KZVN'::text))
         ->  Index Scan using accounts_pkey on accounts  (cost=0.56..2.56 rows=1 width=192) (actual time=0.008..0.008 rows=1 loops=10)
               Index Cond: ((account_id)::text = (trust_lines.account_id)::text)
 Planning Time: 2.946 ms
 Execution Time: 738.166 ms
(19 rows)

Query plan **with** a new index
 Limit  (cost=1.25..27.29 rows=10 width=249) (actual time=0.319..0.376 rows=10 loops=1)
   ->  Nested Loop  (cost=1.25..4351.51 rows=1670 width=249) (actual time=0.318..0.374 rows=10 loops=1)
         ->  Index Only Scan using trust_lines_by_account_type_code_issuer on trust_lines  (cost=0.69..72.26 rows=1670 width=57) (actual time=0.303..0.308 rows=10 loops=1)
               Index Cond: ((asset_type = 1) AND (asset_code = 'USDC'::text) AND (asset_issuer = 'GA5ZSEJYB37JRC5AVCIA5MOP4RHTM335X2KGX3IHOJAPP5RE34K4KZVN'::text))
               Heap Fetches: 0
         ->  Index Scan using accounts_pkey on accounts  (cost=0.56..2.56 rows=1 width=192) (actual time=0.006..0.006 rows=1 loops=10)
               Index Cond: ((account_id)::text = (trust_lines.account_id)::text)
 Planning Time: 2.113 ms
 Execution Time: 0.420 ms
(9 rows)

Known limitations

Additional index takes extra space. A new index is able to replace existing "trust_lines_by_type_code_issuer" btree (asset_type, asset_code, asset_issuer). Maybe the old index should be removed?

@bartekn bartekn changed the title services/horizon: Add (account_id, asset) index to trust_lines table services/horizon: Add (asset, account_id) index to trust_lines table Oct 5, 2022
@bartekn bartekn marked this pull request as ready for review October 5, 2022 11:42
@bartekn bartekn requested a review from a team October 5, 2022 13:35
@tsachiherman
Copy link
Contributor

Do we need to evaluate the performance implication on ingestion ? ( i.e. more indices typically slow down inserts/deletion )

@bartekn
Copy link
Contributor Author

bartekn commented Oct 6, 2022

Do we need to evaluate the performance implication on ingestion ? ( i.e. more indices typically slow down inserts/deletion

Yes, we usually do all performance testing in stg when testing a release.

@bartekn
Copy link
Contributor Author

bartekn commented Oct 6, 2022

Additional index takes extra space. A new index is able to replace existing "trust_lines_by_type_code_issuer" btree (asset_type, asset_code, asset_issuer). Maybe the old index should be removed?

Removed unused index in 77361d4.

Copy link
Contributor

@Shaptic Shaptic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but is there any reason why this would cause worse performance on certain queries? We saw this with claimable balances and it took a couple of iterations.

@@ -0,0 +1,9 @@
-- +migrate Up

CREATE INDEX "trust_lines_by_type_code_issuer_account" ON trust_lines USING btree (asset_type, asset_code, asset_issuer, account_id);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bartekn, any thoughts on the duration this may take to complete on a full db(from pubnet genesis)? this would get applied to staging db which has that volume as part of release testing steps, could observe then?

@bartekn
Copy link
Contributor Author

bartekn commented Oct 12, 2022

LGTM, but is there any reason why this would cause worse performance on certain queries? We saw this with claimable balances and it took a couple of iterations.

I checked the codebase and we we only use this index in /accounts?asset={asset}. So I don't expect any performance degradation however better to check it with mirroring in stg.

@bartekn, any thoughts on the duration this may take to complete on a full db(from pubnet genesis)? this would get applied to staging db which has that volume as part of release testing steps, could observe then?

It is index on state tables so every horizon (full history or not) have the same number of rows. When I was creating it locally it took a couple of minutes so it should be really fast in prd DBs.

@bartekn bartekn merged commit 947c2a6 into stellar:master Oct 12, 2022
@bartekn bartekn deleted the add-trustlines-index branch October 12, 2022 20:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Horizon endpoint /accounts?asset=<USDC> getting 503's
4 participants