
Use blocks table, not header_cids, for getBlockNumber() #232

Merged
merged 1 commit into v4 from telackey/blocks on Mar 10, 2023

Conversation

@telackey (Contributor) commented Mar 9, 2023

On our machines, the single most expensive query is:

SELECT block_number FROM eth.header_cids ORDER BY block_number DESC LIMIT 1

This query touches gazillions of rows, almost certainly as a consequence of only having a BRIN index on header_cids.block_number.
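
The distinction matters because a BRIN index stores only per-block-range min/max summaries and cannot return rows in sorted order, whereas a B-tree can satisfy an `ORDER BY ... LIMIT 1` with a single backward scan. A hedged sketch of the two index forms (the index names here are illustrative, not the actual schema):

```sql
-- BRIN: tiny, but stores only per-block-range summaries, so an
-- ORDER BY block_number DESC LIMIT 1 still has to fetch and sort rows.
CREATE INDEX header_cids_block_number_brin
    ON eth.header_cids USING brin (block_number);

-- B-tree: keeps keys ordered, so the max is one backward index seek.
CREATE INDEX header_cids_block_number_btree
    ON eth.header_cids USING btree (block_number);
```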

Even using gibbon (which has a comparatively small DB vs rhino or vulture) the comparison is significant:

OLD:

"Limit  (cost=28471.11..28471.24 rows=1 width=8) (actual time=182.739..249.251 rows=1 loops=1)"
"  ->  Gather Merge  (cost=28471.11..102315.06 rows=610660 width=8) (actual time=182.738..249.249 rows=1 loops=1)"
"        Workers Planned: 5"
"        Workers Launched: 5"
"        ->  Sort  (cost=27471.04..27776.37 rows=122132 width=8) (actual time=164.437..164.437 rows=1 loops=6)"
"              Sort Key: block_number DESC"
"              Sort Method: top-N heapsort  Memory: 25kB"
"              Worker 0:  Sort Method: top-N heapsort  Memory: 25kB"
"              Worker 1:  Sort Method: top-N heapsort  Memory: 25kB"
"              Worker 2:  Sort Method: top-N heapsort  Memory: 25kB"
"              Worker 3:  Sort Method: top-N heapsort  Memory: 25kB"
"              Worker 4:  Sort Method: top-N heapsort  Memory: 25kB"
"              ->  Parallel Index Only Scan using header_cid_index on header_cids  (cost=0.55..26860.38 rows=122132 width=8) (actual time=0.069..155.067 rows=101663 loops=6)"
"                    Heap Fetches: 87419"
"Planning Time: 4.875 ms"
"Execution Time: 249.310 ms"

NEW:

"Limit  (cost=0.58..0.60 rows=1 width=8) (actual time=0.028..0.029 rows=1 loops=1)"
"  ->  Index Only Scan using blocks_block_number_idx on blocks  (cost=0.58..67751347.29 rows=2837035909 width=8) (actual time=0.028..0.028 rows=1 loops=1)"
"        Heap Fetches: 1"
"Planning Time: 0.091 ms"
"Execution Time: 0.047 ms"

@telackey telackey requested a review from i-norden March 9, 2023 22:11
@telackey telackey self-assigned this Mar 9, 2023
@telackey telackey requested a review from dboreham March 9, 2023 22:14
@i-norden (Collaborator) left a comment


LGTM! Is the poor performance due to a combination of TimescaleDB partitioning by block_number + using BRIN to index block_number, or is this a more general issue when doing ORDER BY on BRIN-indexed columns?

Note: we're also changing the BRIN => BTREE in v5 (due to this finding, previously mentioned by David). Once that is done, should we revert to querying eth.header_cids, since it is a much smaller table?

@dboreham

> LGTM! Is the poor performance due to a combination of TimescaleDB partitioning by block_number + using BRIN to index block_number, or is this a more general issue when doing ORDER BY on BRIN-indexed columns?
>
> Note: we're also changing the BRIN => BTREE in v5 (due to this finding, previously mentioned by David). Once that is done, should we revert to querying eth.header_cids, since it is a much smaller table?

It's both. The lack of a B-tree index seems to lobotomize Timescale's query planner such that it looks in all partitions when only the partition covering the highest block-number range would do. Then it also takes longer to scan each partition.

I'm not sure the size of the table matters. When executed efficiently this query will be very fast regardless of the size because it will only do one seek on the index.
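
If the v5 BRIN => BTREE change lands, a migration along these lines would let the planner prune down to the newest chunk again. This is a sketch only, with hypothetical index names:

```sql
-- Hypothetical v5 migration sketch: swap the BRIN for a B-tree so
-- TimescaleDB can prune partitions and seek the max directly.
DROP INDEX IF EXISTS eth.header_cids_block_number_brin;  -- assumed name
CREATE INDEX header_cids_block_number_btree
    ON eth.header_cids USING btree (block_number);
```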

@telackey telackey merged commit b42f96e into v4 Mar 10, 2023
@telackey telackey deleted the telackey/blocks branch March 10, 2023 02:03