-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: events: address sqlite index selection performance regressions #12261
Conversation
0ba0a49
to
9ae5f80
Compare
cd53adf
to
62d3c16
Compare
62d3c16
to
c8b0d6d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
Let's test this on a mainnet node and ideally ask Glif/Mikers to deploy it and get the RPC timings out to see the impact.
2619e24
to
dc9cf61
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow - awesome work investigating this. I have been living in a world of assuming the DB engine would "do the right thing". Good to know that verification is needed. Well done!
dc9cf61
to
40aa442
Compare
Fixes: #12255 SQLite was found to be avoiding the intended initial indexes for some types of complex queries and opting for minor indexes which don't narrow down the search space enough. Specifically we want queries to first either narrow by height or tipset_key_cid and then apply other criteria. Having alternative indexes when a query such as `height>=X AND height<=Y` are encountered cause SQLite to avoid the height index entirely. By removing additional indexes that could be used during the main query path (prefillFilter), we force SQLite to use the intended indexes and narrow the results without the use of indexes.
40aa442
to
7d9e21e
Compare
Edit: If you want to take advantage of the performance gains from this change without upgrading to a version of Lotus that includes this fix, you can do it manually for yourself in a safe way that will still work when you eventually do upgrade Lotus to include this fix.
With
sqlite3
installed, substituting the correct path for/path/to/lotus/repo
, run:Fixes: #12255
SQLite was found to be avoiding the intended initial indexes for some types of complex queries and opting for minor indexes which don't narrow down the search space enough. Specifically we want queries to first either narrow by height or tipset_key_cid and then apply other criteria. Having alternative indexes when a query such as
height>=X AND height<=Y
are encountered cause SQLite to avoid the height index entirely. By removing additional indexes that could be used during the main query path (prefillFilter), we force SQLite to use the intended indexes and narrow the results without the use of indexes.Some examples and numbers. These approximate the queries generated by
prefillFilter()
for various inputs and are run against my 40GiB events db containing 380,025 epochs. I've got 2 copies of it, one with the schema as in this PR and one with the schema as in master and I'm running and timing these queries directly on the commandline.Basic select using
height=
(note the current version doesn't even use this, it's always >= & <=)
Before: 3.05s, after: 0.0s - the reason for the slowness is that it chooses the
reverted
index instead ofheight
, for reasons I can't fathom. This could be fixed by having a compound index on the two, but that fails for other cases.Basic select using height>= and height <= for single height
Results in the same output as above.
Before: 3.10, after: 0.0s
Basic select using tipset_key_cid
Same results as above but using the tipset instead of height.
Before: 3.42, after: 0.0s - it's again using
reverted
for some reason, and then matching on tsk cid which is slower because it's a large bytes match.Select at height and match emitter_addr
Again, this isn't quite what master does because we don't have
height=X
there, but close:Before: 3.15s, after: 0.0s -
reverted
again, but if I remove thereverted
clause here it goes back to theheight
index rather than theemitter_addr
index.Select at height >= and <= and match emitter_addr
Same as above but with
height>=X AND height <=X
Before: 3.19s, after: 0.0s
Select at tipset and match emitter_addr
Returns the same results as the above 2 but uses tsk
Before: 3.35s, after: 0.0s
Select event_entry properties
Now we get interesting, this is the kind of thing you might get through
GetActorEventsRaw
oreth_getLogs
searching for a particular log output.prefillFilter()
builds these queries, they're a bit nested and complex but trigger a bunch of different indices:Before: 0.73s, after: 0.1s.
EXPLAIN QUERY PLAN
on this is interesting, it's still avoiding the index we want to hit (height
) but it's finding one that'll narrow it down enough so the resultset isn't too big to filter further down:Before:
After:
I'm using
reverted
in here becauseeth_*
will use that, but it's not always the case if you useGetActorEventsRaw
so the speed may vary if it hits alternative indexes.After all of this I still don't know what criteria SQLite is using to select its index ordering. Perhaps it's to do with the efficiency of the indexes? Bool index is better than int index is better than bytes index? I don't really understand the use of
event_entry_codec_value
in the last one though, but maybe that's a special case because of nesting?