-
Notifications
You must be signed in to change notification settings - Fork 48
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implemented SQL query plan explainer, added necessary indexes #728
base: next
Are you sure you want to change the base?
Conversation
…d of deadpool-sqlite
# Conflicts: # Cargo.lock # crates/store/src/db/migrations.rs # crates/store/src/db/mod.rs # crates/store/src/db/settings.rs # crates/store/src/db/sql/mod.rs # crates/store/src/db/sql/utils.rs # crates/store/src/db/tests.rs # crates/store/src/server/api.rs
crates/store/src/db/connection.rs
Outdated
#[cfg(not(feature = "explain-query-plans"))] | ||
impl Connection { | ||
#[inline] | ||
pub fn prepare_cached(&self, sql: &str) -> rusqlite::Result<rusqlite::CachedStatement<'_>> { | ||
self.inner.prepare_cached(sql) | ||
} | ||
|
||
#[inline] | ||
pub fn execute<P: rusqlite::Params>(&self, sql: &str, params: P) -> rusqlite::Result<usize> { | ||
self.inner.execute(sql, params) | ||
} | ||
|
||
#[inline] | ||
pub fn query_row<T, P, F>(&self, sql: &str, params: P, f: F) -> rusqlite::Result<T> | ||
where | ||
P: rusqlite::Params, | ||
F: FnOnce(&rusqlite::Row<'_>) -> rusqlite::Result<T>, | ||
{ | ||
self.inner.query_row(sql, params, f) | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason we need these and can't just use a transaction?
iiuc these are really just convenience features which automatically create and commit a transaction under the hood. I don't think we should be hiding transactions like this - its already happened that we used two connections in the same query instead of using an atomic transaction for both.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just tried to keep the same interface, but your point makes sense, I will use only transactions, thank you!
crates/store/src/db/tests.rs
Outdated
#[ctor::ctor] | ||
fn initialize() { | ||
miden_node_utils::logging::setup_tracing(miden_node_utils::logging::OpenTelemetry::Disabled) | ||
.expect("Failed to setup logging for tests"); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where is this used?
I think we also have a proc-macro for this already
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is needed for initializing logging in tests. If it was implemented somewhere else, please let me know, I haven't found it by myself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Our test-macro
crate I think. But you would need to annotate with
#[miden_node_test_macro::enable_logging]
I'm not sure how much of a fan I am of the macro tbh so maybe yours is better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you! I compared both solutions and must admit that even though mine is simpler and more convenient to use, it doesn't create span for each test as #[miden_node_test_macro::enable_logging]
does. And our enable_logging
ruins syntax coloring in my IDE which I would try to avoid. We could use https://github.com/d-e-s-o/test-log instead, but it's also not perfect.
self.statement.execute(params) | ||
} | ||
|
||
fn explain<P: Params>(&mut self, params: P) -> rusqlite::Result<()> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you explain what this is doing?
Could we instead just do a fn Transaction::query_plan_contains_scan() -> bool
function, and then in prepare_cached
etc we use it conditionally and do an assert?
pub fn prepare_cached(&self, sql: &str) -> rusqlite::Result<rusqlite::Statement<'_>> {
// Ensure our sql statements use indices and don't perform a row scan.
#[cfg(test)
assert!(self.query_plan_contains_scan().not());
self.inner().prepare_cached(sql)
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method generates query plan, formats it as tree and writes it to log along with an SQL query. We could simplify this by analyzing just rows of query plan returned by EXPLAIN QUERY PLAN ...
request, without formatting, but this would make logs less human-friendly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you provide a before/after example?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It depends on formatting. Current implementation generates tree like:
QUERY PLAN
├── SEARCH notes USING INDEX sqlite_autoindex_notes_1 (block_num=?)
├── SCALAR SUBQUERY 3
│ ├── MULTI-INDEX OR
│ │ ├── INDEX 1
│ │ │ ├── LIST SUBQUERY 1
│ │ │ │ └── SCAN rarray VIRTUAL TABLE INDEX 1:
│ │ │ └── SEARCH notes USING INDEX idx_notes_tag (tag=? AND block_num>?)
│ │ └── INDEX 2
│ │ ├── LIST SUBQUERY 2
│ │ │ └── SCAN rarray VIRTUAL TABLE INDEX 1:
│ │ └── SEARCH notes USING INDEX idx_notes_sender (sender=? AND block_num>?)
│ └── USE TEMP B-TREE FOR ORDER BY
├── LIST SUBQUERY 4
│ └── SCAN rarray VIRTUAL TABLE INDEX 1:
└── LIST SUBQUERY 5
└── SCAN rarray VIRTUAL TABLE INDEX 1:
If we get rid of formatting, it would look like:
┌────┬────────┬─────────┬─────────────────────────────────────────────────────────────────┐
│ id ┆ parent ┆ notused ┆ detail │
╞════╪════════╪═════════╪═════════════════════════════════════════════════════════════════╡
│ 3 ┆ 0 ┆ 0 ┆ SEARCH notes USING INDEX sqlite_autoindex_notes_1 (block_num=?) │
├╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ 6 ┆ 0 ┆ 0 ┆ SCALAR SUBQUERY 3 │
├╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ 16 ┆ 6 ┆ 0 ┆ SEARCH notes USING INDEX sqlite_autoindex_notes_1 (block_num>?) │
├╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ 25 ┆ 6 ┆ 0 ┆ LIST SUBQUERY 1 │
├╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ 27 ┆ 25 ┆ 0 ┆ SCAN rarray VIRTUAL TABLE INDEX 1: │
├╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ 45 ┆ 6 ┆ 0 ┆ LIST SUBQUERY 2 │
├╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ 47 ┆ 45 ┆ 0 ┆ SCAN rarray VIRTUAL TABLE INDEX 1: │
├╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ 73 ┆ 0 ┆ 0 ┆ LIST SUBQUERY 4 │
├╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ 75 ┆ 73 ┆ 0 ┆ SCAN rarray VIRTUAL TABLE INDEX 1: │
├╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ 93 ┆ 0 ┆ 0 ┆ LIST SUBQUERY 5 │
├╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ 95 ┆ 93 ┆ 0 ┆ SCAN rarray VIRTUAL TABLE INDEX 1: │
└────┴────────┴─────────┴─────────────────────────────────────────────────────────────────┘
To be clear, the latter example still uses formatting by external crate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see okay so its mimicking the format of sqlite3
by building a query tree and formatting it.
Its a bit confusing because the Tree
type is recursive and we're assuming that id
is strictly incrementing I think? Would be good to either comment this extensively, or find a simpler abstraction.
I know its not optimal (but this doesn't need to be), but what about something like:
// Only required so we can implement `Display` for it.
// Also a bad name but meh.
struct Element(u64, String)
struct QueryTree(Tree<Element>)
impl QueryTree {
fn new() -> Self { Self(Tree::new(Element(0, "QUERY PLAN")) }
fn insert(&mut self, parent: u64, element: Element) {
// Recursively search through all elements in tree already
// and insert new element as a child there.
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think, it's not that bad to rely on the current row ordering (we will explain this in comments), since it's not a critical feature, but needed only for debugging, and such implementation is simple and fast. And the official SQLIite CLI relies on this ordering as well, it's natural in terms of query plan generation and unlikely to be changed in future.
Co-authored-by: Mirko <48352201+Mirko-von-Leipzig@users.noreply.github.com>
Executor::Transaction(transaction) => transaction.prepare(&explain_sql)?, | ||
}; | ||
|
||
let mut rows = explain_stmt.query(params)?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this require actual params? With sqlite3
you just do EXPLAIN QUERY PLAN <statment>
without any params.
Would be nice to not need params at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea, this is not supported by ordinary query
(we have to fill all params or get "invalid parameter count" error otherwise), but it might work for raw_query
, I will check it, thanks.
enum Executor<'conn> { | ||
Connection(&'conn rusqlite::Connection), | ||
Transaction(&'conn rusqlite::Transaction<'conn>), | ||
} | ||
|
||
pub struct CachedStatementWithQueryPlan<'conn> { | ||
executor: Executor<'conn>, | ||
sql: Box<str>, | ||
statement: CachedStatement<'conn>, | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should be able to get rid of this if we only support Transaction
right?
I'm thinking it just becomes Transaction::check_query_plan(&self, sql: &str)
which is selectively enabled by #[cfg(test)]
.
# Conflicts: # crates/store/src/db/mod.rs
Resolves #712 and #57
In this PR we implemented SQL query plan explainer which writes to log something like a query plan below for each running query:
This works for both tests and running node, if feature
explain-query-plans
is enabled.In order to achieve this, we implemented our own SQLite connection pool which constructs special wrapper for each connection, returning special wrapper for each cached prepared statement. The latter is able to explain a query once it is being run. The same was done for transactions as well. This solution also helped us to get rid of deadpool-sqlite (all necessary functionality from it we have now). And configuring of connection now done in SQLite connection pool instead of hooking connection creation, it looks more elegant to me.
Also implemented some simple checking for problems in query plans during test run (test will fail, if any problem found in any query plan for queries run in tests).
We also added all necessary indices and foreign keys.