Implemented SQL query plan explainer, added necessary indexes #728

polydez · 2025-03-06T14:22:38Z

Resolves #712 and #57

This PR implements a SQL query plan explainer that writes the query plan of each executed query to the log. For example:

>> EXPLAIN QUERY PLAN -- Selects new notes matching the tags and account IDs search criteria.
SELECT
    block_num,
    batch_index,
    note_index,
    note_id,
    note_type,
    sender,
    tag,
    aux,
    execution_hint,
    merkle_path
FROM
    notes
WHERE
    -- find the next block which contains at least one note with a matching tag or sender
    block_num = (
        SELECT
            block_num
        FROM
            notes
        WHERE
            (tag IN rarray(NULL) OR sender IN rarray(NULL)) AND
            block_num > 0
        ORDER BY
            block_num ASC
    LIMIT 1) AND
    -- filter the block's notes and return only the ones matching the requested tags or senders
    (tag IN rarray(NULL) OR sender IN rarray(NULL))


QUERY PLAN
├── SEARCH notes USING INDEX sqlite_autoindex_notes_1 (block_num=?)
├── SCALAR SUBQUERY 3
│   ├── MULTI-INDEX OR
│   │   ├── INDEX 1
│   │   │   ├── LIST SUBQUERY 1
│   │   │   │   └── SCAN rarray VIRTUAL TABLE INDEX 1:
│   │   │   └── SEARCH notes USING INDEX idx_notes_tag (tag=? AND block_num>?)
│   │   └── INDEX 2
│   │       ├── LIST SUBQUERY 2
│   │       │   └── SCAN rarray VIRTUAL TABLE INDEX 1:
│   │       └── SEARCH notes USING INDEX idx_notes_sender (sender=? AND block_num>?)
│   └── USE TEMP B-TREE FOR ORDER BY
├── LIST SUBQUERY 4
│   └── SCAN rarray VIRTUAL TABLE INDEX 1:
└── LIST SUBQUERY 5
    └── SCAN rarray VIRTUAL TABLE INDEX 1:

This works both in tests and on a running node when the explain-query-plans feature is enabled.

To achieve this, we implemented our own SQLite connection pool. It constructs a special wrapper for each connection, which in turn returns a special wrapper for each cached prepared statement. The statement wrapper explains the query when it is run. The same was done for transactions. This solution also let us get rid of deadpool-sqlite, since we now have all the functionality we needed from it. Connections are now configured in the SQLite connection pool instead of by hooking connection creation, which looks more elegant to me.

We also implemented a simple check for problems in query plans during test runs: a test fails if a problem is found in the plan of any query it runs.
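The PR does not spell out which plan rows count as a "problem", so the following is only a minimal sketch of one possible check, not the code from this PR. It assumes a problem means a full table scan, that is, a `detail` row starting with `SCAN` that mentions neither an index nor a virtual table such as `rarray`. The helper name `find_full_scans` is hypothetical.

```rust
/// Hypothetical helper (not from the PR): returns the `detail` rows of an
/// `EXPLAIN QUERY PLAN` result that look like full table scans.
///
/// Assumption: a row is suspicious when it starts with `SCAN` and does not
/// mention an index or a virtual table (e.g. `rarray`).
fn find_full_scans<'a>(details: &[&'a str]) -> Vec<&'a str> {
    details
        .iter()
        .copied()
        .filter(|detail| {
            let detail = detail.trim();
            detail.starts_with("SCAN ")
                && !detail.contains("VIRTUAL TABLE")
                && !detail.contains("USING INDEX")
                && !detail.contains("USING COVERING INDEX")
        })
        .collect()
}
```

A test harness could then fail with something like `assert!(find_full_scans(&details).is_empty())` after collecting the `detail` column of each plan. Whether rows such as `USE TEMP B-TREE FOR ORDER BY` should also be flagged is a separate design choice that this sketch leaves open.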

We also added all necessary indices and foreign keys.


# Conflicts: # Cargo.lock # crates/store/src/db/migrations.rs # crates/store/src/db/mod.rs # crates/store/src/db/settings.rs # crates/store/src/db/sql/mod.rs # crates/store/src/db/sql/utils.rs # crates/store/src/db/tests.rs # crates/store/src/server/api.rs

Mirko-von-Leipzig · 2025-03-10T14:22:22Z

crates/store/src/db/connection.rs

+#[cfg(not(feature = "explain-query-plans"))]
+impl Connection {
+    #[inline]
+    pub fn prepare_cached(&self, sql: &str) -> rusqlite::Result<rusqlite::CachedStatement<'_>> {
+        self.inner.prepare_cached(sql)
+    }
+
+    #[inline]
+    pub fn execute<P: rusqlite::Params>(&self, sql: &str, params: P) -> rusqlite::Result<usize> {
+        self.inner.execute(sql, params)
+    }
+
+    #[inline]
+    pub fn query_row<T, P, F>(&self, sql: &str, params: P, f: F) -> rusqlite::Result<T>
+    where
+        P: rusqlite::Params,
+        F: FnOnce(&rusqlite::Row<'_>) -> rusqlite::Result<T>,
+    {
+        self.inner.query_row(sql, params, f)
+    }
+}


Is there a reason we need these and can't just use a transaction?

iiuc these are really just convenience features which automatically create and commit a transaction under the hood. I don't think we should be hiding transactions like this; it has already happened that we used two connections in the same query instead of using an atomic transaction for both.

I just tried to keep the same interface, but your point makes sense, I will use only transactions, thank you!

Mirko-von-Leipzig · 2025-03-10T14:26:46Z

crates/store/src/db/tests.rs

+#[ctor::ctor]
+fn initialize() {
+    miden_node_utils::logging::setup_tracing(miden_node_utils::logging::OpenTelemetry::Disabled)
+        .expect("Failed to setup logging for tests");
+}


Where is this used?

I think we also have a proc-macro for this already

This is needed to initialize logging in tests. If this is already implemented somewhere else, please let me know; I couldn't find it myself.

Our test-macro crate I think. But you would need to annotate with

#[miden_node_test_macro::enable_logging]

I'm not sure how much of a fan I am of the macro tbh so maybe yours is better.

Thank you! I compared both solutions and must admit that even though mine is simpler and more convenient to use, it doesn't create a span for each test as #[miden_node_test_macro::enable_logging] does. Our enable_logging also breaks syntax coloring in my IDE, which I would like to avoid. We could use https://github.com/d-e-s-o/test-log instead, but it's also not perfect.

CHANGELOG.md

crates/store/src/db/pool_manager.rs

Mirko-von-Leipzig · 2025-03-10T16:23:47Z

crates/store/src/db/sql/plan_explainer.rs

+        self.statement.execute(params)
+    }
+
+    fn explain<P: Params>(&mut self, params: P) -> rusqlite::Result<()> {


Could you explain what this is doing?

Could we instead just do a fn Transaction::query_plan_contains_scan() -> bool function, and then in prepare_cached etc we use it conditionally and do an assert?

pub fn prepare_cached(&self, sql: &str) -> rusqlite::Result<rusqlite::Statement<'_>> {
    // Ensure our sql statements use indices and don't perform a row scan.
    #[cfg(test)]
    assert!(self.query_plan_contains_scan().not());

    self.inner().prepare_cached(sql)
}

This method generates the query plan, formats it as a tree and writes it to the log along with the SQL query. We could simplify this by analyzing only the raw rows returned by the EXPLAIN QUERY PLAN ... request, without formatting, but that would make the logs less human-friendly.

Can you provide a before/after example?

It depends on formatting. The current implementation generates a tree like:

QUERY PLAN
├── SEARCH notes USING INDEX sqlite_autoindex_notes_1 (block_num=?)
├── SCALAR SUBQUERY 3
│   ├── MULTI-INDEX OR
│   │   ├── INDEX 1
│   │   │   ├── LIST SUBQUERY 1
│   │   │   │   └── SCAN rarray VIRTUAL TABLE INDEX 1:
│   │   │   └── SEARCH notes USING INDEX idx_notes_tag (tag=? AND block_num>?)
│   │   └── INDEX 2
│   │       ├── LIST SUBQUERY 2
│   │       │   └── SCAN rarray VIRTUAL TABLE INDEX 1:
│   │       └── SEARCH notes USING INDEX idx_notes_sender (sender=? AND block_num>?)
│   └── USE TEMP B-TREE FOR ORDER BY
├── LIST SUBQUERY 4
│   └── SCAN rarray VIRTUAL TABLE INDEX 1:
└── LIST SUBQUERY 5
    └── SCAN rarray VIRTUAL TABLE INDEX 1:

If we get rid of formatting, it would look like:

┌────┬────────┬─────────┬─────────────────────────────────────────────────────────────────┐ │ id ┆ parent ┆ notused ┆ detail │ ╞════╪════════╪═════════╪═════════════════════════════════════════════════════════════════╡ │ 3 ┆ 0 ┆ 0 ┆ SEARCH notes USING INDEX sqlite_autoindex_notes_1 (block_num=?) │ ├╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤ │ 6 ┆ 0 ┆ 0 ┆ SCALAR SUBQUERY 3 │ ├╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤ │ 16 ┆ 6 ┆ 0 ┆ SEARCH notes USING INDEX sqlite_autoindex_notes_1 (block_num>?) │ ├╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤ │ 25 ┆ 6 ┆ 0 ┆ LIST SUBQUERY 1 │ ├╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤ │ 27 ┆ 25 ┆ 0 ┆ SCAN rarray VIRTUAL TABLE INDEX 1: │ ├╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤ │ 45 ┆ 6 ┆ 0 ┆ LIST SUBQUERY 2 │ ├╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤ │ 47 ┆ 45 ┆ 0 ┆ SCAN rarray VIRTUAL TABLE INDEX 1: │ ├╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤ │ 73 ┆ 0 ┆ 0 ┆ LIST SUBQUERY 4 │ ├╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤ │ 75 ┆ 73 ┆ 0 ┆ SCAN rarray VIRTUAL TABLE INDEX 1: │ ├╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤ │ 93 ┆ 0 ┆ 0 ┆ LIST SUBQUERY 5 │ ├╌╌╌╌┼╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤ │ 95 ┆ 93 ┆ 0 ┆ SCAN rarray VIRTUAL TABLE INDEX 1: │ └────┴────────┴─────────┴─────────────────────────────────────────────────────────────────┘

To be clear, the latter example is still formatted by an external crate.

I see, okay, so it's mimicking the format of sqlite3 by building a query tree and formatting it.

It's a bit confusing because the Tree type is recursive and we're assuming that id is strictly incrementing, I think? It would be good to either comment this extensively or find a simpler abstraction.

I know it's not optimal (and it doesn't need to be), but what about something like:

// Only required so we can implement `Display` for it.
// Also a bad name but meh.
struct Element(u64, String);

struct QueryTree(Tree<Element>);

impl QueryTree {
    fn new() -> Self {
        Self(Tree::new(Element(0, "QUERY PLAN".to_string())))
    }

    fn insert(&mut self, parent: u64, element: Element) {
        // Recursively search through all elements in tree already
        // and insert new element as a child there.
    }
}

I think it's not that bad to rely on the current row ordering (we will explain this in comments): this is not a critical feature, it is needed only for debugging, and such an implementation is simple and fast. The official SQLite CLI relies on this ordering as well; it's natural in terms of query plan generation and unlikely to change in the future.
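To make the ordering assumption from this thread concrete, here is a minimal sketch of turning flat `(id, parent, detail)` rows into the tree shown in the logs. It is not the implementation from this PR: the names `PlanRow`, `Node`, `render_plan` and `write_children` are hypothetical, and it uses a flat arena with an id lookup instead of a recursive `Tree` type. It assumes, as discussed above, that every row appears after its parent; a row whose parent has not been seen yet is attached to the root.

```rust
use std::collections::HashMap;

/// One row of `EXPLAIN QUERY PLAN` output (the `notused` column is omitted).
struct PlanRow {
    id: u64,
    parent: u64,
    detail: String,
}

/// Arena node: children are stored as indices into the `nodes` vector.
struct Node {
    detail: String,
    children: Vec<usize>,
}

/// Builds the tree in one pass and renders it with box-drawing characters.
fn render_plan(rows: &[PlanRow]) -> String {
    // Index 0 is a synthetic root that stands for plan id 0.
    let mut nodes = vec![Node { detail: "QUERY PLAN".to_string(), children: Vec::new() }];
    let mut index_by_id: HashMap<u64, usize> = HashMap::from([(0, 0)]);

    for row in rows {
        let idx = nodes.len();
        nodes.push(Node { detail: row.detail.clone(), children: Vec::new() });
        // Ordering assumption: the parent row was already inserted.
        let parent_idx = index_by_id.get(&row.parent).copied().unwrap_or(0);
        nodes[parent_idx].children.push(idx);
        index_by_id.insert(row.id, idx);
    }

    let mut out = format!("{}\n", nodes[0].detail);
    write_children(&nodes, 0, "", &mut out);
    out
}

/// Recursively writes the children of `idx`, extending the prefix per level.
fn write_children(nodes: &[Node], idx: usize, prefix: &str, out: &mut String) {
    let children = &nodes[idx].children;
    for (i, &child) in children.iter().enumerate() {
        let last = i + 1 == children.len();
        out.push_str(prefix);
        out.push_str(if last { "└── " } else { "├── " });
        out.push_str(&nodes[child].detail);
        out.push('\n');
        let child_prefix = format!("{prefix}{}", if last { "    " } else { "│   " });
        write_children(nodes, child, &child_prefix, out);
    }
}
```

Because the lookup is keyed by id, this sketch only needs parents to precede their children; it does not need ids to be strictly increasing.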


Mirko-von-Leipzig · 2025-03-11T13:42:27Z

crates/store/src/db/sql/plan_explainer.rs

+            Executor::Transaction(transaction) => transaction.prepare(&explain_sql)?,
+        };
+
+        let mut rows = explain_stmt.query(params)?;


Does this require actual params? With sqlite3 you just do EXPLAIN QUERY PLAN <statement> without any params.

Would be nice to not need params at all.

Good idea. This is not supported by an ordinary query (we have to bind all params, otherwise we get an "invalid parameter count" error), but it might work for raw_query. I will check, thanks.

Mirko-von-Leipzig · 2025-03-11T13:45:31Z

crates/store/src/db/sql/plan_explainer.rs

+enum Executor<'conn> {
+    Connection(&'conn rusqlite::Connection),
+    Transaction(&'conn rusqlite::Transaction<'conn>),
+}
+
+pub struct CachedStatementWithQueryPlan<'conn> {
+    executor: Executor<'conn>,
+    sql: Box<str>,
+    statement: CachedStatement<'conn>,
+}


We should be able to get rid of this if we only support Transaction right?

I'm thinking it just becomes Transaction::check_query_plan(&self, sql: &str) which is selectively enabled by #[cfg(test)].


polydez added 17 commits February 27, 2025 21:19

fix: typo

3594181

fix: doctest error

8f139e2

refactor: add foreign keys, remove unnecessary constraints, fix tests

85fcda3

feat: implement query plan printing to STDOUT

7d63ddf

feat: expand parameters in query plan printing SQL

4c0b29b

feat: add support for BLOB

3d25559

refactor: visualize query plan as tree

e664814

refactor: implement pool manager and rewrite solution onto it, get ri…

a950038

…d of deadpool-sqlite

fix: clippy warnings

5cab211

fix: printing

78f84c2

feat: add necessary indices

e42b4fe

feat: add necessary indices for notes

f0105dc

feat: printing improvement

d0a4231

test: simply examine query plans in tests

f41c1c4

feat: print to log instead of STDOUT

379abc6

feat: add explain-query-plans feature to miden-node crate

aa94bc9

docs: update CHANGELOG.md

10852ea

polydez requested review from Mirko-von-Leipzig and bobbinth March 6, 2025 14:25

formatting: reformat toml-files

80673e5

polydez marked this pull request as ready for review March 6, 2025 14:36

polydez added 2 commits March 10, 2025 22:01

fix: compilation errors/warnings

3bfd81d

Mirko-von-Leipzig mentioned this pull request Mar 10, 2025

store: consider typed sqlite getters using an extension trait #442

Open

Mirko-von-Leipzig reviewed Mar 10, 2025


polydez and others added 5 commits March 11, 2025 15:49

format: format using rustfmt

ddfb91c

refactor: use transactions for any query

4ad9117

Update CHANGELOG.md

5e98a64

Co-authored-by: Mirko <48352201+Mirko-von-Leipzig@users.noreply.github.com>

refactor: make new_connection private

d5b6ad0

format: reformat code by using rustfmt

df27ca3

refactor: use proc-macro for logging initialization in all DB-tests

f4287e3

Mirko-von-Leipzig reviewed Mar 11, 2025


polydez added 4 commits March 12, 2025 18:53

refactor: address review comments

329133f

fix: clippy false positive warning

bcbb208

Merge branch 'next' into polydez-db-indexes

9b0ac2a

# Conflicts: # crates/store/src/db/mod.rs

formatting: reformat toml

1b40c84
