Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Left joins in SQLite can break the query macros #1249

Closed
ProjectMoon opened this issue May 25, 2021 · 9 comments
Closed

Left joins in SQLite can break the query macros #1249

ProjectMoon opened this issue May 25, 2021 · 9 comments

Comments

@ProjectMoon
Copy link

After some debugging by myself and some people in the #rust:mozilla.org Matrix room, we've come to the conclusion that I've found a bug in SQLx.

If I have two tables like this:

CREATE TABLE IF NOT EXISTS "accounts" (
    "user_id" TEXT PRIMARY KEY NOT NULL UNIQUE,
    "password" TEXT NULL
);

CREATE TABLE IF NOT EXISTS "user_state" (
    "user_id" TEXT PRIMARY KEY NOT NULL UNIQUE, 
    "active_room" TEXT, 
    "account_status" CHECK(account_status IN ('not_registered', 'registered', 'awaiting_activation')) NOT NULL
);

This query will break the query and query_as! macros:

SELECT
    a.user_id as "username",
    a.password,
    s.active_room,
    s.account_status
    FROM accounts a
    LEFT JOIN user_state s on a.user_id = s.user_id
    WHERE a.user_id = 'usernamehere';

The error message returned during compilation is proc macro panicked and help: message: no entry found for key. The code is located at src/sqlite/connection/explain.rs:149:32. This occurs on sqlx 0.5.x, including 0.5.5.

It seems specifically related to the LEFT JOIN. A regular JOIN will compile just fine.

@hectorj
Copy link

hectorj commented Jun 20, 2021

I have a similar case and the panic happens on this line:

for column in &r_cursor[&p1] {

(I manually called (&pool).describe(sqlQuery) to debug)

@hectorj
Copy link

hectorj commented Jun 20, 2021

@ProjectMoon 's example query still fails when reduced to

SELECT
    user_state.active_room
FROM accounts
     LEFT JOIN user_state ON accounts.user_id = user_state.user_id;

The EXPLAIN gives:

0,Init,0,19,0,,0,
1,OpenRead,2,3,0,"k(1,)",0,
2,OpenRead,1,4,0,2,0,
3,OpenRead,3,5,0,"k(1,)",2,
4,Rewind,2,18,1,0,0,
5,Integer,0,1,0,,0,
6,Column,2,0,2,,0,
7,SeekGE,3,13,2,1,0,
8,IdxGT,3,13,2,1,0,
9,DeferredSeek,3,0,1,,0,
10,Integer,1,1,0,,0,
11,Column,1,1,3,,0,
12,ResultRow,3,1,0,,0,
13,IfPos,1,17,0,,0,
14,NullRow,1,0,0,,0,
15,NullRow,3,0,0,,0,
16,Goto,0,10,0,,0,
17,Next,2,5,0,,1,
18,Halt,0,0,0,,0,
19,Transaction,0,0,4,1,1,
20,Goto,0,1,0,,0,

Edit: note that it doens't happen if we select only user_state.user_id, which gives this EXPLAIN:

0,Init,0,16,0,,0,
1,OpenRead,2,3,0,"k(1,)",0,
2,OpenRead,3,5,0,"k(1,)",2,
3,Rewind,2,15,1,0,0,
4,Integer,0,1,0,,0,
5,Column,2,0,2,,0,
6,SeekGE,3,11,2,1,0,
7,IdxGT,3,11,2,1,0,
8,Integer,1,1,0,,0,
9,Column,3,0,3,,0,
10,ResultRow,3,1,0,,0,
11,IfPos,1,14,0,,0,
12,NullRow,3,0,0,,0,
13,Goto,0,8,0,,0,
14,Next,2,4,0,,1,
15,Halt,0,0,0,,0,
16,Transaction,0,0,10,3,1,
17,Goto,0,1,0,,0,

@hectorj
Copy link

hectorj commented Jun 20, 2021

I haven't been able to reproduce the issue with tests/sqlite/sqlite.db.

I'm trying to understand the COLUMN and NullRow sqlite opcodes but I'm not there yet.

Adding a if r_cursor.contains_key(&p1) { check before the OP_NULL_ROW loop fixes the issue on my project, but I'm guessing we may be losing information on nullable columns in some cases. Maybe.

@casey
Copy link

casey commented Aug 24, 2021

I'm running into this as well. Are there any workarounds?

@bgwdotdev
Copy link

I think this might be related to SQLites query prover/planner, in specific:

https://www.sqlite.org/optoverview.html#the_left_join_strength_reduction_optimization
https://www.sqlite.org/optoverview.html#the_omit_left_join_optimization

If either of these situations are hit, then sqlite will change the query and I think this is what breaking the macro.

If I am understanding correctly, in the above provided example, it is hitting the second case of OMIT LEFT JOIN

  1. Either the query is DISTINCT or else the ON or USING clause on the LEFT JOIN constrains the join such that it matches only a single row

as user_state.user_idand account.user_id are UNIQUE, it will always return a single row,

I think a successful test of the left join strength reduction also causing a problem is:

A LEFT JOIN can sometimes be converted into an ordinary JOIN if there are terms in the WHERE clause that guarantee that the two joins will give identical results.

CREATE TABLE IF NOT EXISTS "accounts" (
    "user_id" TEXT PRIMARY KEY NOT NULL UNIQUE,
    "password" TEXT NULL
);

CREATE TABLE IF NOT EXISTS "user_state" (
    "row_id" integer primary key autoincrement,
    "user_id" TEXT NOT NULL, 
    "active_room" TEXT, 
    "account_status" CHECK(account_status IN ('not_registered', 'registered', 'awaiting_activation')) 
);

INSERT INTO accounts (user_id, password) VALUES 
    ('a', 'pass')
    , ('b', 'pass')
;

INSERT INTO user_state (user_id, active_room, account_status) VALUES
    ('a', 'room', 'not_registered')
    , ('a', 'place', 'not_registered')
    , ('b', 'room', 'not_registered')
;

-- this breaks as it reduces to a JOIN in the query prover
SELECT
    a.user_id as "username",
    a.password,
    s.active_room,
    s.account_status
FROM accounts a
LEFT JOIN user_state s on a.user_id = s.user_id
WHERE a.user_id = 'a'
    AND (s.active_room = 'room' OR s.active_room = 'place')
;
-- this breaks as it reduces to a JOIN in the query prover
SELECT
    a.user_id as "username",
    a.password,
    s.active_room,
    s.account_status
FROM accounts a
LEFT JOIN user_state s on a.user_id = s.user_id
WHERE a.user_id = 'a'
;

-- this works as it has different results from a JOIN
SELECT
    a.user_id as "username",
    a.password,
    s.active_room,
    s.account_status
FROM accounts a
LEFT JOIN user_state s on a.user_id = s.user_id
WHERE a.user_id = 'a'
    AND s.active_room = 'room'
;

bgwdotdev added a commit to bgwdotdev/libre-oneroster that referenced this issue Sep 9, 2021
Currently unknown issue, may be related to[0] though likely something
else entirely. Further testing to be done at a later date.

[0] launchbadge/sqlx#1249
@aidanhs
Copy link

aidanhs commented Nov 22, 2021

Downgrading to sqlx 0.4 lets me compile with left outer joins that are failing on 0.5

@nicoxxl
Copy link

nicoxxl commented Feb 17, 2022

Reproduced with version 0.5.10

If you add this to tests/sqlite/describe.rs, it reproduce it :

#[sqlx_macros::test]
async fn it_left_join() -> anyhow::Result<()> {
    let mut conn = new::<Sqlite>().await?;

    let info = conn.describe("SELECT tweet.id, accounts.id FROM tweet LEFT JOIN accounts ON tweet.owner_id = accounts.id").await?;

    Ok(())
}

This is weird as, near the end of the file, there is this line, very close from the first one in it_describes_left_join :

    let d = conn
        .describe(
            "select tweet.id, accounts.id from accounts left join tweet on owner_id = accounts.id",
        )
        .await?;

And the _explain_s are really different. My addition :

$ sqlite3 sqlite/sqlite.db 
SQLite version 3.37.2 2022-01-06 13:25:41
Enter ".help" for usage hints.
sqlite> EXPLAIN SELECT tweet.id, accounts.id FROM tweet LEFT JOIN accounts ON tweet.owner_id = accounts.id;
addr  opcode         p1    p2    p3    p4             p5  comment      
----  -------------  ----  ----  ----  -------------  --  -------------
0     Init           0     16    0                    0   Start at 16
1     OpenRead       0     2     0     4              0   root=2 iDb=0; tweet
2     OpenRead       1     6     0     0              0   root=6 iDb=0; accounts
3     Rewind         0     15    0                    0   
4       Integer        0     1     0                    0   r[1]=0; init LEFT JOIN no-match flag
5       Column         0     3     2                    0   r[2]=tweet.owner_id
6       SeekRowid      1     11    2                    0   intkey=r[2]
7       Integer        1     1     0                    0   r[1]=1; record LEFT JOIN hit
8       Column         0     0     3                    0   r[3]=tweet.id
9       Rowid          1     4     0                    0   r[4]=rowid
10      ResultRow      3     2     0                    0   output=r[3..4]
11      IfPos          1     14    0                    0   if r[1]>0 then r[1]-=0, goto 14
12      NullRow        1     0     0                    0   
13      Goto           0     7     0                    0   
14    Next           0     4     0                    1   
15    Halt           0     0     0                    0   
16    Transaction    0     0     20    0              1   usesStmtJournal=0
17    Goto           0     1     0                    0   

And the one present in the source code :

$ sqlite3 sqlite/sqlite.db 
SQLite version 3.37.2 2022-01-06 13:25:41
Enter ".help" for usage hints.
sqlite> EXPLAIN select tweet.id, accounts.id from accounts left join tweet on owner_id = accounts.id;
addr  opcode         p1    p2    p3    p4             p5  comment      
----  -------------  ----  ----  ----  -------------  --  -------------
0     Init           0     27    0                    0   Start at 27
1     OpenRead       0     6     0     0              0   root=6 iDb=0; accounts
2     OpenRead       1     2     0     4              0   root=2 iDb=0; tweet
3     Rewind         0     26    0                    0   
4       Once           0     13    0                    0   
5       OpenAutoindex  2     3     0     k(3,B,,)       0   nColumn=3; for tweet
6       Rewind         1     13    0                    0   
7         Column         1     3     2                    0   r[2]=tweet.owner_id
8         Column         1     0     3                    0   r[3]=tweet.id
9         Rowid          1     4     0                    0   r[4]=rowid
10        MakeRecord     2     3     1                    0   r[1]=mkrec(r[2..4])
11        IdxInsert      2     1     0                    16  key=r[1]
12      Next           1     7     0                    3   
13      Integer        0     5     0                    0   r[5]=0; init LEFT JOIN no-match flag
14      Rowid          0     6     0                    0   r[6]=rowid
15      SeekGE         2     22    6     1              0   key=r[6]
16        IdxGT          2     22    6     1              0   key=r[6]
17        Integer        1     5     0                    0   r[5]=1; record LEFT JOIN hit
18        Column         2     1     7                    0   r[7]=tweet.id
19        Rowid          0     8     0                    0   r[8]=rowid
20        ResultRow      7     2     0                    0   output=r[7..8]
21      Next           2     16    0                    0   
22      IfPos          5     25    0                    0   if r[5]>0 then r[5]-=0, goto 25
23      NullRow        2     0     0                    0   
24      Goto           0     17    0                    0   
25    Next           0     4     0                    1   
26    Halt           0     0     0                    0   
27    Transaction    0     0     20    0              1   usesStmtJournal=0
28    Goto           0     1     0                    0   

@serzhshakur
Copy link

@abonander any thoughts on this?

@rustdesk
Copy link

@abonander any thoughts on this?

As a work around, I remove OP_NULL_ROW support, open-trade@81392f0

tyrelr added a commit to tyrelr/sqlx that referenced this issue Apr 1, 2022
tyrelr added a commit to tyrelr/sqlx that referenced this issue Apr 7, 2022
tyrelr added a commit to tyrelr/sqlx that referenced this issue Apr 7, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

8 participants