Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wrong SQL parentheses #2601

Closed
petergummer opened this issue Dec 16, 2022 · 11 comments
Closed

Wrong SQL parentheses #2601

petergummer opened this issue Dec 16, 2022 · 11 comments
Labels
duplicate This issue or pull request already exists

Comments

@petergummer
Copy link

petergummer commented Dec 16, 2022

7.0.0 is not generating correct parentheses with some queries. It generates the following SQL, which selects far too many rows. The problem appears to be that the third OR EXISTS is not correctly nested, causing the earlier AND conditions to be ignored when it's evaluated:

SELECT t."Id", t."Data", t."Date", t."Number", ...
FROM (
    SELECT f."Id", r."Data", r."Date", r."Number", ...
    FROM "Table1" AS f
    INNER JOIN "Table2" AS r ON f."Id" = r."Id"
    WHERE r."Table3Id" = @__table3_Id_0 AND r."Date" >= @__dates_Start_1 AND r."Date" < @__dates_End_2 AND (EXISTS (
        SELECT 1
        FROM "Table4" AS f0
        WHERE f."Id" = f0."Table1Id" AND f0."Identifier" = 'xxx' AND f0."Value" = @__xxx_3) OR EXISTS (
        SELECT 1
        FROM "Table4" AS f1
        WHERE f."Id" = f1."Table1Id" AND f1."Identifier" = 'xxx' AND f1."Value" = @__xxx_4)) OR EXISTS (
        SELECT 1
        FROM "Table4" AS f2
        WHERE f."Id" = f2."Table1Id" AND f2."Identifier" = 'xxx' AND f2."Value" = @__xxx_5)
    ORDER BY r."Number"
    LIMIT @__p_7 OFFSET @__p_6
) AS t
LEFT JOIN "Table4" AS f3 ON t."Id" = f3."Table1Id"
ORDER BY t."Number", t."Id", f3."Table1Id"

7.0.1-ci.20221201T124005+sha.09bc34fd0 still generates the same incorrect SQL.

6.07 generates this, which is exactly the same SQL except that it has more parentheses that select the correct rows:

SELECT t."Id", t."Data", t."Date", t."Number", ...
FROM (
    SELECT f."Id", r."Data", r."Date", r."Number", ...
    FROM "Table1" AS f
    INNER JOIN "Table2" AS r ON f."Id" = r."Id"
    WHERE (((r."Table3Id" = @__table3_Id_0) AND (r."Date" >= @__dates_Start_1)) AND (r."Date" < @__dates_End_2)) AND ((EXISTS (
        SELECT 1
        FROM "Table4" AS f0
        WHERE (f."Id" = f0."Table1Id") AND ((f0."Identifier" = 'xxx') AND (f0."Value" = @__xxx_3))) OR EXISTS (
        SELECT 1
        FROM "Table4" AS f1
        WHERE (f."Id" = f1."Table1Id") AND ((f1."Identifier" = 'xxx') AND (f1."Value" = @__xxx_4)))) OR EXISTS (
        SELECT 1
        FROM "Table4" AS f2
        WHERE (f."Id" = f2."Table1Id") AND ((f2."Identifier" = 'xxx') AND (f2."Value" = @__xxx_5))))
    ORDER BY r."Number"
    LIMIT @__p_7 OFFSET @__p_6
) AS t
LEFT JOIN "Table4" AS f3 ON t."Id" = f3."Table1Id"
ORDER BY t."Number", t."Id", f3."Table1Id"
@roji
Copy link
Member

roji commented Dec 16, 2022

@petergummer thanks for filing this. Can you please provide a minimal code sample, including the LINQ query and model which trigger this query? I suspect this may be a general EF issue rather than an EFCore.PG one.

@roji
Copy link
Member

roji commented Dec 21, 2022

@petergummer ping, if there's a bug here (and it seems that there is) it's quite important that we find the root cause and fix it ASAP.

@petergummer
Copy link
Author

petergummer commented Dec 23, 2022

@roji, I've finally managed to reproduce the problem with an isolated program, using version 7.0.1.

The short story is that to replicate it, the expression tree has to use Or rather than OrElse. It seems quite similar to dotnet/efcore#29222 which was closed a couple of months ago.

This program uses | in the second Where clause.

  • In the SQL, this shifts the closing parenthesis after the 2nd condition.
  • If I use || then the closing parenthesis remains correctly at the end of the 3rd condition.

An interesting variation of this problem occurs if I delete the first Where clause.

  • With | the parenthesis is still in the wrong place; but,
  • If I use || then the enclosing parentheses are removed completely (which makes sense, since they are redundant when there's only one Where clause).
using Microsoft.EntityFrameworkCore;

var ctx = new Ctx();

var query = ctx.Bunches
    .Where(b => b.Label == "coconuts") // To see an interesting variation, delete this Where clause.
    .Where(b =>
        b.Things.Any(t => t.Name == "1st") |
        b.Things.Any(t => t.Name == "2nd") |
        b.Things.Any(t => t.Name == "3rd"))
    .Include(b => b.Things);

Console.WriteLine(query.ToQueryString());

public class Thing
{
    public int BunchId { get; set; }

    public virtual Bunch Bunch { get; set; } = null!;

    public string Name { get; set; } = null!;
}

public class Bunch
{
    public int Id { get; set; }

    public string Label { get; set; } = null!;

    public virtual ISet<Thing> Things { get; private set; } = new HashSet<Thing>();
}

public class Ctx : DbContext
{
    public DbSet<Bunch> Bunches { get; set; } = null!;

    protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder) => optionsBuilder.UseNpgsql();

    protected override void OnModelCreating(ModelBuilder builder)
    {
        builder.Entity<Bunch>().ToTable(nameof(Bunch));
        builder.Entity<Thing>().HasKey(t => new { t.BunchId, t.Name });
        builder.Entity<Thing>().HasOne(t => t.Bunch)
            .WithMany(g => g.Things).HasForeignKey(t => t.BunchId).IsRequired();
    }
}

This program outputs the following SQL:

SELECT b."Id", b."Label", t2."BunchId", t2."Name"
FROM "Bunch" AS b
LEFT JOIN "Thing" AS t2 ON b."Id" = t2."BunchId"
WHERE b."Label" = 'coconuts' AND (EXISTS (
    SELECT 1
    FROM "Thing" AS t
    WHERE b."Id" = t."BunchId" AND t."Name" = '1st') OR EXISTS (
    SELECT 1
    FROM "Thing" AS t0
    WHERE b."Id" = t0."BunchId" AND t0."Name" = '2nd')) OR EXISTS (
    SELECT 1
    FROM "Thing" AS t1
    WHERE b."Id" = t1."BunchId" AND t1."Name" = '3rd')
ORDER BY b."Id", t2."BunchId"

You may be wondering why my original code was using Or. This part of the expression tree is generated like so:

Expression.Lambda<Func<T, bool>>(expressions.Skip(1).Aggregate(
    expressions[0],
    (accumulator, next) => Expression.Or(accumulator, next)),
    anEntity)

I've confirmed that the correct SQL is generated if I change this to OrElse.

@petergummer
Copy link
Author

@roji, has this problem been looked at?

@roji
Copy link
Member

roji commented Mar 9, 2023

@petergummer sorry, your comment slipped under my radar...

This seems to be a duplicate of dotnet/efcore#30181, which is being fixed for 8.0 via new parentheses logic in dotnet/efcore#27177.

So I'm going to go ahead close this issue as a duplicate of that; but if there's something else at play feel free to post back and I'll reopen as needed.

@roji
Copy link
Member

roji commented Mar 9, 2023

Duplicate of dotnet/efcore#30181

@roji roji marked this as a duplicate of dotnet/efcore#30181 Mar 9, 2023
@roji roji closed this as not planned Won't fix, can't repro, duplicate, stale Mar 9, 2023
@roji roji added the duplicate This issue or pull request already exists label Mar 9, 2023
@petergummer
Copy link
Author

I'm sorry to report that the problem is not fixed yet in the latest prerelease 8.0.0-preview.1, @roji. My test program above outputs exactly the same as 7.0.1 did. The extra closing parenthesis is still there:

WHERE b."Id" = t0."BunchId" AND t0."Name" = '2nd')) OR EXISTS (

@roji
Copy link
Member

roji commented Mar 16, 2023

@petergummer I never wrote the bug was fixed for 8.0.0-preview.1... This is flagged as a duplicate of dotnet/efcore#30181, which, as its milestone indicates, was fixed for 8.0.0-preview.3 (out in around a month).

@petergummer
Copy link
Author

Thanks @roji, good to know. I was just making sure that it didn't get lost again.

@petergummer
Copy link
Author

petergummer commented May 8, 2023

Good news, @roji, with prerelease 8.0.0-preview.3 my test program above outputs this:

SELECT b."Id", b."Label", t2."BunchId", t2."Name"
FROM "Bunch" AS b
LEFT JOIN "Thing" AS t2 ON b."Id" = t2."BunchId"
WHERE b."Label" = 'coconuts' AND (EXISTS (
    SELECT 1
    FROM "Thing" AS t
    WHERE b."Id" = t."BunchId" AND t."Name" = '1st') OR EXISTS (
    SELECT 1
    FROM "Thing" AS t0
    WHERE b."Id" = t0."BunchId" AND t0."Name" = '2nd') OR EXISTS (
    SELECT 1
    FROM "Thing" AS t1
    WHERE b."Id" = t1."BunchId" AND t1."Name" = '3rd'))
ORDER BY b."Id", t2."BunchId"

The parentheses are positioned correctly. Thanks!

@roji
Copy link
Member

roji commented May 8, 2023

Thanks for the confirmation @petergummer, good to hear!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
duplicate This issue or pull request already exists
Projects
None yet
Development

No branches or pull requests

2 participants