OR clause allowing NULL is generated when joining on NullableProperty.Value #27071

sam-wheat · 2021-12-28T19:42:44Z

Linq below joins h.ModuleId (long?) to cmc.ModuleId (string). Both IDs can be null.

var query = from h in db.BatteryHistories
join cmc in db.CurrentMConfigurations on new { ModuleId = h.ModuleId.ToString() } equals new { ModuleId = cmc.ModuleId }
join m in db.Cu04tb09Ms on  new { M_ID = cmc.M_Number.ToString(), SvcType = cmc.ServiceType  } equals new { M_ID = m.IdMtr, SvcType = m.CdSvcTyp }

Generated SQL:

SELECT g.*
FROM [BatteryHistory] AS [g]
INNER JOIN [CurrentMConfiguration] AS [c] ON (CONVERT(VARCHAR(20), [g].[ModuleId]) = [c].[ModuleId]) OR ([g].[ModuleId] IS NULL AND [c].[ModuleId] IS NULL)
INNER JOIN [CU04TB09_M] AS [c0] ON (CONVERT(VARCHAR(100), [c].[M_Number]) = [c0].[ID_MTR]) AND ([c].[ServiceType] = [c0].[CD_SVC_TYP])

Linq below joins h.ModuleId.Value (long) to cmc.ModuleId (string). Linq does not allow nulls since h.ModuleId.Value cannot be null. However generated SQL will still select nulls: "OR ([g].[ModuleId] IS NULL AND [c].[ModuleId] IS NULL)"

var query = from h in db.BatteryHistories
join cmc in db.CurrentMConfigurations on new { ModuleId = h.ModuleId.Value.ToString() } equals new { ModuleId = cmc.ModuleId }
join m in db.Cu04tb09Ms on  new { M_ID = cmc.M_Number.ToString(), SvcType = cmc.ServiceType  } equals new { M_ID = m.IdMtr, SvcType = m.CdSvcTyp }

Generated SQL:

SELECT g.*
FROM [BatteryHistory] AS [g]
INNER JOIN [CurrentMConfiguration] AS [c] ON (CONVERT(VARCHAR(20), [g].[ModuleId]) = [c].[ModuleId]) OR ([g].[ModuleId] IS NULL AND [c].[ModuleId] IS NULL)
INNER JOIN [CU04TB09_M] AS [c0] ON (CONVERT(VARCHAR(100), [c].[M_Number]) = [c0].[ID_MTR]) AND ([c].[ServiceType] = [c0].[CD_SVC_TYP])

Noteably, no null handling is generated for the following Linq. Because cmc.M_Number is not nullable? This seems to be inconsistent with the above. If consistent, EF would handle NULL condition where nullable fields are compared:

join m in db.Cu04tb09Ms on  new { M_ID = cmc.M_Number.ToString(), SvcType = cmc.ServiceType  } equals new { M_ID = m.IdMtr, SvcType = m.CdSvcTyp }

Generated SQL:

INNER JOIN [CU04TB09_M] AS [c0] ON (CONVERT(VARCHAR(100), [c].[M_Number]) = [c0].[ID_MTR]) AND ([c].[ServiceType] = [c0].[CD_SVC_TYP])

Classes (all strings are nullable):

public partial class BatteryHistory
{
	public long? ModuleId { get; set; }
}


public partial class CurrentMConfiguration
{
	public decimal M_Number { get; set; }
	public string ModuleId { get; set; }
	public string ServiceType { get; set; }
}

public partial class Cu04tb09M
{
    public string IdMtr { get; set; }
	public string IdCommMod { get; set; }
	public string CdSvcTyp { get; set; }
}

Versions:
dotnet 5.0
EF 5.0.13

The text was updated successfully, but these errors were encountered:

maumar · 2021-12-30T05:19:50Z

related issue: #18808

This is by design. EF follows the join semantics of linq to objects here - if join keys are anonymous objects, the nulls actually get matched in the join results. Linq to objects code:

        var entity1s = new List<Entity1>
        {
            new Entity1 { Id = 1, Name = "1" },
            new Entity1 { Id = 2, Name = "2" },
            new Entity1 { Id = 3, Name = null }
        };

        var entity2s = new List<Entity2>
        {
            new Entity2 { Id = 1, Name = "1" },
            new Entity2 { Id = 2, Name = null },
            new Entity2 { Id = 3, Name = "3" }
        };

        var l2o_null_match = from e1 in entity1s
                             join e2 in entity2s on new { x = e1.Name } equals new { x = e2.Name }
                             select new { e1, e2 };

        // matched: 1 and null
        var result_null_match = l2o_null_match.ToList();

        var l2o_null_not_match = from e1 in entity1s
                                 join e2 in entity2s on e1.Name equals e2.Name
                                 select new { e1, e2 };

        // matched: 1
        var result_null_not_match = l2o_null_not_match.ToList();

maumar · 2021-12-30T05:42:36Z

wrt second example (new { ModuleId = h.ModuleId.Value.ToString() } equals new { ModuleId = cmc.ModuleId }) - EF allows accessing nullable objects - we don't throw null refs, outside the projection. So h.ModuleId.Value.ToString still returns null for us if h.ModuleId was null.

wrt third example: EF has two "modes" of expanding null semantics - optimized and full. Optimized doesn't distinguish between null and false, whereas full expansion completely gets rid of nulls from the result (we convert everything to 2-value logic). Full expansion is used in the projection or when the comparison is negated. Simplified expansion can be used in regular predicates like:

entities.Where(e => e.SomeNullValue == e.SomeOtherNullValue)
entities.Where(e => e.SomeNullValue == e.SomeNonNullValue)

That second case is what you have in your example. Since one side can be null but the other never can, we can skip the null checks completely. If the left side value is null, the result of comparison is null, which is good enough for us. We don't need false here, since null will also filter out the results. For join predicate comparison we are using the simplified expansion that's why you don't see the null check.

if the comparison was negated:

entities.Where(e => e.SomeNullValue != e.SomeNonNullValue)

we need to do full expansion and it would look something like this: WHERE e.SomeNullValue <> e.SomeNonNullValue OR e.SomeNullValue IS NULL. If both sides of the comparison are nullable the expansion is more complicated.

sam-wheat · 2021-12-31T01:44:27Z

Hi Maurycy, thank you for looking at this issue. Based on your comments please allow me to add addtional details.

Let us agree on the correct and desired behavior, which you stated in your reply:

Since one side can be null but the other never can, we can skip the null checks completely.

The small queries I have included below do not indicate that the desired behavior is being applied consistently. Please explain how this behavior is being applied in the example queries I have provided.

Your reference to #18808 is not clear to me as we are not comparing object references here.

the nulls actually get matched in the join results

The issue I am raising in this ticket is that I am joining on NullableColumn.Value. By definition the value being compared cannot be null, therefore a null check is not desirable.

So h.ModuleId.Value.ToString still returns null for us if h.ModuleId was null.

we don't throw null refs, outside the projection.

By using .Value I am explictly telling EF to treat the field as if it cannot be null. The expectation is not that an exception will be thrown but that EF will treat the value as non-nullable for purpose of comparison. WHERE NullableColumn IS NOT NULL is expected to appear in the WHERE clause - not as an OR clause in the join. Referring to .Value should be no different than referring to a column that has [Required] attribute set.

[Required] is set on the Address column. No null check is generated as expected:

// Both sides not nullable
var query30 = from e1 in db.Entity1s join e2 in db.Entity2s on new { Count = e1.Address } equals new { Count = e2.Address } select new { e1, e2 };
var sql30 = query30.ToQueryString(); // INNER JOIN [SAP].[Entity2s] AS [e0] ON [e].[Address] = [e0].[Address]

[Required] is not set on the Name column. A null check is generated as expected:

// Both sides nullable
var query31 = from e1 in db.Entity1s join e2 in db.Entity2s on new { Count = e1.Name } equals new { Count = e2.Name } select new { e1, e2 };
var sql31 = query31.ToQueryString(); // INNER JOIN [SAP].[Entity2s] AS [e0] ON ([e].[Name] = [e0].[Name]) OR ([e].[Name] IS NULL AND [e0].[Name] IS NULL)

Both sides are nullable in the query below but no null check is generated:

// Both sides nullable
var query3 = from e1 in db.Entity1s join e2 in db.Entity2s on e1.Count equals e2.Count select new { e1, e2 };
var sql3 = query3.ToQueryString(); // INNER JOIN [Entity2s] AS [e0] ON [e].[Count] = [e0].[Count]

This query comparing Required strings does not generate a null check:

// Both sides not nullable 
var query24 = from e1 in db.Entity1s join e2 in db.Entity2s on new { ID = e1.Address } equals new { ID = e2.Address } select new { e1, e2 };
var sql24 = query24.ToQueryString(); // INNER JOIN [SAP].[Entity2s] AS [e0] ON [e].[Address] = [e0].[Address]

However this query that is expected to compare values will allow nulls:

// Both sides not nullable 
var query33 = from e1 in db.Entity1s join e2 in db.Entity2s on new { ID = e1.Count.Value } equals new { ID = e2.Count.Value } select new { e1, e2 };
var sql33 = query32.ToQueryString(); // INNER JOIN [SAP].[Entity2s] AS [e0] ON ([e].[Count] = [e0].[Count]) OR ([e].[Count] IS NULL AND [e0].[Count] IS NULL)

This is the same join as the query above however in this query no null check is generated:

// Both sides not nullable
var query34 = from e1 in db.Entity1s join e2 in db.Entity2s on e1.Count.Value  equals e2.Count.Value select new { e1, e2 };
var sql34 = query34.ToQueryString(); // INNER JOIN [SAP].[Entity2s] AS [e0] ON [e].[Count] = [e0].[Count]

Tests: I find no consistent pattern of how null checks are generated.

class Program
{
    static void Main(string[] args)
    {
        Db db = new Db(new DbContextOptionsBuilder<Db>().UseSqlServer("Server=.;Database=EFNullComparison;Trusted_Connection=True;").Options);
        db.Database.EnsureDeleted();
        db.Database.EnsureCreated();
        db.Entity1s.Add(new Entity1 { Name = "1", Address = "1", Count = 1 });
        db.Entity1s.Add(new Entity1 { Name = "2", Address = "2", Count = 2 });
        db.Entity1s.Add(new Entity1 { Name = null, Address = "", Count = null });

        db.Entity2s.Add(new Entity2 { Name = "1", Address = "1", Count = 1 });
        db.Entity2s.Add(new Entity2 { Name = null, Address = "", Count = null });
        db.Entity2s.Add(new Entity2 { Name = "3", Address = "3", Count = 3 });

        db.SaveChanges();

        // Both sides nullable
        var query1 = from e1 in db.Entity1s join e2 in db.Entity2s on e1.Name equals e2.Name select new { e1, e2 };
        var sql1 = query1.ToQueryString(); // INNER JOIN [Entity2s] AS [e0] ON [e].[Name] = [e0].[Name]

        // Both sides nullable
        var query21 = from e1 in db.Entity1s join e2 in db.Entity2s on new { Name = e1.Name } equals new { Name = e2.Name} select new { e1, e2 };
        var sql21 = query21.ToQueryString(); // INNER JOIN [SAP].[Entity2s] AS [e0] ON ([e].[Name] = [e0].[Name]) OR ([e].[Name] IS NULL AND [e0].[Name] IS NULL)

        // One side nullable
        var query22 = from e1 in db.Entity1s join e2 in db.Entity2s on new { Name = e1.Name } equals new { Name = e2.Address } select new { e1, e2 };
        var sql22 = query22.ToQueryString(); // INNER JOIN [SAP].[Entity2s] AS [e0] ON [e].[Name] = [e0].[Address]

        // One side nullable
        var query23 = from e1 in db.Entity1s join e2 in db.Entity2s on e1.Name equals e2.Address select new { e1, e2 };
        var sql23 = query23.ToQueryString(); // INNER JOIN [SAP].[Entity2s] AS [e0] ON [e].[Name] = [e0].[Address]

        // Both sides not nullable
        var query2 = from e1 in db.Entity1s join e2 in db.Entity2s on e1.Address equals e2.Address select new { e1, e2 };
        var sql2 = query2.ToQueryString(); // INNER JOIN [Entity2s] AS[e0] ON[e].[Address] = [e0].[Address]

        // Both sides not nullable (compare to query21)
        var query24 = from e1 in db.Entity1s join e2 in db.Entity2s on new { ID = e1.Address } equals new { ID = e2.Address } select new { e1, e2 };
        var sql24 = query24.ToQueryString(); // INNER JOIN [SAP].[Entity2s] AS [e0] ON [e].[Address] = [e0].[Address]

        // Both sides nullable
        var query3 = from e1 in db.Entity1s join e2 in db.Entity2s on e1.Count equals e2.Count select new { e1, e2 };
        var sql3 = query3.ToQueryString(); // INNER JOIN [Entity2s] AS [e0] ON [e].[Count] = [e0].[Count]

        // Both sides nullable
        var query32 = from e1 in db.Entity1s join e2 in db.Entity2s on new { ID = e1.Count }  equals new { ID = e2.Count } select new { e1, e2 };
        var sql32 = query32.ToQueryString(); // INNER JOIN [SAP].[Entity2s] AS [e0] ON ([e].[Count] = [e0].[Count]) OR ([e].[Count] IS NULL AND [e0].[Count] IS NULL)

        // Both sides not nullable (compare to query24)
        var query33 = from e1 in db.Entity1s join e2 in db.Entity2s on new { ID = e1.Count.Value } equals new { ID = e2.Count.Value } select new { e1, e2 };
        var sql33 = query32.ToQueryString(); // INNER JOIN [SAP].[Entity2s] AS [e0] ON ([e].[Count] = [e0].[Count]) OR ([e].[Count] IS NULL AND [e0].[Count] IS NULL)

        // Both sides not nullable
        var query34 = from e1 in db.Entity1s join e2 in db.Entity2s on e1.Count.Value  equals e2.Count.Value select new { e1, e2 };
        var sql34 = query34.ToQueryString(); // INNER JOIN [SAP].[Entity2s] AS [e0] ON [e].[Count] = [e0].[Count]



        // Both sides not nullable
        var query4 = from e1 in db.Entity1s join e2 in db.Entity2s on e1.Count.Value equals e2.Count.Value select new { e1, e2 };
        var sql4 = query4.ToQueryString(); // INNER JOIN [Entity2s] AS [e0] ON [e].[Count] = [e0].[Count]

        // One side nullable
        var query5 = from e1 in db.Entity1s join e2 in db.Entity2s on e1.ID equals e2.Count select new { e1, e2 };
        var sql5 = query5.ToQueryString(); // INNER JOIN[Entity2s] AS[e0] ON[e].[ID] = [e0].[Count]

        // Both sides not nullable
        var query6 = from e1 in db.Entity1s join e2 in db.Entity2s on e1.ID.ToString() equals e2.Address select new { e1, e2 };
        var sql6 = query6.ToQueryString(); // INNER JOIN [Entity2s] AS [e0] ON CONVERT(VARCHAR(11), [e].[ID]) = [e0].[Address]

        // Both sides not nullable
        var query7 = from e1 in db.Entity1s join e2 in db.Entity2s on new { ID = e1.ID } equals new { ID = e2.ID } select new { e1, e2 };
        var sql7 = query7.ToQueryString(); // INNER JOIN[Entity2s] AS[e0] ON[e].[ID] = [e0].[ID]

        // Both sides not nullable
        var query8 = from e1 in db.Entity1s join e2 in db.Entity2s on new { ID = e1.ID.ToString() } equals new { ID = e2.Address } select new { e1, e2 };
        var sql8 = query8.ToQueryString(); // INNER JOIN [Entity2s] AS [e0] ON CONVERT(VARCHAR(11), [e].[ID]) = [e0].[Address]

        // One side nullable
        var query9 = from e1 in db.Entity1s join e2 in db.Entity2s on new { ID = e1.Count.ToString() } equals new { ID = e2.Address } select new { e1, e2 };
        var sql9 = query9.ToQueryString(); // INNER JOIN [SAP].[Entity2s] AS [e0] ON CONVERT(VARCHAR(11), [e].[Count]) = [e0].[Address]

        // Both sides nullable
        var query25 = from e1 in db.Entity1s join e2 in db.Entity2s on new { Count = e1.Count.ToString() } equals new { Count = e2.Name } select new { e1, e2 };
        var sql25 = query25.ToQueryString(); // INNER JOIN [SAP].[Entity2s] AS [e0] ON (CONVERT(VARCHAR(11), [e].[Count]) = [e0].[Name]) OR ([e].[Count] IS NULL AND [e0].[Name] IS NULL)

        // One side nullable
        var query26 = from e1 in db.Entity1s join e2 in db.Entity2s on e1.Count.ToString() equals e2.Address select new { e1, e2 };
        var sql26 = query26.ToQueryString(); // INNER JOIN [SAP].[Entity2s] AS [e0] ON CONVERT(VARCHAR(11), [e].[Count]) = [e0].[Address]

        // One side nullable
        var query27 = from e1 in db.Entity1s join e2 in db.Entity2s on e1.Count equals Convert.ToInt32(e2.Address) select new { e1, e2 };
        var sql27 = query27.ToQueryString(); // INNER JOIN [SAP].[Entity2s] AS [e0] ON [e].[Count] = CONVERT(int, [e0].[Address])

        // Both sides not nullable
        var query28 = from e1 in db.Entity1s join e2 in db.Entity2s on new { Address = e1.Count.Value } equals new { Address = Convert.ToInt32(e2.Address) } select new { e1, e2 };
        var sql28 = query28.ToQueryString(); // INNER JOIN [SAP].[Entity2s] AS [e0] ON [e].[Count] = CONVERT(int, [e0].[Address])

        // One side nullable
        var query29 = from e1 in db.Entity1s join e2 in db.Entity2s on new { Count = e1.Count.ToString() } equals new { Count = e2.Address } select new { e1, e2 };
        var sql29 = query25.ToQueryString(); // INNER JOIN [SAP].[Entity2s] AS [e0] ON (CONVERT(VARCHAR(11), [e].[Count]) = [e0].[Name]) OR ([e].[Count] IS NULL AND [e0].[Name] IS NULL)

        // Both sides not nullable
        var query30 = from e1 in db.Entity1s join e2 in db.Entity2s on new { Count = e1.Address } equals new { Count = e2.Address } select new { e1, e2 };
        var sql30 = query30.ToQueryString(); // INNER JOIN [SAP].[Entity2s] AS [e0] ON [e].[Address] = [e0].[Address]

        // Both sides nullable
        var query31 = from e1 in db.Entity1s join e2 in db.Entity2s on new { Count = e1.Name } equals new { Count = e2.Name } select new { e1, e2 };
        var sql31 = query31.ToQueryString(); // INNER JOIN [SAP].[Entity2s] AS [e0] ON ([e].[Name] = [e0].[Name]) OR ([e].[Name] IS NULL AND [e0].[Name] IS NULL)

        // One side nullable
        var query10 = from e1 in db.Entity1s join e2 in db.Entity2s on new { ID = e1.LongCount.ToString() } equals new { ID = e2.Address } select new { e1, e2 };
        var sql10 = query10.ToQueryString(); // INNER JOIN [Entity2s] AS [e0] ON CONVERT(VARCHAR(20), [e].[LongCount]) = [e0].[Address]

        // Both sides nullable
        var query11 = from e1 in db.Entity1s join e2 in db.Entity2s on new { LongCount = e1.LongCount.ToString() } equals new { LongCount = e2.LongCount.ToString() } select new { e1, e2 };
        var sql11 = query11.ToQueryString(); // INNER JOIN [SAP].[Entity2s] AS [e0] ON (CONVERT(VARCHAR(20), [e].[LongCount]) = CONVERT(VARCHAR(20), [e0].[LongCount])) OR ([e].[LongCount] IS NULL AND [e0].[LongCount] IS NULL)

        // Both sides nullable
        var query14 = from e1 in db.Entity1s join e3 in db.Entity3s on new { LongCount = e1.LongCount.ToString() } equals new { LongCount = e3.LongCount } select new { e1, e3 };
        var sql14 = query14.ToQueryString(); // INNER JOIN [Entity3s] AS [e0] ON (CONVERT(VARCHAR(20), [e].[LongCount]) = [e0].[LongCount]) OR ([e].[LongCount] IS NULL AND [e0].[LongCount] IS NULL)

        // Both sides nullable
        var query15 = from e1 in db.Entity1s join e3 in db.Entity3s on new { LongCount = e1.LongCount.ToString() } equals new { LongCount = e3.Name } select new { e1, e3 };
        var sql15 = query15.ToQueryString(); // INNER JOIN [Entity3s] AS [e0] ON (CONVERT(VARCHAR(20), [e].[LongCount]) = [e0].[Name]) OR ([e].[LongCount] IS NULL AND [e0].[Name] IS NULL)

        // Both sides nullable
        var query16 = from e1 in db.Entity1s join e3 in db.Entity3s on new { X = e1.LongCount.ToString() } equals new { X = e3.Name } select new { e1, e3 };
        var sql16 = query16.ToQueryString(); // INNER JOIN [Entity3s] AS [e0] ON (CONVERT(VARCHAR(20), [e].[LongCount]) = [e0].[Name]) OR ([e].[LongCount] IS NULL AND [e0].[Name] IS NULL)
        
        // Both sides nullable
        var query17 = from e1 in db.Entity1s join e3 in db.Entity3s on new { X = e1.LongCount.ToString() } equals new { X = e3.Name } select new { e1, e3 };
        var sql17 = query17.ToQueryString(); // INNER JOIN [Entity3s] AS [e0] ON (CONVERT(VARCHAR(20), [e].[LongCount]) = [e0].[Name]) OR ([e].[LongCount] IS NULL AND [e0].[Name] IS NULL)
        
        // Both sides nullable
        var query18 = from e1 in db.Entity1s join e2 in db.Entity2s on new { X = e1.LongCount.ToString() } equals new { X = e2.Name } select new { e1, e2 };
        var sql18 = query18.ToQueryString(); // INNER JOIN [SAP].[Entity2s] AS [e0] ON (CONVERT(VARCHAR(20), [e].[LongCount]) = [e0].[Name]) OR ([e].[LongCount] IS NULL AND [e0].[Name] IS NULL)

        // One side nullable
        var query19 = from e1 in db.Entity1s join e2 in db.Entity2s on new { ID = e1.LongCount.ToString() } equals new { ID = e2.Address } select new { e1, e2 };
        var sql19 = query19.ToQueryString(); // INNER JOIN [SAP].[Entity2s] AS [e0] ON CONVERT(VARCHAR(20), [e].[LongCount]) = [e0].[Address]
        
        // Both sides nullable
        var query20 = from e1 in db.Entity1s join e2 in db.Entity2s on new { ID = e1.LongCount.ToString() } equals new { ID = e2.Name } select new { e1, e2 };
        var sql20 = query20.ToQueryString(); // INNER JOIN [SAP].[Entity2s] AS [e0] ON (CONVERT(VARCHAR(20), [e].[LongCount]) = [e0].[Name]) OR ([e].[LongCount] IS NULL AND [e0].[Name] IS NULL)
    }


}

public class Entity1 
{
    public int ID { get; set; }
    public int? Count { get; set; }
    public long? LongCount { get; set; }
    public string Name { get; set; }
    [Required]
    public string Address { get; set; }
}

public class Entity2
{
    public int ID { get; set; }
    public int? Count { get; set; }
    public long? LongCount { get; set; }
    public string Name { get; set; }
    [Required]
    public string Address { get; set; }
}

public class Entity3
{
    public int? Count { get; set; }
    public string LongCount { get; set; }
    public string Name { get; set; }
    [Required]
    public string Address { get; set; }
}

public partial class CurrentConfiguration
{
    public string ServiceType { get; set; }
    public string SerialNumber { get; set; }
    public string ModuleId { get; set; }
}

public partial class BatteryHistory
{
    public long? ModuleId { get; set; }
}

public partial class CurrentConfiguration2
{
    public int ID { get; set; }
    public string ModuleId2 { get; set; }
}

public partial class BatteryHistory2
{
    public int ID { get; set; }
    public long? ModuleId2 { get; set; }
}

public class Db : DbContext
{
    public virtual DbSet<Entity1> Entity1s { get; set; }
    public virtual DbSet<Entity2> Entity2s { get; set; }
    public virtual DbSet<Entity3> Entity3s { get; set; }
    public virtual DbSet<CurrentConfiguration> CurrentConfigurations { get; set; }
    public virtual DbSet<BatteryHistory> BatteryHistories { get; set; }
    public virtual DbSet<CurrentConfiguration2> CurrentConfigurations2 { get; set; }
    public virtual DbSet<BatteryHistory2> BatteryHistories2 { get; set; }


    public Db(DbContextOptions<Db> options)
     : base(options)
    {
    }

    protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
    {
        if (!optionsBuilder.IsConfigured)
        {
            optionsBuilder.UseSqlServer("Name=BatteryAnalysis");
        }
    }

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        modelBuilder.HasAnnotation("Relational:Collation", "SQL_Latin1_General_CP1_CI_AS");

        modelBuilder.Entity<Entity2>(entity =>
        {
            entity.HasKey(e => new { e.ID, e.Address });

            entity.ToTable("Entity2s", "SAP");

            entity.Property(e => e.Name)
                .HasMaxLength(1)
                .IsUnicode(false)
                .IsFixedLength(true);

    
            entity.Property(e => e.Address).HasMaxLength(18);

        });

        modelBuilder.Entity<Entity3>(entity =>
        {
            entity.HasNoKey();
        });


        modelBuilder.Entity<CurrentConfiguration>(entity =>
        {
            entity.HasKey(e => new { e.ServiceType, e.SerialNumber });

            entity.ToTable("CurrentConfiguration", "SAP");

            entity.Property(e => e.ServiceType)
                .HasMaxLength(1)
                .IsUnicode(false)
                .IsFixedLength(true);

            entity.Property(e => e.SerialNumber).HasMaxLength(18);

            entity.Property(e => e.ModuleId).HasMaxLength(18);

        });

        modelBuilder.Entity<BatteryHistory>(entity =>
        {
            entity.HasNoKey();
            entity.ToTable("BatteryHistory", "OWCE");
        });



       
    }
}

sam-wheat · 2021-12-31T01:53:15Z

Related: #27072

It is very suboptimal, since due to the unnecessary OR ((t."Id" IS NULL) AND (t1."ParentId" IS NULL)) clause in the inner query,

WHERE (t."Id" = t1."ParentId") OR ((t."Id" IS NULL) AND (t1."ParentId" IS NULL))

roji · 2022-01-04T19:42:58Z

By using .Value I am explictly telling EF to treat the field as if it cannot be null. The expectation is not that an exception will be thrown but that EF will treat the value as non-nullable for purpose of comparison.

This is not how EF Core is currently designed. One important thing to note in this context, is that there's no analogous gesture for reference types - there's no way to explicitly tell EF Core that the field cannot be null. The new bang (!) operator does that logically, but is pure compiler syntax which does not exist in the expression tree which EF Core receives.

Note also that you can disable null compensation entirely by using relational null semantics; when using that, EF Core doesn't add any additional null checks, and your C# code gets translated to SQL as-is (nullability-wise). However, this is not a mode we generally recommend.

sam-wheat · 2022-01-05T19:00:33Z

1.) My choice of words "the field cannot be null" is arguably poor. Most generally I want to know if there is there a way to tell EF to not generate ... OR t1.f1 IS NULL AND t2.f1 IS NULL. I think you understand my intent but I want to be clear. In my case that OR cluase makes my query time out (runs in 11 seconds without the OR).

2.) If the answer to 1.) is no, is it possible in a future release (edit) to not generate the OR if one side of the join expression results in a value type or if the join contains null coalescing logic? Adding ... WHERE t1.f1 IS NOT NULL for each nullable field in the join would be a nice touch also I think.

3.) Can you please explain the existing logic and intent as demonstrated by the examples I have provided?

roji · 2022-01-06T12:36:47Z

1.) My choice of words "the field cannot be null" is arguably poor. Most generally I want to know if there is there a way to tell EF to not generate ... OR t1.f1 IS NULL AND t2.f1 IS NULL. I think you understand my intent but I want to be clear. In my case that OR cluase makes my query time out (runs in 11 seconds without the OR).

@sam-wheat when it generates a SQL equality (a = b), EF Core adds the nullability check if both side are nullable; This is simply to maintain the regular C# behavior where a = null returns true when a is null (but not in SQL). Now, what matters isn't the C# expression, but rather the SQL expression; so X.Value in SQL is simply the column X, and if that's nullable, the extra null checks are added.

Are you seeing a case where both sides of the SQL expression aren't nullable, but EF Core is still adding the checks? If so, can you provide the clear, runnable repro for that (including the model)?

Otherwise, if you don't want the extra checks even when the sides are nullable, then as I wrote below you can opt into relational null semantics.

2.) If the answer to 1.) is no, is it possible in a future release (edit) to not generate the OR if one side of the join expression results in a value type or if the join contains null coalescing logic? Adding ... WHERE t1.f1 IS NOT NULL for each nullable field in the join would be a nice touch also I think.

We could do an optimization like that, but would that really be significantly better? That would remove the extra SQL check from the JOIN clause, only to add one in a new WHERE clause...

3.) Can you please explain the existing logic and intent as demonstrated by the examples I have provided?

See my explanation above in 1 - by default EF Core recreates C# null semantics via the SQL it generates, so that users' LINQ queries return the same results from the database that they would if they were executed in-memory.

roji · 2022-01-06T13:22:37Z

@sam-wheat one additional comment... Are you looking to remove the extra nullability checks because of performance? Because generally speaking they should not have a significant impact there (though that can always vary across databases, query shapes...). If so, have you measured performance with and without those checks, and confirmed there's an issue?

sam-wheat · 2022-01-07T21:21:35Z

have you measured performance with and without those checks, and confirmed there's an issue?:

Query 2 runs in 2:12 and returns 7.6m rows. I manually commented out the OR clause.
Query 1 is the exact same except for the OR clause. Time to run is unknown as I am cancelling it after 5 mins.

var query = from h in db.Histories
join cmc in db.CMC on new { ModuleId = h.ModuleId.ToString() } equals new { ModuleId = cmc.ModuleId }
where h.ModuleId != null 
orderby h.ModuleId ascending
select new { H_ModuleID = h.ModuleId, C_ModuleID = cmc.ModuleId };

string sql = query.ToQueryString();

SELECT[g].[ModuleId] AS[H_ModuleID], [c].[ModuleId] AS[C_ModuleID]
FROM[OWCE].[History] AS[g]
INNER JOIN[SAP].[CMC] AS[c] ON(CONVERT(VARCHAR(20), [g].[ModuleId]) = [c].[ModuleId]) OR([g].[ModuleId] IS NULL AND[c].[ModuleId] IS NULL)
WHERE[g].[ModuleId] IS NOT NULL
ORDER BY[g].[ModuleId]

We could do an optimization like that, but would that really be significantly better?

I beleive it would be better in terms of control given to the developer. The following are very different queries. AFAIK it is not possible to achieve the latter when using a complex join:

... from t1 in db.t1s join t2 in db.t2s on new { N = t1.Nullable } equals new { N = t2.Nullable }...
... FROM T1 JOIN T2 ON T1.NULLABLE = T2.NULLABLE OR (T1.NULLABLE IS NULL AND T2.NULLABLE IS NULL)


... from t1 in db.t1s join t2 in db.t2s on new { N = t1.Nullable.Value } equals { N = t2.Nullable.Value }...
... FROM T1 JOIN T2 ON T1.NULLABLE = T2.NULLABLE WHERE T1.NULLABLE IS NOT NULL

Runnable...

class Program
{
    private static List<Entity1> MemoryEntity1s;
    private static List<Entity2> MemoryEntity2s;

    static void Main(string[] args)
    {
        MemoryEntity1s = new List<Entity1>
        {
            new Entity1 { Name = "1", Address = "1", Count = 1 },
            new Entity1 { Name = "2", Address = "2", Count = 2 },
            new Entity1 { Name = null, Address = "", Count = null }
        };

        MemoryEntity2s = new List<Entity2>
        {
            new Entity2 { Name = "1", Address = "1", Count = 1 },
            new Entity2 { Name = null, Address = "", Count = null },
            new Entity2 { Name = "3", Address = "3", Count = 3 }
        };

        Db db = new Db(new DbContextOptionsBuilder<Db>().UseSqlServer("Server=.;Database=EFNullComparison;Trusted_Connection=True;").Options);
        db.Database.EnsureDeleted();
        db.Database.EnsureCreated();
        
        MemoryEntity1s.ForEach(e => db.Entity1s.Add(e));
        MemoryEntity2s.ForEach(e => db.Entity2s.Add(e));

        db.SaveChanges();

        var query1 = from e1 in db.Entity1s join e2 in db.Entity2s on new { Name = e1.Name, Count = e1.Count } equals new { Name = e2.Name, Count = e2.Count } 
                     where !string.IsNullOrEmpty(e1.Name) && !string.IsNullOrEmpty(e2.Name) && e1.Count.HasValue && e2.Count.HasValue
                     select new { E1Value = e1.Name, E2Value = e2.Name };

        var sql1 = query1.ToQueryString();
        var sqlResult1 = query1.ToList();


        // SELECT [e].[Name] AS [E1Value], [e0].[Name] AS [E2Value]
        // FROM[Entity1s] AS[e]
        // INNER JOIN[Entity2s] AS[e0] ON(([e].[Name] = [e0].[Name]) OR([e].[Name] IS NULL AND[e0].[Name] IS NULL)) AND(([e].[Count] = [e0].[Count]) OR([e].[Count] IS NULL AND[e0].[Count] IS NULL))
        // WHERE((([e].[Name] IS NOT NULL AND(([e].[Name] <> N'') OR[e].[Name] IS NULL)) AND([e0].[Name] IS NOT NULL AND(([e0].[Name] <> N'') OR[e0].[Name] IS NULL))) AND[e].[Count] IS NOT NULL) AND[e0].[Count] IS NOT NULL


        var query2 = from e1 in db.Entity1s join e2 in db.Entity2s on new { Name = e1.Name } equals new { Name = e2.Name } 
                     where ! string.IsNullOrEmpty(e1.Name) && ! string.IsNullOrEmpty(e2.Name) 
                     select new { E1Value = e1.Name, E2Value = e2.Name };
        
        var sql2 = query2.ToQueryString(); 
        var sqlResult2 = query2.ToList();

        // SELECT[e].[Name] AS[E1Value], [e0].[Name] AS[E2Value]
        // FROM[Entity1s] AS[e]
        // INNER JOIN[Entity2s] AS[e0] ON([e].[Name] = [e0].[Name]) OR([e].[Name] IS NULL AND[e0].[Name] IS NULL)
        // WHERE([e].[Name] IS NOT NULL AND(([e].[Name] <> N'') OR[e].[Name] IS NULL)) AND([e0].[Name] IS NOT NULL AND(([e0].[Name] <> N'') OR[e0].[Name] IS NULL))
    }
}

public class Entity1 
{
    public int ID { get; set; }
    public int? Count { get; set; }
    public string Name { get; set; }
    [Required]
    public string Address { get; set; }
}

public class Entity2
{
    public int ID { get; set; }
    public int? Count { get; set; }
    public string Name { get; set; }
    [Required]
    public string Address { get; set; }
}


public class Db : DbContext
{
    public virtual DbSet<Entity1> Entity1s { get; set; }
    public virtual DbSet<Entity2> Entity2s { get; set; }

    public Db(DbContextOptions<Db> options) : base(options) { }
}

roji · 2022-01-07T21:31:28Z

We could do an optimization like that, but would that really be significantly better?

I beleive it would be better in terms of control given to the developer. The following are very different queries. AFAIK it is not possible to achieve the latter when using a complex join:

The question if these two SQLs are (significantly) different in terms of performance - is this something you've verified?

roji · 2022-01-07T21:33:00Z

One more thing that could be helpful here, is the query plan for the various SQLs being discussed - could you post these? An easy way of getting a query plan is to execute SET SHOWPLAN_ALL ON and then to execute your SQL - SQL Server returns the plan instead of actually executing.

sam-wheat · 2022-01-07T21:34:23Z

The question if these two SQLs are (significantly) different in terms of performance - is this something you've verified?

The difference of the OR clause is verified and the results are shown in my prior post.

I let it run while I was typing it is still going after over 20 mins:

sam-wheat · 2022-01-07T21:51:20Z

Understood. The difference of the queries is below. WHERE[g].[ModuleId] IS NOT NULL is common to both.

INNER JOIN[SAP].[CMC] AS[c] ON(CONVERT(VARCHAR(20), [g].[ModuleId]) = [c].[ModuleId]) OR([g].[ModuleId] IS NULL AND[c].[ModuleId] IS NULL)

INNER JOIN[SAP].[CMC] AS[c] ON(CONVERT(VARCHAR(20), [g].[ModuleId]) = [c].[ModuleId]) -- OR([g].[ModuleId] IS NULL AND[c].[ModuleId] IS NULL)

sam-wheat · 2022-01-07T22:02:05Z

Requested:

roji · 2022-01-11T20:14:26Z

Triage discussion: @maumar to take a look if there's a possible workaround for avoiding the null compensation in the above join condition.

maumar · 2022-01-20T09:35:49Z

best way we currently have to avoid null compensation is to use relational null semantics like @roji pointed out, and manually add null compensation terms for the comparisons we actually want to behave like c#. e.g.:

///ctx is set up with UseRelationalNulls option enabled
ctx.Customers.Where(c => c.Relational1 == c.Relational2 && (c.CSharp1 == c.CSharp2 || (c.CSharp1 == null && c.CSharp2 == null)))

In case of join key comparison, relational null semantics are used as long as keys are not wrapped in anonymous objects (but this doesn't obviously work for composite key comparisons)

wrt consistency in @sam-wheat 's examples above, EF has several "dimensions" which decide if/how comparisons are translated:

should we use relational or c# semantics

1.1 by default we mimic c# semantics so we will add extra terms so that null == null is not filtered out.
there are two exceptions to this:
1.2 using UseRelationalNulls context option, which globally applies relational-style comparison regardless of anything else
1.3 join key comparison - these per linq to object implementation use relational null semantics, null keys won't match, hence the query comprehension syntax uses a keyword equals, rather than just a == operator
however there is an exception to that in EF -
1.4 when the join keys are wrapped in anonymous type - in that case (on linq to objects) join equals comparison matches two anonymous objects which contain null values (like new { x = null }) - so on EF we apply c# null semantics expansion in this case.

should we use simplified or "full" expansion

2.1 simplified expansion is the most common case, happens inside predicates where we don't care between false and null result. In that case when only one side is nullable and the other one is not we can completely skip the extra terms, if the nullable side is null, the result of comparison is null, but that's just as good as false for us
2.2 full expansion happens when we need precise result of the comparison, either true or false. we do it outside predicate (in projection - we don't want to return nulls for queries like customers.Select(c => c.SomeNullableValue == c.SomeOtherValue)), we also do this inside predicate in some cases, e.g. if the comparison is negated.

what are the extra terms we need to add
depends on simplified vs full expansion and if one or both elements are nullable
for simplified we only need to add anything if both sides are nullable:

3.1 ?a == ?b -> a == b || (a == null && b == null)
if only one element is nullable, we don't need to do anything because if the nullable element is null we get null back which is fine, and since the other side can't ever be null, we never expect true result in this case, so:

3.2. ?a == b -> a == b as well as a == ?b -> a == b

3.3 for full expansion the translation is a bit tricky, you can see them all (along with truth tables) starting here, in case you are interested: https://github.com/dotnet/efcore/blob/main/src/EFCore.Relational/Query/SqlNullabilityProcessor.cs#L1804

now looping back to the cases you provided earlier:

        // Both sides nullable
        var query1 = from e1 in db.Entity1s join e2 in db.Entity2s on e1.Name equals e2.Name select new { e1, e2 };
        var sql1 = query1.ToQueryString(); // INNER JOIN [Entity2s] AS [e0] ON [e].[Name] = [e0].[Name]

//maumar: 
//1.3 join key comparison -> using relational semantics -> no need for the extra terms

        // Both sides nullable
        var query21 = from e1 in db.Entity1s join e2 in db.Entity2s on new { Name = e1.Name } equals new { Name = e2.Name} select new { e1, e2 };
        var sql21 = query21.ToQueryString(); // INNER JOIN [SAP].[Entity2s] AS [e0] ON ([e].[Name] = [e0].[Name]) OR ([e].[Name] IS NULL AND [e0].[Name] IS NULL)

//maumar:
//1.4 join but wrapped in anonymous type ->
//2.1 need to perform expansion, but its done in predicate so can do simple one
//3.1 both terms nullable so we do a == b || (a == null && b == null)

        // One side nullable
        var query22 = from e1 in db.Entity1s join e2 in db.Entity2s on new { Name = e1.Name } equals new { Name = e2.Address } select new { e1, e2 };
        var sql22 = query22.ToQueryString(); // INNER JOIN [SAP].[Entity2s] AS [e0] ON [e].[Name] = [e0].[Address]

//maumar:
//1.4 join but wrapped in anonymous type ->
//2.1 need to perform expansion, but its done in predicate so can do simple one
//3.2 only one side is nullable, no need for the extra term

        // One side nullable
        var query23 = from e1 in db.Entity1s join e2 in db.Entity2s on e1.Name equals e2.Address select new { e1, e2 };
        var sql23 = query23.ToQueryString(); // INNER JOIN [SAP].[Entity2s] AS [e0] ON [e].[Name] = [e0].[Address]

//maumar:
//1.3 join key comparison -> using relational semantics -> no need for the extra terms

        // Both sides not nullable
        var query2 = from e1 in db.Entity1s join e2 in db.Entity2s on e1.Address equals e2.Address select new { e1, e2 };
        var sql2 = query2.ToQueryString(); // INNER JOIN [Entity2s] AS[e0] ON[e].[Address] = [e0].[Address]

//maumar:
//1.3 join key comparison -> using relational semantics -> no need for the extra terms

        // Both sides not nullable (compare to query21)
        var query24 = from e1 in db.Entity1s join e2 in db.Entity2s on new { ID = e1.Address } equals new { ID = e2.Address } select new { e1, e2 };
        var sql24 = query24.ToQueryString(); // INNER JOIN [SAP].[Entity2s] AS [e0] ON [e].[Address] = [e0].[Address]

//maumar:
//1.4 join but wrapped in anonymous type ->
//2.1 need to perform expansion, but its done in predicate so can do simple one
//both terms are non-nullable so no need to do anything here

        // Both sides nullable
        var query3 = from e1 in db.Entity1s join e2 in db.Entity2s on e1.Count equals e2.Count select new { e1, e2 };
        var sql3 = query3.ToQueryString(); // INNER JOIN [Entity2s] AS [e0] ON [e].[Count] = [e0].[Count]

//maumar:
//same as query1

        // Both sides nullable
        var query32 = from e1 in db.Entity1s join e2 in db.Entity2s on new { ID = e1.Count }  equals new { ID = e2.Count } select new { e1, e2 };
        var sql32 = query32.ToQueryString(); // INNER JOIN [SAP].[Entity2s] AS [e0] ON ([e].[Count] = [e0].[Count]) OR ([e].[Count] IS NULL AND [e0].[Count] IS NULL)

//maumar:
//same as query21

        // Both sides not nullable (compare to query24)
        var query33 = from e1 in db.Entity1s join e2 in db.Entity2s on new { ID = e1.Count.Value } equals new { ID = e2.Count.Value } select new { e1, e2 };
        var sql33 = query32.ToQueryString(); // INNER JOIN [SAP].[Entity2s] AS [e0] ON ([e].[Count] = [e0].[Count]) OR ([e].[Count] IS NULL AND [e0].[Count] IS NULL)

//maumar:
//per earlier comment, EF ignores the .Value and still treats Count as nullable property, so this is the same as query21 and query32

        // Both sides not nullable
        var query34 = from e1 in db.Entity1s join e2 in db.Entity2s on e1.Count.Value  equals e2.Count.Value select new { e1, e2 };
        var sql34 = query34.ToQueryString(); // INNER JOIN [SAP].[Entity2s] AS [e0] ON [e].[Count] = [e0].[Count]

//maumar:
//same as query1 - we ignore .Value, so both sides are nullable, but its inside join key comparion and not wrapped in anonymous object, so we can do relational semantics

maumar · 2022-01-25T21:32:33Z

closing this in favor of doc issue

sam-wheat added the customer-reported label Dec 28, 2021

maumar added the closed-by-design label Dec 30, 2021

roji closed this as completed Jan 11, 2022

roji reopened this Jan 11, 2022

ajcvickers assigned maumar Jan 14, 2022

ajcvickers mentioned this issue Jan 22, 2022

Document null semantics rules dotnet/EntityFramework.Docs#3692

Open

maumar closed this as completed Jan 25, 2022

ajcvickers reopened this Oct 16, 2022

ajcvickers closed this as not planned Won't fix, can't repro, duplicate, stale Oct 16, 2022

maumar mentioned this issue May 22, 2024

Fix comparison of nullable values #33757

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

OR clause allowing NULL is generated when joining on NullableProperty.Value #27071

OR clause allowing NULL is generated when joining on NullableProperty.Value #27071

sam-wheat commented Dec 28, 2021

maumar commented Dec 30, 2021

maumar commented Dec 30, 2021 •

edited

Loading

sam-wheat commented Dec 31, 2021

sam-wheat commented Dec 31, 2021

roji commented Jan 4, 2022 •

edited

Loading

sam-wheat commented Jan 5, 2022 •

edited

Loading

roji commented Jan 6, 2022 •

edited

Loading

roji commented Jan 6, 2022

sam-wheat commented Jan 7, 2022

roji commented Jan 7, 2022

roji commented Jan 7, 2022

sam-wheat commented Jan 7, 2022 •

edited

Loading

sam-wheat commented Jan 7, 2022

sam-wheat commented Jan 7, 2022

roji commented Jan 11, 2022

maumar commented Jan 20, 2022 •

edited

Loading

maumar commented Jan 25, 2022

OR clause allowing NULL is generated when joining on NullableProperty.Value #27071

OR clause allowing NULL is generated when joining on NullableProperty.Value #27071

Comments

sam-wheat commented Dec 28, 2021

maumar commented Dec 30, 2021

maumar commented Dec 30, 2021 • edited Loading

sam-wheat commented Dec 31, 2021

sam-wheat commented Dec 31, 2021

roji commented Jan 4, 2022 • edited Loading

sam-wheat commented Jan 5, 2022 • edited Loading

roji commented Jan 6, 2022 • edited Loading

roji commented Jan 6, 2022

sam-wheat commented Jan 7, 2022

roji commented Jan 7, 2022

roji commented Jan 7, 2022

sam-wheat commented Jan 7, 2022 • edited Loading

sam-wheat commented Jan 7, 2022

sam-wheat commented Jan 7, 2022

roji commented Jan 11, 2022

maumar commented Jan 20, 2022 • edited Loading

maumar commented Jan 25, 2022

maumar commented Dec 30, 2021 •

edited

Loading

roji commented Jan 4, 2022 •

edited

Loading

sam-wheat commented Jan 5, 2022 •

edited

Loading

roji commented Jan 6, 2022 •

edited

Loading

sam-wheat commented Jan 7, 2022 •

edited

Loading

maumar commented Jan 20, 2022 •

edited

Loading