Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GroupBy() isn't working as expected unless I first load the individual items into memory #32014

Closed
Tracked by #30173
SoftCircuits opened this issue Oct 10, 2023 · 7 comments

Comments

@SoftCircuits
Copy link

SoftCircuits commented Oct 10, 2023

I have the following query. I'm expecting a number of results but I get only one.

var results = await DbContext.Trucks
    .Where(t => t.Departure.HasValue)
    .Select(t => new
    {
        t.Departure!.Value.Year,
        t.Departure!.Value.Month,
        Quantity = t.InboundQuantity + t.ToTransfers.Sum(x => x.Quantity) - t.FromTransfers.Sum(x => x.Quantity)
    })
    .GroupBy(t => new { t.Year, t.Month })
    .ToDictionaryAsync(g => new DateTime(g.Key.Year, g.Key.Month, 1), g => g.Sum(x => x.Quantity));

Results:

results Count = 1
+ [0]   {[{7/1/2021 12:00:00 AM}, 3842405714]}

If I rewrite the query to load all the rows into memory and then group, I then get the expected number of results.

var test = DbContext.Trucks
    .Where(t => t.Departure.HasValue)
    .Select(t => new
    {
        t.Departure!.Value.Year,
        t.Departure!.Value.Month,
        Quantity = t.InboundQuantity + t.ToTransfers.Sum(x => x.Quantity) - t.FromTransfers.Sum(x => x.Quantity)
    })
    .ToList();

var test2 = test
    .GroupBy(t => new { t.Year, t.Month })
    .ToDictionary(g => new DateTime(g.Key.Year, g.Key.Month, 1), g => g.Sum(x => x.Quantity));

Results:

test2   Count = 27
+ [0]   {[{1/1/2022 12:00:00 AM}, 95309170]}
+ [1]   {[{2/1/2022 12:00:00 AM}, 79737674]}
+ [2]   {[{3/1/2022 12:00:00 AM}, 173875538]}
+ [3]   {[{7/1/2021 12:00:00 AM}, 125219196]}
+ [4]   {[{8/1/2021 12:00:00 AM}, 110993954]}
+ [5]   {[{10/1/2021 12:00:00 AM}, 129282632]}
+ [6]   {[{11/1/2021 12:00:00 AM}, 160927222]}
+ [7]   {[{12/1/2021 12:00:00 AM}, 117813444]}
+ [8]   {[{9/1/2021 12:00:00 AM}, 26260420]}
+ [9]   {[{4/1/2022 12:00:00 AM}, 161695966]}
+ [10]  {[{5/1/2022 12:00:00 AM}, 184781001]}
+ [11]  {[{6/1/2022 12:00:00 AM}, 146985217]}
+ [12]  {[{7/1/2022 12:00:00 AM}, 84260160]}
+ [13]  {[{8/1/2022 12:00:00 AM}, 89769875]}
+ [14]  {[{10/1/2022 12:00:00 AM}, 187958697]}
+ [15]  {[{11/1/2022 12:00:00 AM}, 175034225]}
+ [16]  {[{9/1/2022 12:00:00 AM}, 108421771]}
+ [17]  {[{1/1/2023 12:00:00 AM}, 179720773]}
+ [18]  {[{12/1/2022 12:00:00 AM}, 110000415]}
+ [19]  {[{2/1/2023 12:00:00 AM}, 210461751]}
+ [20]  {[{3/1/2023 12:00:00 AM}, 232239876]}
+ [21]  {[{4/1/2023 12:00:00 AM}, 176482465]}
+ [22]  {[{5/1/2023 12:00:00 AM}, 218732769]}
+ [23]  {[{6/1/2023 12:00:00 AM}, 186324324]}
+ [24]  {[{7/1/2023 12:00:00 AM}, 151700649]}
+ [25]  {[{8/1/2023 12:00:00 AM}, 142653374]}
+ [26]  {[{9/1/2023 12:00:00 AM}, 75763156]}

If I add a second Select() after the GroupBy(), I get an run-time exception.

Microsoft.Data.SqlClient.SqlException: 'Cannot perform an aggregate function on an expression containing an aggregate or a subquery. Cannot perform an aggregate function on an expression containing an aggregate or a subquery. Cannot perform an aggregate function on an expression containing an aggregate or a subquery. Cannot perform an aggregate function on an expression containing an aggregate or a subquery.'

I understand that some expressions cannot be converted, but it seems odd that it would actually run without any error and then produce different results. In this case, an exception might actually be better.

@roji
Copy link
Member

roji commented Oct 11, 2023

@SoftCircuits can you please submit a minimal repro for this?

@SoftCircuits
Copy link
Author

SoftCircuits commented Oct 13, 2023

@roji My project is large. I think these are the relevant parts.

Query

var results = await DbContext.Trucks
    .Where(t => t.Departure.HasValue)
    .Select(t => new
    {
        t.Departure!.Value.Year,
        t.Departure!.Value.Month,
        Quantity = t.InboundQuantity + t.ToTransfers.Sum(x => x.Quantity) - t.FromTransfers.Sum(x => x.Quantity)
    })
    .GroupBy(t => new { t.Year, t.Month })
    .ToDictionaryAsync(g => new DateTime(g.Key.Year, g.Key.Month, 1), g => g.Sum(x => x.Quantity));

Configuration

builder.Services.AddDbContext<ApplicationDbContext>(options =>
{
    string? connectionString = builder.Configuration.GetConnectionString(connectionName);
    if (connectionString == null)
        throw new Exception("Connection string not configured.");

    options.UseSqlServer(
        connectionString,
        builder => builder.MigrationsAssembly("PegasusEntities"));
});

ApplicationDbContext

public class ApplicationDbContext : IdentityDbContext<ApplicationUser>
{
    //public DbSet<ApiLog> ApiLogs { get; set; }
    //public DbSet<ApiUser> ApiUsers { get; set; }
    //public DbSet<BolDocument> BolDocuments { get; set; }
    //public DbSet<BolDocumentField> BolDocumentFields { get; set; }
    //public DbSet<ChangeLog> ChangeLogs { get; set; }
    //public DbSet<Company> Companies { get; set; }
    //public DbSet<Customer> Customers { get; set; }
    //public DbSet<CustomerFacility> CustomerFacilities { get; set; }
    //public DbSet<CustomerAlias> CustomerAliases { get; set; }
    public DbSet<Facility> Facilities { get; set; }
    //public DbSet<PdfTemplate> PdfTemplates { get; set; }
    //public DbSet<Product> Products { get; set; }
    //public DbSet<ProductAlias> ProductAliases { get; set; }
    //public DbSet<ProductCategory> ProductCategories { get; set; }
    //public DbSet<PurchaseOrder> PurchaseOrders { get; set; }
    public DbSet<Railcar> Railcars { get; set;}
    //public DbSet<ReportRecipient> ReportRecipients { get; set; }
    //public DbSet<Setting> Settings { get; set; }
    //public DbSet<Ship> Ships { get; set; }
    //public DbSet<ShipProduct> ShipProducts { get; set; }
    //public DbSet<Storage> Storage { get; set; }
    //public DbSet<StorageAdjustment> StorageAdjustments { get; set; }
    //public DbSet<Track> Tracks { get; set; }
    public DbSet<Transfer> Transfers { get; set; }
    public DbSet<Truck> Trucks { get; set; }
    //public DbSet<TruckField> TruckFields { get; set; }
    //public DbSet<UserFacility> UserFacilities { get; set; }

    public ApplicationDbContext(DbContextOptions<ApplicationDbContext> options)
        : base(options)
    {
    }

    protected override void OnModelCreating(ModelBuilder builder)
    {
        //builder.ApplyConfiguration(new ApiUserConfiguration());
        //builder.ApplyConfiguration(new ApplicationUserConfiguration());
        //builder.ApplyConfiguration(new BolDocumentConfiguration());
        //builder.ApplyConfiguration(new CompanyConfiguration());
        //builder.ApplyConfiguration(new CustomerConfiguration());
        //builder.ApplyConfiguration(new CustomerAliasConfiguration());
        builder.ApplyConfiguration(new FacilityConfiguration());
        //builder.ApplyConfiguration(new PdfTemplateConfiguration());
        //builder.ApplyConfiguration(new ProductConfiguration());
        //builder.ApplyConfiguration(new ProductAliasConfiguration());
        //builder.ApplyConfiguration(new PurchaseOrderConfiguration());
        builder.ApplyConfiguration(new RailcarConfiguration());
        //builder.ApplyConfiguration(new StorageConfiguration());
        //builder.ApplyConfiguration(new TrackConfiguration());
        builder.ApplyConfiguration(new TransferConfiguration());
        builder.ApplyConfiguration(new TruckConfiguration());
        //builder.ApplyConfiguration(new UserFacilityConfiguration());
        base.OnModelCreating(builder);
    }
}

Entities

/// <summary>
/// Represents a transfer of product between containers.
/// </summary>
public class Transfer
{
    public int Id { get; set; }
    public DateTime TimeStamp { get; set; }
    public double Quantity { get; set; }

    // To references
    // All but one must be null

    public int? ToRailcarId { get; set; }
    public Railcar? ToRailcar { get; set; }

    public int? ToTruckId { get; set; }
    public Truck? ToTruck { get; set; }

    //public int? ToStorageId { get; set; }
    //public Storage? ToStorage { get; set; }

    //public int? ToShipProductId { get; set; }
    //public ShipProduct? ToShipProduct { get; set; }

    // From references
    // All but one must be null

    public int? FromRailcarId { get; set; }
    public Railcar? FromRailcar { get; set; }

    public int? FromTruckId { get; set; }
    public Truck? FromTruck { get; set; }

    //public int? FromStorageId { get; set; }
    //public Storage? FromStorage { get; set; }

    //public int? FromShipProductId { get; set; }
    //public ShipProduct? FromShipProduct { get; set; }
}

public class TransferConfiguration : IEntityTypeConfiguration<Transfer>
{
    public void Configure(EntityTypeBuilder<Transfer> builder)
    {
        // To references
        builder.HasOne(t => t.ToRailcar)
            .WithMany(r => r.ToTransfers)
            .HasForeignKey(t => t.ToRailcarId);
        builder.HasOne(t => t.ToTruck)
            .WithMany(t => t.ToTransfers)
            .HasForeignKey(t => t.ToTruckId);
        //builder.HasOne(t => t.ToStorage)
        //    .WithMany(s => s.ToTransfers)
        //    .HasForeignKey(t => t.ToStorageId);
        //builder.HasOne(t => t.ToShipProduct)
        //    .WithMany(sp => sp.ToTransfers)
        //    .HasForeignKey(t => t.ToShipProductId);

        // From references
        builder.HasOne(t => t.FromRailcar)
            .WithMany(r => r.FromTransfers)
            .HasForeignKey(t => t.FromRailcarId);
        builder.HasOne(t => t.FromTruck)
            .WithMany(t => t.FromTransfers)
            .HasForeignKey(t => t.FromTruckId);
        //builder.HasOne(t => t.FromStorage)
        //    .WithMany(s => s.FromTransfers)
        //    .HasForeignKey(t => t.FromStorageId);
        //builder.HasOne(t => t.FromShipProduct)
        //    .WithMany(sp => sp.FromTransfers)
        //    .HasForeignKey(t => t.FromShipProductId);

        // Only one To and one From is not null
        builder.ToTable(t =>
        {
            t.HasCheckConstraint("ValidateToForeignKeys",
                @"(CASE WHEN [ToRailcarId] IS NULL THEN 0 ELSE 1 END
                + CASE WHEN [ToTruckId] IS NULL THEN 0 ELSE 1 END) = 1");
            t.HasCheckConstraint("ValidateFromForeignKeys",
                @"(CASE WHEN [FromRailcarId] IS NULL THEN 0 ELSE 1 END
                + CASE WHEN [FromTruckId] IS NULL THEN 0 ELSE 1 END) = 1");
            });
            //t.HasCheckConstraint("ValidateToForeignKeys",
            //    @"(CASE WHEN [ToRailcarId] IS NULL THEN 0 ELSE 1 END
            //    + CASE WHEN [ToTruckId] IS NULL THEN 0 ELSE 1 END
            //    + CASE WHEN [ToStorageId] IS NULL THEN 0 ELSE 1 END
            //    + CASE WHEN [ToShipProductId] IS NULL THEN 0 ELSE 1 END) = 1");
            //t.HasCheckConstraint("ValidateFromForeignKeys",
            //    @"(CASE WHEN [FromRailcarId] IS NULL THEN 0 ELSE 1 END
            //    + CASE WHEN [FromTruckId] IS NULL THEN 0 ELSE 1 END
            //    + CASE WHEN [FromStorageId] IS NULL THEN 0 ELSE 1 END
            //    + CASE WHEN [FromShipProductId] IS NULL THEN 0 ELSE 1 END) = 1");
        });
    }
}

public class Truck
{
    public int Id { get; set; }

    public int FacilityId { get; set; }
    public Facility Facility { get; set; }

    public DateTime Arrival { get; set; }

    [DisplayName("Inbound Quantity")]
    public double InboundQuantity { get; set; }

    [Display(Name = "Truck Number")]
    [Required]
    [StringLength(100)]
    public string TruckNumber { get; set; }

    [Display(Name = "Truck Company")]
    [Required]
    [StringLength(100)]
    public string TruckCompany { get; set; }

    [Display(Name = "Trailer Number")]
    [Required]
    [StringLength(100)]
    public string BoxNumber { get; set; }

    //[Display(Name = "Purchase Order")]
    //public int PurchaseOrderId { get; set; }
    //public PurchaseOrder PurchaseOrder { get; set; }

    [DataType("NVARCHAR(MAX)")]
    public string Notes { get; set; }

    public DateTime? Departure { get; set; }

    [Display(Name = "BOL")]
    public int? BillOfLading { get; set; }

    [Display(Name = "Inbound Weight (Pounds)")]
    [Range(1, 10000000, ErrorMessage = "Weight  is not valid")]
    public int? InboundWeight { get; set; }

    [Display(Name = "Outbound Weight (Pounds)")]
    [Range(1, 10000000, ErrorMessage = "Weight  is not valid")]
    public int? OutboundWeight { get; set; }

    public bool ManualEntry { get; set; }

    //public ICollection<TruckField> TruckFields { get; set; }
    public ICollection<Transfer> FromTransfers { get; set; }
    public ICollection<Transfer> ToTransfers { get; set; }
}

public class TruckConfiguration : IEntityTypeConfiguration<Truck>
{
    public void Configure(EntityTypeBuilder<Truck> builder)
    {
        builder.HasOne(t => t.Facility)
            .WithMany(f => f.Trucks)
            .OnDelete(DeleteBehavior.NoAction);

        //builder.HasOne(t => t.PurchaseOrder)
        //    .WithMany(po => po.Trucks)
        //    .OnDelete(DeleteBehavior.NoAction);

        builder.Property(t => t.ManualEntry)
            .HasDefaultValue(false);

        builder.Property(t => t.InboundQuantity)
            .HasDefaultValue(0.0);
    }
}

public class Railcar
{
    public int Id { get; set; }

    [Display(Name = "Railcar Number")]
    [Required]
    [StringLength(18)]
    public string RailcarNumber { get; set; }

    [Display(Name = "Facility")]
    public int FacilityId { get; set; }
    public Facility Facility { get; set; }

    //[Display(Name = "Track")]
    //public int? TrackId { get; set; }
    //public Track Track { get; set; }

    [Display(Name = "Arrival")]
    public DateTime Arrival { get; set; }

    [Display(Name = "Departure")]
    public DateTime? Departure { get; set; }

    public int? TripId { get; set; }

    //[DisplayName("Product")]
    //public int ProductId { get; set; }
    //public Product Product { get; set; }

    //[Display(Name = "Customer")]
    //public int CustomerId { get; set; }
    //public Customer Customer { get; set; }

    [Display(Name = "Consignee")]
    [StringLength(60)]
    public string Consignee { get; set; }

    [Display(Name = "Shipper")]
    [StringLength(60)]
    public string Shipper { get; set; }

    [Display(Name = "Origin")]
    [StringLength(45)]
    public string Origin { get; set; }

    // Quantity of units associated with product
    [Display(Name = "Inbound Quantity")]
    public double InboundQuantity { get; set; }

    [Display(Name = "Is Empty")]
    public bool IsEmpty { get; set; }

    [StringLength(15)]
    [DisplayName("Delivering Railroad")]
    public string Railroad { get; set; }

    /// <summary>
    /// Railcar track sequence. 0 = unsequenced.
    /// </summary>
    public int Sequence { get; set; }

    public ICollection<Transfer> FromTransfers { get; set; }
    public ICollection<Transfer> ToTransfers { get; set; }

    public override string ToString() => RailcarNumber;
}

public class RailcarConfiguration : IEntityTypeConfiguration<Railcar>
{
    public void Configure(EntityTypeBuilder<Railcar> builder)
    {
        builder.Property(r => r.InboundQuantity)
            .HasDefaultValue(0.0);

        builder.Property(r => r.IsEmpty)
            .HasDefaultValue(false);

        builder.Property(r => r.Sequence)
            .HasDefaultValue(0);

        builder.HasOne(t => t.Facility)
            .WithMany(f => f.Railcars)
            .OnDelete(DeleteBehavior.NoAction);

        builder.HasOne(t => t.Track)
            .WithMany(t => t.Railcars)
            .OnDelete(DeleteBehavior.SetNull);

        builder.HasOne(r => r.Customer)
            .WithMany(c => c.Railcars)
            .OnDelete(DeleteBehavior.NoAction);
    }
}

public class Facility
{
    public int Id { get; set; }

    [Required]
    [StringLength(80)]
    public string Name { get; set; }

    [Required]
    [Display(Name = "Time Zone")]
    [StringLength(45)]
    public string TimeZone { get; set; }

    [Required]
    [StringLength(14)]
    public string City { get; set; }

    [Required]
    [StringLength(2)]
    public string State { get; set; }

    [Display(Name = "Next BOL Number")]
    public int NextTruckBol { get; set; }

    //[DisplayName("Company")]
    //public int CompanyId { get; set; }
    //public Company Company { get; set; }

    //public ICollection<ApplicationUser> Users { get; set; }
    //public ICollection<UserFacility> UserFacilities { get; set; }
    //public ICollection<CustomerFacility> CustomerFacilities { get; set; }
    //public ICollection<PurchaseOrder> PurchaseOrders { get; set; }
    public ICollection<Railcar> Railcars { get; set; }
    //public ICollection<Track> Tracks { get; set; }
    public ICollection<Truck> Trucks { get; set; }
    //public ICollection<Ship> Ships { get; set; }
}

public class FacilityConfiguration : IEntityTypeConfiguration<Facility>
{
    public void Configure(EntityTypeBuilder<Facility> builder)
    {
        builder.HasMany(f => f.Users)
            .WithOne(u => u.Facility)
            .OnDelete(DeleteBehavior.NoAction);

        builder.Property(f => f.NextTruckBol)
            .HasDefaultValue(100);
    }
}

@roji
Copy link
Member

roji commented Oct 13, 2023

@SoftCircuits please make the effort of making the above snippets into a runnable, minimal repro. Not doing so effectively asks us to try to create the repro in your place, but we don't actually have the code or failure.

@SoftCircuits
Copy link
Author

@roji I've created a stand-alone project to demonstrate this issue. At the core, I have two simple queries that do not get the same results. Seems like they should.

ExampleProject.zip

@ajcvickers
Copy link
Contributor

Note for triage: The final GroupBy for very simple queries with projection of a simple type behaves differently than when done in LINQ to Objects.

Minimal repro:

using (var context = new SomeDbContext())
{
    await context.Database.EnsureDeletedAsync();
    await context.Database.EnsureCreatedAsync();

    context.AddRange(
        new Foo { Departure = new(2020, 5, 10)},
        new Foo { Departure = new(2020, 5, 11)},
        new Foo { Departure = new(2020, 6, 12)},
        new Foo { Departure = new(2020, 6, 13)},
        new Foo { Departure = new(2020, 7, 14)});
    
    await context.SaveChangesAsync();
}

using (var context = new SomeDbContext())
{
    
    // Query 1
    var results1 = context.Foo
        .Select(t => t.Departure.Month)
        .ToList()
        .GroupBy(t => new { t })
        .ToList();

    Console.WriteLine($"Query 1 : {results1.Count} results");

    foreach (var result in results1)
    {
        Console.WriteLine($"  Grouping key: {result.Key}");
        foreach (var item in result)
        {
            Console.WriteLine($"    {item}");
        }
    }

    // Query 2
    var results2 = context.Foo
        .Select(t => t.Departure.Month)
        .GroupBy(t => new { t })
        .ToList();

    Console.WriteLine($"Query 2 : {results2.Count} results");

    foreach (var result in results2)
    {
        Console.WriteLine($"  Grouping key: {result.Key}");
        foreach (var item in result)
        {
            Console.WriteLine($"    {item}");
        }
    }
}

public class SomeDbContext : DbContext
{
    protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
        => optionsBuilder
            .UseSqlServer(@"Data Source=localhost;Database=One;Integrated Security=True;TrustServerCertificate=true")
            .LogTo(Console.WriteLine, LogLevel.Information)
            .EnableSensitiveDataLogging();

    public DbSet<Foo> Foo { get; set; }
}

public class Foo
{
    public int Id { get; set; }
    public DateTime Departure { get; set; }
}

Output:

info: 1/5/2024 10:28:00.165 RelationalEventId.CommandExecuted[20101] (Microsoft.EntityFrameworkCore.Database.Command) 
      Executed DbCommand (3ms) [Parameters=[], CommandType='Text', CommandTimeout='30']
      SELECT DATEPART(month, [f].[Departure])
      FROM [Foo] AS [f]
Query 1 : 3 results
  Grouping key: { t = 5 }
    5
    5
  Grouping key: { t = 6 }
    6
    6
  Grouping key: { t = 7 }
    7
info: 1/5/2024 10:28:00.207 RelationalEventId.CommandExecuted[20101] (Microsoft.EntityFrameworkCore.Database.Command) 
      Executed DbCommand (1ms) [Parameters=[], CommandType='Text', CommandTimeout='30']
      SELECT DATEPART(month, [t].[Departure]), [t].[c]
      FROM (
          SELECT DATEPART(month, [f].[Departure]) AS [c], [f].[Departure]
          FROM [Foo] AS [f]
      ) AS [t]
      ORDER BY DATEPART(month, [t].[Departure])
Query 2 : 1 results
  Grouping key: { t = 5 }
    5
    5
    6
    6
    7

@roji
Copy link
Member

roji commented Jul 20, 2024

Duplicate of #34256

@roji roji marked this as a duplicate of #34256 Jul 20, 2024
@roji
Copy link
Member

roji commented Jul 20, 2024

Am going to use #34256 to track working around the SQL Server limitation in EF.

@roji roji closed this as not planned Won't fix, can't repro, duplicate, stale Jul 20, 2024
@roji roji removed this from the Backlog milestone Jul 20, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants