Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GroupBy() Min() Limitations #29455

Closed
KPLK opened this issue Oct 30, 2022 · 5 comments
Closed

GroupBy() Min() Limitations #29455

KPLK opened this issue Oct 30, 2022 · 5 comments

Comments

@KPLK
Copy link

KPLK commented Oct 30, 2022

Let's say my database table Agents is the following,

+------------+----------------------+--------------------+------------+-----------------+---------+
| AgentCode  | AgentName            | WorkingArea        | Commission | PhoneNo         | Country |
+------------+----------------------+--------------------+------------+-----------------+---------+
| A007       | Ramasundar           | Bangalore          |       0.15 | 077-25814763    |         |
| A003       | Alex                 | London             |       0.13 | 075-12458969    |         |
| A008       | Alford               | New York           |       0.12 | 044-25874365    |         |
| A011       | Ravi Kumar           | Bangalore          |       0.15 | 077-45625874    |         |
| A010       | Santakumar           | Chennai            |       0.14 | 007-22388644    |         |
| A012       | Lucida               | San Jose           |       0.12 | 044-52981425    |         |
| A005       | Anderson             | Brisban            |       0.13 | 045-21447739    |         |
| A001       | Subbarao             | Bangalore          |       0.14 | 077-12346674    |         |
| A002       | Mukesh               | Mumbai             |       0.11 | 029-12358964    |         |
| A006       | McDen                | London             |       0.15 | 078-22255588    |         |
| A004       | Ivan                 | Torento            |       0.15 | 008-22544166    |         |
| A009       | Benjamin             | Hampshair          |       0.11 | 008-22536178    |         |
+------------+----------------------+--------------------+------------+-----------------+---------+

What I exactly need to query is as the following(in SQL for better understanding).

SELECT WorkingArea, MIN(Commission) 
FROM agents 
GROUP BY WorkingArea;

And its result is:-

WorkingArea                         MIN(Commission)
----------------------------------- ---------------
San Jose                                        .12
Torento                                         .15
London                                          .13
Hampshair                                       .11
New York                                        .12
Brisban                                         .13
Bangalore                                       .14
Chennai                                         .14
Mumbai                                          .11

How can I do the same with .NET Entity Framework MVC?

I tried the following. But it gave me only the MIN(COMMISSION) row.

IEnumerable<Agent> AgentList = _db.Agents
                .GroupBy(fields => fields.WorkingArea)
                .Select(fields => new Agent
                {
                    Commission = fields.Min(fields => fields.Commission)
                });

// If my Model name is Agent its corresponding database table name will be Agent**s**.

1. How can I do the equivalent to the sql query above, with .NET Entity Framework?

2. What if I want add also the PhoneNo column to the result? like

    SELECT WorkingArea, MIN(Commission), PhoneNo 
    FROM agents 
    GROUP BY WorkingArea;
@ErikEJ
Copy link
Contributor

ErikEJ commented Oct 30, 2022

Can you run example 2 as SQL??

@KPLK
Copy link
Author

KPLK commented Oct 30, 2022

Can't in MS SQL Server. But in MySQL I hardly remember it was working.

@Xor-el
Copy link

Xor-el commented Oct 30, 2022

@KPLK

There are 3 approaches at the top of my head to tackle this

the first approach is doing a MinBy inside a Select after the GroupBy, something like the approach below.

IEnumerable<Agent> AgentList = _db.Agents
                .GroupBy(fields => fields.WorkingArea)
                .Select(x => x.MinBy(i => i.Commission))
                .Select(fields => new Agent
                {
                    WorkingArea = fields.WorkingArea,
                    Commission = fields.Commission,
                    PhoneNo = fields.PhoneNo,
                });

unfortunately, the approach above won't work at the moment since EF Core can't translate MinBy directly to SQL at the moment.

second approach which I feel EF Core should be able to translate to SQL involves doing an OrderBy using the key Commission inside a Select and picking the FirstOrDefault after the GroupBy.

IEnumerable<Agent> AgentList = _db.Agents
                .GroupBy(fields => fields.WorkingArea)
                .Select(x => x.OrderBy(i => i.Commission).FirstOrDefault())
                .Select(fields => new Agent
                {
                    WorkingArea = fields.WorkingArea,
                    Commission = fields.Commission,
                    PhoneNo = fields.PhoneNo,
                });

third approach involves applying an accumulator function (Aggregate) over a sequence, I have no idea if EF Core can translate this to SQL.

IEnumerable<Agent> AgentList = _db.Agents
                .GroupBy(fields => fields.WorkingArea)
                .Select(x => x.Aggregate((a, b) => a.Commission < b.Commission ? a : b))
                .Select(fields => new Agent
                {
                    WorkingArea = fields.WorkingArea,
                    Commission = fields.Commission,
                    PhoneNo = fields.PhoneNo,
                });

Personally, I think you should go with the second approach as EF Core has a better chance at translating that to SQL at the moment than the rest.
Do note that the second approach would not translate to exactly the same SQL as you hand-crafted (except EF Core has become sentient 😱). however, it should give you the same result set.

@roji
Copy link
Member

roji commented Oct 30, 2022

@Xor-el none of your suggestions actually translate with EF - it's a good idea to test code suggestions before posting them.

@KPLK:

  1. How can I do the equivalent to the sql query above, with .NET Entity Framework?

To get all working areas with the minimal commission for each working area (without the phone number), simply project out the group key as follows:

var agents = await ctx.Agents
    .GroupBy(agent => agent.WorkingArea)
    .Select(g => new Agent
    {
        WorkingArea = g.Key,
        Commission = g.Min(fields => fields.Commission),
    });
  1. What if I want add also the PhoneNo column to the result? like

This is a very different requirement. Re the SQL you say you're looking for:

SELECT WorkingArea, MIN(Commission), PhoneNo 
FROM agents 
GROUP BY WorkingArea

As written above, that doesn't work on most databases, since you can't project a column without an aggregate function in a query containing GROUP BY. You say that this works on MySQL; but which phone number does it return exactly? I'm pretty sure there's no guarantee it would return the phone number for the agent with the minimal commission (there's no connection between the two projections).

#25566 tracks EF translating MinBy, which as @Xor-el wrote would be appropriate here. Based on suggestions in that issue, here's a query that probably does what you want:

var agents = await ctx.Agents
    // Get the agent with the lowest commission for each working area
    .GroupBy(agent => agent.WorkingArea)
    .Select(g => new { WorkingArea = g.Key, Commission = g.Min(fields => fields.Commission) })

    // Join back to the agents table to get the top agent Id for each { WorkingArea, MinCommission } pair
    .Join(ctx.Agents, wa => wa, agent => new { agent.WorkingArea, agent.Commission }, (wa, agent) => agent)
    .GroupBy(agent => new { agent.WorkingArea, agent.Commission })
    .Select(group => new
    {
        group.Key.WorkingArea,
        MinCommission = group.Key.Commission,
        AgentPhoneNo = group.Max(a => a.PhoneNo)
    })
    .ToListAsync();

This generates the following SQL:

SELECT [a0].[WorkingArea], [a0].[Commission] AS [MinCommission], MAX([a0].[PhoneNo]) AS [AgentPhoneNo]
FROM (
    SELECT [a].[WorkingArea], MIN([a].[Commission]) AS [Commission]
    FROM [Agents] AS [a]
    GROUP BY [a].[WorkingArea]
) AS [t]
INNER JOIN [Agents] AS [a0] ON ([t].[WorkingArea] = [a0].[WorkingArea] OR ([t].[WorkingArea] IS NULL AND [a0].[WorkingArea] IS NULL)) AND [t].[Commission] = [a0].[Commission]
GROUP BY [a0].[WorkingArea], [a0].[Commission]

The above returns the phone number for a random agent which has the minimum commission for a working area.

@roji
Copy link
Member

roji commented Oct 30, 2022

Duplicate of #25566

@roji roji marked this as a duplicate of #25566 Oct 30, 2022
@ajcvickers ajcvickers closed this as not planned Won't fix, can't repro, duplicate, stale Nov 2, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants