-
Notifications
You must be signed in to change notification settings - Fork 228
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
How to deal with array_agg(column1, column2) in Entity Framework and Postgres? #2631
Comments
I'm not fully understanding what you're looking to do - can you please show the current C# code and query you're using, and the full query you'd like to see? It's difficult to understand from snippets and partial info exactly you're looking for. |
Well there is not much code to show. I wasn't able to create a really fast query with Entity Framework, since this would be to complex.
To get a fast db query I decided split up the things using CTE. One issues is, that two tables have large number of rows. With a normal join it would work, but the result set would grow. So I decided to aggregate both parts by creating arrays and to join via Any() Operator . One side has a array of assigned roleIds which are matiching to the roleId of the table. And to have one row per resource I did a second aggregate on permissions and scopes. The pro on this solution is, that number of rows and that means also number of objects to be mapped does not grow very fast with a larger DB. Without aggregate the result set would have in the worst case 500000 objects or more depending on the passed parameters. |
Well the only other solution what just come up in my mind is creating a user defined type. That should work with Entity Framework. Would be cool if EF could deal with that aggregate objects. Just by adding a mapping to the model like this:
or in case of an Array something like
|
Is there a way to map user defined types in EF7? |
There's a lot going on in your comments, I'm still not sure I understand exactly what it is you're trying to do (the overall goal as opposed to "how I can do X specifically"). But here are some answers:
First, it seems you're trying to avoid using SQL JOIN because of the data duplication it creates (e.g. joining Blogs with Posts duplicated a Blog for each Post it has). I'd advise doing this only if you've confirmed that the duplication causes an actual performance problem in your specific scenario, rather than doing any sort of premature optimization. See the EF docs on this. Assuming you do need to avoid JOIN, the usual EF way to do this is via split querying. I'd again advise considering this, and only looking into advanced PG-specific techniques if you actually need to. I do aim to add the possibility to use array_agg to load collection includes on PG - that's #1691. But that issue hasn't been implemented yet.
You can include your custom type in your EF model (but e.g. exclude it from migrations to prevent EF from creating a table); at that point you can use SQL querying and map the results back to it (docs). This should allow you to write whatever SQL you want and have EF materialize the results back to your type. Note that we've just merged an enhancement for EF 8.0 that allows you to do this without including your type in the EF model (docs).
Version 7.0 of the provider does support translating array_agg, at least in some scenarios, e.g.: _ = ctx.Posts
.GroupBy(p => p.Blog)
.Select(g => EF.Functions.ArrayAgg(g.Select(p => p.Id)))
.ToArray(); However, you cannot currently do ArrayAgg over a record/tuple, I've opened #2633 to track this. You also can't translate unnest at this point.
I'm not sure how a CTE helps making the query faster; note that non-recursive CTEs can be rewritten as a simple subquery, so there's nothing you should be unable to do without them. Regardless, EF can't currently generate CTEs (WITH), this is tracked by dotnet/efcore#26486. |
Thank you very much that sounds good. Not quite sure how split query is implemented. Does it execute each sub query, wait for the response and do the next one? One thought on using CTE was that I avoid data transfer between Client and DB. |
@CarstenK1 yes, at this point split queries are executed in separate roundtrips, although this is something we want to fix soon (dotnet/efcore#10878). This situation indeed isn't ideal, but I'd make sure it's enough of a problem for you before doing something more complicated. Note that again, non-recursive CTEs can generally be expressed as subqueries (not that there's anything wrong with using them, if you're writing your own SQL). |
I've created a custom object to get the result of an Raw SQL. I used JSONB to aggregate some data. The object is defined like this:
This works good, but than I noticed that using array_agg is much faster and I changed the query to use that aggregation function like this:
Select array_agg((code, permission_scope_ids)) ..... that creates an Array of Tupels <string,int[]>. The return datatype is record[]. Now I try to match that to the Entity Framework. The NPGSQL support that Datatype. I've tried using object[] or record as Type, but does not work. Can entity Framework map that to an Array of Objects?
The text was updated successfully, but these errors were encountered: