Performance degraded when migrated from EF2 to EF6 #27939

Open
EvgenyMuryshkin opened this issue May 3, 2022 · 23 comments

Comments

@EvgenyMuryshkin

File a bug

Performance degraded when migrated from EF2 to EF6

Include your code

Please see the repository for test cases and database setup, plus test cases for AsSplitQuery and the EFPlus Optimized query.

https://github.com/EvgenyMuryshkin/EFCorePerf

Include provider and version information

EF Core version: 6
Database provider: (Microsoft.EntityFrameworkCore.SqlServer)
Target framework: (.NET 6.0)
Operating system: Windows 10 Pro
IDE: (Visual Studio 2022)

@roji
Member

roji commented May 3, 2022

@EvgenyMuryshkin I'd be happy to take a deeper look, but I noticed you've implemented your benchmark as unit tests, without any consideration for warm-up, how to determine iteration counts, or various other benchmarking concerns. For example, the first test that happens to run will perform all the cold-start work, and so appear much slower than the second test (this is because of the lack of warm-up).

I highly recommend doing your benchmarks with BenchmarkDotNet, which takes care of all these problems.
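To illustrate the warm-up point in a language-neutral way, here is a minimal Python sketch (it is not EF Core or BenchmarkDotNet code). `QueryPipeline`, its 50 ms setup cost, and the `timed`/`benchmark` helpers are all hypothetical stand-ins: the first call pays a simulated one-time cost, so timing it without warm-up makes it look far slower than steady-state calls.

```python
import statistics
import time


class QueryPipeline:
    """Hypothetical stand-in for an ORM: the first run of a query pays a one-time setup cost."""

    def __init__(self):
        self._compiled = set()

    def run(self, query):
        if query not in self._compiled:
            time.sleep(0.05)  # simulated one-time cost (model building, query compilation)
            self._compiled.add(query)
        return len(query)


def timed(fn, *args):
    """Return the wall-clock seconds taken by a single call."""
    start = time.perf_counter()
    fn(*args)
    return time.perf_counter() - start


def benchmark(fn, *args, warmup=1, iterations=5):
    """Discard `warmup` calls, then return the median time of `iterations` calls."""
    for _ in range(warmup):
        fn(*args)
    return statistics.median(timed(fn, *args) for _ in range(iterations))


if __name__ == "__main__":
    pipeline = QueryPipeline()
    print("cold call:", timed(pipeline.run, "SELECT 1"))
    print("after warm-up:", benchmark(pipeline.run, "SELECT 1"))
```

A harness such as BenchmarkDotNet handles warm-up and iteration counts automatically; the sketch only shows why the first measurement should not be compared with later ones.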

@EvgenyMuryshkin
Author

EvgenyMuryshkin commented May 3, 2022

@roji I have updated repository with benchmarks and test results, please have a look.

This is kind of the key difference.
I found that in EF2 the first DB call is also slow (EF warm-up, SQL execution plan), but then it works fine for subsequent requests with different query parameters.

In EF6, each call is slow; only a query with the same parameters is fast.

Thanks,
Regards,
Evgeny

@roji
Member

roji commented May 4, 2022

@EvgenyMuryshkin thanks for making the change to BenchmarkDotNet.

Looking at the benchmarks, you seem to be doing a large number of collection joins in a single query. This causes the so-called "cartesian explosion" problem, and it's expected for this to run slowly. We typically recommend switching to split query for this kind of scenario (see this section in our docs).
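The cartesian explosion can be seen at the SQL level with a small self-contained sketch. It uses Python's built-in `sqlite3` and a made-up schema (`unit`, `item`, `label` are hypothetical tables, not the ones in the repro), and the SQL is hand-written for illustration; it is not the SQL that EF Core generates. One parent row has two independent child collections, loaded either through one joined query or through one query per collection.

```python
import sqlite3


def build_db(items=10, labels=20):
    """Create one parent row with two independent child collections (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE unit  (id INTEGER PRIMARY KEY);
        CREATE TABLE item  (id INTEGER PRIMARY KEY, unit_id INTEGER REFERENCES unit(id));
        CREATE TABLE label (id INTEGER PRIMARY KEY, unit_id INTEGER REFERENCES unit(id));
        INSERT INTO unit (id) VALUES (1);
        """
    )
    conn.executemany("INSERT INTO item (unit_id) VALUES (?)", [(1,)] * items)
    conn.executemany("INSERT INTO label (unit_id) VALUES (?)", [(1,)] * labels)
    return conn


def single_query_rows(conn):
    """One statement joining both collections: every item row pairs with every label row."""
    return conn.execute(
        """
        SELECT u.id, i.id, l.id
        FROM unit u
        LEFT JOIN item i ON i.unit_id = u.id
        LEFT JOIN label l ON l.unit_id = u.id
        """
    ).fetchall()


def split_query_rows(conn):
    """One statement for the parent plus one statement per collection."""
    units = conn.execute("SELECT id FROM unit").fetchall()
    items = conn.execute(
        "SELECT u.id, i.id FROM unit u JOIN item i ON i.unit_id = u.id"
    ).fetchall()
    labels = conn.execute(
        "SELECT u.id, l.id FROM unit u JOIN label l ON l.unit_id = u.id"
    ).fetchall()
    return units, items, labels


if __name__ == "__main__":
    conn = build_db()
    print("single query rows:", len(single_query_rows(conn)))
    print("split query rows:", sum(len(rows) for rows in split_query_rows(conn)))
```

With 10 items and 20 labels under one parent, the single query returns 10 × 20 = 200 rows, while the three split queries return 1 + 10 + 20 = 31 rows in total. Each additional collection multiplies the single-query row count but only adds to the split-query total.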

Now, I see that in your summary you address split query, but say that it "snailed along". The SQL query right below doesn't seem to be a split query though, and I can't see any actual benchmark results for that - can you please update your benchmark code and results to use split query, and post the SQL outputted from it?

@EvgenyMuryshkin
Author

EvgenyMuryshkin commented May 4, 2022

@roji the EF6 split query test runs for 13 minutes, compared to 7 seconds for EF2. Do you really need a benchmark for this? I did not paste all the split queries, only the one that is causing the problem.

I understand the cartesian problem; EF2 was able to handle it without problems. As I remember, the problem first appeared in EF3, but I held off upgrading for as long as I could. Now that .NET Core 3 is running out of support, we are forced to upgrade, and the EF issue is still there.

@roji I updated the readme with the full AsSplitQuery log (6 large queries in total for AsSplitQuery).

Thanks,
Regards,
Evgeny

@roji
Member

roji commented May 4, 2022

> Do you really need a benchmark for this?

Well, split query is what you're supposed to be using in EF Core 3+ when many collection includes are present - so it makes sense to benchmark that.

> I understand the cartesian problem; EF2 was able to handle it without problems. As I remember, the problem first appeared in EF3, but I held off upgrading for as long as I could. Now that .NET Core 3 is running out of support, we are forced to upgrade, and the EF issue is still there.

EF Core 2 did not perform a single query (JOINs) for collection includes; it performed a form of split query. Single query was introduced in EF Core 3.0.

@smitpatel can you take a look here? IIRC the EF Core 3+ split query isn't identical to what we were doing before 3, maybe that difference is causing a comparative slowness here?

@EvgenyMuryshkin
Author

EvgenyMuryshkin commented May 4, 2022

@roji sure, will create benchmarks for split and optimized.

Thanks,
Regards,
Evgeny

@EvgenyMuryshkin
Author

EvgenyMuryshkin commented May 4, 2022

@roji I have updated the readme with benchmarks for AsSplitQuery.

Unfortunately, I cannot reproduce the stall in the benchmark; I only managed to see it during the unit test run, where it is effectively ~8 seconds per query (see the section below the EF6 benchmark results).

https://github.com/EvgenyMuryshkin/EFCorePerf/blob/main/AsSplitQuery.stall.txt

Thanks,
Regards,
Evgeny

@roji
Member

roji commented May 4, 2022

@EvgenyMuryshkin are you saying that in the benchmark, the EF Core 6 split query performance is completely fine (comparable to what it was in EF Core 2)? If so, there may be some issue with the way your tests are set up (or in your actual application), or some interference between them that explains the slowdown. I'd advise concentrating on reproducing the perf issue in the benchmark - that may help you find the actual issue in your application.

@EvgenyMuryshkin
Author

EvgenyMuryshkin commented May 4, 2022

@roji it is still a lot slower than EF2: 12 ms (EF2) vs 260 ms (EF6)

@roji
Member

roji commented May 4, 2022

OK, thanks. Probably good for @smitpatel to take a look at the generated SQLs.

@smitpatel
Contributor

Looking at the query ShippingUnitsWithComposites, there are 5 collections in the query. Looking at the logs in the readme file, the EF6 split query generates 6 queries, which is expected. But EF2 generated only 2 queries (even with split mode). I still doubt we are measuring the same thing here. If the performance issue is there, it would be easier to trim this down to much smaller code rather than having 20+ includes. A smaller number of includes should still show the difference.

@EvgenyMuryshkin
Author

EvgenyMuryshkin commented May 5, 2022

@smitpatel Query #4 in the EF6 split query is the most time- and resource-consuming one according to the SQL profiling tool.
I have added markers to the readme.

@EvgenyMuryshkin
Author

@smitpatel I get that the query is complex. The real question is whether something can be done on the EF side to match the performance of EF2, or whether it will stay like this and we have to find workarounds other than using AsSplitQuery.

Thanks
Regards,
Evgeny

@EvgenyMuryshkin
Author

@smitpatel maybe I can somehow replicate the EF2 split logic? I tried adding multiple AsSingleQuery() and AsSplitQuery() calls to the same WithComposites, but it seems to pick up only the last modifier and apply it to the whole query.

@smitpatel
Contributor

That doesn't address my observation above. I am not sure we are comparing exactly the same query between EF2 and EF6 here. If we are not, then no, there is no way for an EF6 query to behave like some different EF2 query.

@EvgenyMuryshkin
Author

EvgenyMuryshkin commented May 5, 2022

@smitpatel The LINQ queries are the same but the generated SQL queries are different, which affects overall performance. How can we proceed from here?

@smitpatel
Contributor

If the generated query count is not the same, then the queries are not the same. Comparing the perf of 2 DbCommands vs 6 DbCommands, the latter will certainly tend to cost more. The way split query is implemented in EF3+, it issues the same number of commands as EF2. The only difference is in the generated SQL, which is an intentional change to allow using that code path for more kinds of queries (specifically queries with Distinct/Skip/Take, which couldn't use split query mode in EF2).

The path forward from here: get a single LINQ query that generates the same number of SQL queries in both versions. You will then be able to inspect the difference in the generated SQL (the intentional change I mentioned above). That will give you an opportunity to understand whether the generated SQL queries are intrinsically slower, or whether it is something EF Core does with the results that is slowing things down.

Even though the repro code uses BDN, there is still too much code to pinpoint whether the queries being run and the results being generated are the same. You need to trim the repro code down to a minimal amount for us to investigate effectively.

@EvgenyMuryshkin
Author

@smitpatel I don't think I can give more on this. I spent the last two weeks trying to pin down the problem. EF6 works fine on two-table joins, and it works reasonably with randomly seeded data in the tables for that schema.

But when it comes to production, it is just slow, with or without the split query modifier. So I had to pack the whole prod database as a test case.

The same LINQ query produces SQL with very different performance between versions; it looks like we need to change the schema then.

Thanks
Regards,
Evgeny

@EvgenyMuryshkin
Author

@smitpatel I might have an opportunity to get back to this in a couple of weeks; we just ran out of time. I will try to find a query that produces similar SQL.

@EvgenyMuryshkin
Author

@smitpatel I have added benchmarks for incrementally more complex versions of this test query.

https://github.com/EvgenyMuryshkin/EFCorePerf/blob/main/query.md

Performance is comparable up to query 48, then EF6 falls behind.

@EvgenyMuryshkin
Author

EvgenyMuryshkin commented May 10, 2022

@smitpatel I don't know what I am looking for.

EF6 produces a completely different join pattern than EF2; how can I create LINQ that produces the same SQL queries?

Please have a look at the single-include SQL comparison.

https://github.com/EvgenyMuryshkin/EFCorePerf/blob/main/diff.md

EF6 Split query looks similar to EF2, is that what you are after?

@smitpatel
Contributor

What are the perf comparisons of EF2 query and EF6 split query?

@EvgenyMuryshkin
Author

EvgenyMuryshkin commented May 10, 2022

@smitpatel with just a single include, EF6 is faster (see the Distilled section at the bottom), but as data gets added, EF6 falls behind, especially with split query.
See here
https://github.com/EvgenyMuryshkin/EFCorePerf/blob/main/query.md
ShippingUnitsWithComposites31AsSplitQuery and ShippingUnitsWithComposites32AsSplitQuery

This is the point at which References are added to the query (query 32).
