-
Notifications
You must be signed in to change notification settings - Fork 545
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Query compilation performance #430
Query compilation performance #430
Conversation
@ajcvickers @divega is it possible to get some feedback on this PR? Thanks! |
@cjpearson I am sorry that we haven't been responsive on this PR or on any of the other PRs for EF6. The team has been focusing completely on EF Core releases, but we expect to be able to do a push for making some progress on EF6.3 soon. |
@cjpearson this reminds me of 'context.Configuration.UseDatabaseNullSemantics'. Does setting this flag to true affect the compilation cost for your queries? |
Thanks for the update. I ran some quick tests with |
@cjpearson Some thoughts on each of the points above:
|
@ajcvickers Thanks for the feedback. Making The BitVec implementation is almost exactly the same as the .NET implementation, so I'm pretty confident of its behavior. The only major change besides exposing the bits was trying to use an ArrayPool, but I'm not sure the benefits were worth the risk there. Subsumes and MoveNext were the only parts of VarVec that changed so more tests there might help. Unfortunately the hashing turned into a dead end for me. Any improvements I found also broke the compilation. I don't think there's any quick improvements here. Do you have any thoughts on the VarMap changes? |
I have/had the same problem. I merged your changes into the 6.2.0 branch and my query compile time went down from ~4 minutes to ~10s :-) |
@ajcvickers Will any of this be available in EF 6.3? |
@chrfin I plan to see what I can merge, so I hope so, but that's not a guarantee. |
@ajcvickers Thanks for your reply. At least the configuration option for |
@ajcvickers Any update on this with EF 6.3 Preview 6? |
@chrfin No update for preview 6. |
@cjpearson Thanks for the PR and sorry for the delay in merging. This is now in. |
@ajcvickers Is this available in a (signed) nightly somewhere? https://www.myget.org/gallery/aspnetwebstacknightly is 6 month old... |
@bricelam Nightly builds for EF6? |
This reverts commit 2483303.
* lower MaxNodeCountForTransformations * use bitwise operations for VarVec.MoveNext and VarVec.Subsumes * Store reverseMap instead of calculating it each time
Is this released yet? |
This has been released as part of the 6.4.4 release in 2020-05-12. |
(This is more of an issue than a pull request, but I wanted to share some code along with it. It's certainly not ready to merge.)
We've been having performance issues with large LINQ queries and EntityFramework. Some large or complex queries take a very long time to compile. They're cached so we only encounter this penalty once but with some queries taking several minutes to compile it's a very noticeable impact. The best workaround we've found so far is splitting the queries into several smaller queries and combining the results. This brings the total compilation time down to several seconds.
I've tried running the compilation through a profiler to find out what's causing the slowness and made a few changes to improve the compilation speed. I haven't tested them much beyond the unit tests, but based on my test query the results are promising.
51,000 node query
Before changes: 2+ minutes
With changes: ~10 seconds
With changes and lower (10,000) MaxNodeCountForTransformations: ~1 second
They're in a very rough state currently, but I'd like to get your thoughts to see if any of them are worth pursuing further.
Lower MaxNodeCountForTransformations
This one definitely had the biggest impact. Even with all the other changes, the compilation was still taking over ten seconds with 90% of the time spent applying these transformations. The most expensive part of the transformations were the IsNull rules. I wasn't comfortable with changing the logic of these transformations so I stuck to lower-level changes, but if there are any algorithmic changes to these transformations it would be a big help.
Changes to VarVec to operate directly on the underlying int[]
With VarVec sizes in the thousands, a lot of the CPU time was being spent in MoveNext and Subsumes in the VarVec class. I had some success optimizing these functions by doing bit operations on the underlying ints instead of iterating through bits. In this example I've added BitVec, a modified version of the BitArray class, but the same changes could also be done using reflection.
Store the Reversed VarMap
Instead of iterating through the dictionary each time GetReverseMap is called, store a reversed version. As part of this change, VarMap implements IDictionary instead of inheriting from Dictionary.
Other possibilities:
There were a couple other places I looked at but wasn't able to find good solutions for.
Change SubTreeId hashing
Each call to insert averaged about a thousand calls to SubTreeId.Equals. This suggested there were a lot of hash collisions for the query. I'm not sure if there's better way to hash while making sure equivalent SubTreeIds always have the same hash.
NodeInfo GetHashValue
A lot of time was also spent in the GetHashValue method of NodeInfo. Perhaps there may be efficient implementation of this method.
Do any of these options seem like candidates to be added to EF? Or are there any other ways to improve the query compilation step? Thanks!