Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Large DbContext Startup Time #9347

Closed
SergeyBarskiy opened this issue Aug 7, 2017 · 32 comments
Closed

Large DbContext Startup Time #9347

SergeyBarskiy opened this issue Aug 7, 2017 · 32 comments
Labels
area-perf closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. type-enhancement
Milestone

Comments

@SergeyBarskiy
Copy link

SergeyBarskiy commented Aug 7, 2017

Describe what is not working as expected.

Large DbContext (1000) tables takes about 40 seconds before the first query. Any plans to add view generation at build time to EF Core, just like EF 6 has?
Thanks.

Further technical details

EF Core version: 1.0.4
Database Provider: Microsoft.EntityFrameworkCore.SqlServer
Operating system: Windows 10
IDE: Visual Studio 2017

@ErikEJ
Copy link
Contributor

ErikEJ commented Aug 7, 2017

Are you able to share your DbContext?

@divega
Copy link
Contributor

divega commented Aug 7, 2017

@AndriySvyryd are there any significant improvements for this in 2.0?

@SergeyBarskiy what exact version are you using?

@SergeyBarskiy
Copy link
Author

@divega 1.0.4 right now.
@ErikEJ I reverse engineered our main product database, so I am not sure. I could check with my boss on legal aspects of doing this. Ordinarily we require NDA in place. Is this a must?

@ErikEJ
Copy link
Contributor

ErikEJ commented Aug 7, 2017

It will most likely help the team find the hotspots to have a repro. Have you tested with 2.0 ( you wrote 2.0 in your initial message) ?

@AndriySvyryd
Copy link
Member

@divega There haven't been any specific significant perf improvements in model building in 2.0

@SergeyBarskiy
Copy link
Author

@ErikEJ Typo. I am testing at work, and I do not have 2.0 on my box there, like I do at home. In any case to reveal the schema I have to get a permission from legal. I could test on 2.0 if anyone would like to know the results.

@SergeyBarskiy
Copy link
Author

Also, back to original question, is view pre-generation a thing for EF Core?

@divega
Copy link
Contributor

divega commented Aug 12, 2017

@SergeyBarskiy we have played with the idea of "compiled models" (see #1906) which would be a way to produce an artifact or set of artifacts at design time or compile time which would include (but not be limited to) the pre-computed code first model. The feature is currently in the backlog.

I could test on 2.0 if anyone would like to know the results.

It would absolutely be great if you could try your model with 2.0 (doing so should be easier very soon).

If it turns out to still be too slow and you could share the model (it can be privately through email) so that we can profile it please let us know.

I am going to close the issue for now but if there is any follow up action feel free to re-activate.

@divega divega closed this as completed Aug 12, 2017
@divega divega added the closed-no-further-action The issue is closed and no further action is planned. label Aug 12, 2017
@SergeyBarskiy
Copy link
Author

I got an OK to share compiled model, @divega . If you provide email address, I can send it to you.
Thanks.

@divega
Copy link
Contributor

divega commented Aug 15, 2017

Thanks @SergeyBarskiy. My email is like my alias here + @ + the company I work for + .com.

Didn’t you have a chance to test the performance with 2.0 final (we released it yesterday).

@ajcvickers
Copy link
Member

@divega Re-opened and assigned to you for now so we don't forget about this. We can re-triage based on the results of looking at the repro.

@ajcvickers ajcvickers added type-investigation and removed closed-no-further-action The issue is closed and no further action is planned. labels Aug 16, 2017
@SergeyBarskiy
Copy link
Author

SergeyBarskiy commented Aug 17, 2017

I emailed you the project, @divega
Things have gotten worse in 2.0 from 1.0, I went from 40 seconds to 2 minutes. memory usage was pretty high too, over 110MB.

Here is Program.cs listing:

using Microsoft.EntityFrameworkCore;
using System.Linq;
using System;
using System.Diagnostics;

namespace EFLargeContext
{
    class Program
    {
        static void Main(string[] args)
        {
            var builder = new DbContextOptionsBuilder<MyDbContext>();
            builder.UseSqlServer("Server=.;Database=Edge-Sql;Trusted_Connection=True;MultipleActiveResultSets=true");
            var timer = new System.Diagnostics.Stopwatch();
            timer.Start();
            using (var ctx = new MyContext(builder.Options))
            {
                timer.Stop();
                Console.WriteLine($"Total ms to create {timer.ElapsedMilliseconds}");

                timer.Start();
                var setting = ctx.Settings.FirstOrDefault(s => s.Settingid == "06600547-5D1D-475C-98AB-BB22E519C2E9");
                timer.Stop();
                Console.WriteLine($"Total ms to first query {timer.ElapsedMilliseconds}");
                timer.Reset();

                timer.Start();
                setting = ctx.Settings.FirstOrDefault(s => s.Settingid == "06D8A69C-E5B2-4004-88E8-C4BB0562448D");
                timer.Stop();
                Console.WriteLine($"Total ms to second query {timer.ElapsedMilliseconds}");
                timer.Reset();


                timer.Start();
                setting.Stringvalue = "x";
                ctx.SaveChanges();
                timer.Stop();
                Console.WriteLine($"Total ms to first save {timer.ElapsedMilliseconds}");
                timer.Reset();ac

                timer.Start();
                setting.Stringvalue = "y";
                ctx.SaveChanges();
                timer.Stop();
                Console.WriteLine($"Total ms to second save {timer.ElapsedMilliseconds}");
                timer.Reset();
            }
        }
    }
}

Here is the output from dotnet run
Total ms to create 4993
Total ms to first query 126078
Total ms to second query 17
Total ms to first save 121
Total ms to second save 10

@divega divega removed their assignment Aug 17, 2017
@divega divega removed this from the 2.1.0 milestone Aug 17, 2017
@divega
Copy link
Contributor

divega commented Aug 17, 2017

@ajcvickers I got the repro model from @SergeyBarskiy. This can now be assigned in triage to someone that is going to perform the actual investigation and then I can forward the repro. Thanks.

@smitpatel smitpatel self-assigned this Aug 17, 2017
@smitpatel
Copy link
Contributor

@divega - Can you send me project?

@ajcvickers ajcvickers added this to the 2.1.0 milestone Aug 18, 2017
@smitpatel
Copy link
Contributor

@SergeyBarskiy Can you share create table script for INSTOREITEMTRANSFER table. From what I suspect you generated this model from database using reverse engineering and there are bugs in rev eng causing decimal/numeric types to be scaffolded incorrectly. I would like to repro those bugs too.

@SergeyBarskiy
Copy link
Author

SergeyBarskiy commented Aug 22, 2017

Sure. Here you go, Smit.

CREATE TABLE [dbo].[INSTOREITEMTRANSFER](
	[INSTOREITEMTRANSFERID] [char](36) NOT NULL,
	[INBINIDFROM] [char](36) NOT NULL,
	[INBINIDTO] [char](36) NOT NULL,
	[TRANSFERQUANTITY] [numeric](38, 5) NOT NULL,
	[ITEMID] [char](36) NOT NULL,
 CONSTRAINT [PK_INStoreItemTransfer] PRIMARY KEY CLUSTERED 
(
	[INSTOREITEMTRANSFERID] ASC
))

@smitpatel
Copy link
Contributor

@SergeyBarskiy - Sorry to ask for more info
Can you share table creation script for table BLLICENSEWFSTEP & CATRANSACTION?

@ajcvickers ajcvickers modified the milestones: 2.1.0-preview1, 2.1.0 Jan 17, 2018
@divega divega modified the milestones: 2.1.0-preview2, 2.1.0 Apr 2, 2018
AndriySvyryd added a commit that referenced this issue Apr 2, 2018
Memoize DisplayName() for Type
Call Attribute.IsDefined before GetCustomAttributes()
Use in parameters for TypeMappingInfo and RelationalTypeMappingInfo
Implement IEquatable<T> on TypeMappingInfo and RelationalTypeMappingInfo

Fixes #9347
AndriySvyryd added a commit that referenced this issue Apr 2, 2018
Memoize DisplayName() for Type
Call Attribute.IsDefined before GetCustomAttributes()
Use in parameters for TypeMappingInfo and RelationalTypeMappingInfo
Implement IEquatable<T> on TypeMappingInfo and RelationalTypeMappingInfo

Fixes #9347
AndriySvyryd added a commit that referenced this issue Apr 2, 2018
Memoize DisplayName() for Type
Call Attribute.IsDefined before GetCustomAttributes()
Use in parameters for TypeMappingInfo and RelationalTypeMappingInfo
Implement IEquatable<T> on TypeMappingInfo and RelationalTypeMappingInfo

Fixes #9347
AndriySvyryd added a commit that referenced this issue Apr 4, 2018
Memoize DisplayName() for Type
Call Attribute.IsDefined before GetCustomAttributes()
Use in parameters for TypeMappingInfo and RelationalTypeMappingInfo
Implement IEquatable<T> on TypeMappingInfo and RelationalTypeMappingInfo

Fixes #9347
AndriySvyryd added a commit that referenced this issue Apr 4, 2018
Memoize DisplayName() for Type
Call Attribute.IsDefined before GetCustomAttributes()
Use in parameters for TypeMappingInfo and RelationalTypeMappingInfo
Implement IEquatable<T> on TypeMappingInfo and RelationalTypeMappingInfo

Fixes #9347
@AndriySvyryd AndriySvyryd added closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. and removed type-investigation labels Apr 4, 2018
@AndriySvyryd
Copy link
Member

We've improved the model building perf a bit, but your best bet would be waiting for Compiled Models #1906

@AndriySvyryd AndriySvyryd removed their assignment Apr 4, 2018
@jemiller0
Copy link

One thing that I found that helped me with my large model which has about 1,400 tables was to use the fluent API overloads that accept a string instead of a lambda function as in the following.

            modelBuilder.Entity<A21IndirectCostRecoveryAccount>().Property(nameof(A21IndirectCostRecoveryAccount.Id)).HasColumnType("DECIMAL(10, 0)");

I was having problems running out of stack space when I used lambdas because I had so many method calls in OnModelCreating(). I forget what the exact difference was, but, in addition to cutting the memory footprint down, it also sped things up. My model initializes in about 10 seconds now. There were times in EF 6 when it was more like 2 minutes. If you have lots of fluent API calls, using strings seems to be the way to go as far as I can tell.

@ajcvickers
Copy link
Member

@jemiller0 Thanks for the info. @divega @bricelam @AndriySvyryd Should we consider a reverse engineering mode that generates string-based calls (using nameof)? Also, this may be worth thinking about for the compiled model.

@bricelam
Copy link
Contributor

bricelam commented May 8, 2018

We need templates.

@jemiller0
Copy link

I just double checked to make sure I was remembering things correctly. I have about 1,300 entities. I think switching to using string version of the method calls mainly only helped cut down the memory footprint. I just tried checking the performance. Two runs using lambdas took 00:00:13.2868370 and 00:00:13.5996700. Two runs using strings took 00:00:11.7494176 and 00:00:11.2098554. So, it's not much of an improvement. Mainly, I made that switch to get around running out of stack space.

@jemiller0
Copy link

One thing I've always wondered is whether it would be possible to make it so that EF lazy initializes the model as needed? I guess you would need to have different methods containing the fluent API calls for each table, and call those as the tables are referenced. I don't know NHibernate does it, but, it starts up with no lag at all. It seems like initialization is doing a push, when it should be doing a pull as needed. I'm assuming doing something like that would be a major architectural change.

@divega
Copy link
Contributor

divega commented May 8, 2018

Should we consider a reverse engineering mode that generates string-based calls (using nameof)?

Two runs using lambdas took 00:00:13.2868370 and 00:00:13.5996700. Two runs using strings took 00:00:11.7494176 and 00:00:11.2098554.

Something I have seen in the past (with EF6) is attribute-based code first mapping being significantly faster than fluent mappings for large models. The hypothesis was that this was because of the creation and parsing of expressions so I would also have expected the string versions to be faster as well. But I wonder if there are any cases in which we could either leverage an existing attribute or come up with a new attribute to make things cheaper. I am not even sure we or the compiler team have done any targeted effort to optimize the lambda expression path.

@jemiller0
Copy link

On my model, I use attributes where I can and use the fluent APIs where I can't. I would prefer to use attributes for everything. I'm using my own code generator now. From what I remember the EF built in one uses the fluent API. It would be nice to allow a user to choose which they prefer. I found the attributes easier to use because they are at the same place in the source code, right next to the properties. It would be nice if some of the attributes could be enhanced to support things like foreign keys that have more than one column. Personally, I don't like composite keys and can understand trying to keep the attributes simple though. I'm dealing with a legacy database though that has a lot of composite keys. I think one other problem that I ran into was certain providers defaulting things like decimals to really wide values in the database. So, I am explicitly setting the SQL data type for those. Maybe I could change those to using attributes and it would speed it up. There are a few places, where I have conditional code where I check to see what provider is being used. For example, if it's an XML field, it sets the SQL data type to XML for SQL Server or PostgreSQL, but, sets it to a CLOB for MySQL since it doesn't have an XML data type. I always wondered if the attributes were what was slowing things down compared to NHibernate, which uses XML mapping files by default. I never tried NHibernate using attributes. From what I've heard, reflection is slow. So, I wondered if that was one of the issues. It is interesting that using attributes would be faster than using the fluent API.

@divega divega changed the title Large DbContext Startup Time and View Generation Large DbContext Startup Time Sep 18, 2018
@ajcvickers ajcvickers modified the milestones: 2.1.0-rc1, 2.1.0 Nov 11, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-perf closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. type-enhancement
Projects
None yet
Development

No branches or pull requests

8 participants