Since the expression trees contain different constants, the trees differ and each of these queries will be compiled separately by EF Core. In addition, each query produces a slightly different SQL command:
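For illustration, here is a sketch of two such queries; the `ctx` context, the `Blogs` set, the blog names, and the SQL shown in comments are assumptions rather than the article's original sample:

```csharp
// Two structurally identical queries that embed different constants
// directly in their expression trees:
var blogs1 = ctx.Blogs.Where(b => b.Name == "blog1").ToList();
var blogs2 = ctx.Blogs.Where(b => b.Name == "blog2").ToList();

// Because the constants are part of the trees, the generated SQL differs too,
// roughly along these lines:
//   SELECT ... FROM [Blogs] AS [b] WHERE [b].[Name] = N'blog1'
//   SELECT ... FROM [Blogs] AS [b] WHERE [b].[Name] = N'blog2'
```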
Because the SQL differs, your database server will likely also need to produce a query plan for both queries, rather than reusing the same plan.
A small modification to your queries can change things considerably:
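A sketch of that modification, using the same assumed names as above:

```csharp
// The blog name now comes from a variable, so EF treats it as a parameter
// instead of a constant embedded in the expression tree:
var blogName = "blog1";
var blogs1 = ctx.Blogs.Where(b => b.Name == blogName).ToList();

blogName = "blog2";
var blogs2 = ctx.Blogs.Where(b => b.Name == blogName).ToList();
```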
Since the blog name is now *parameterized*, both queries have the same tree shape, and the query only needs to be compiled once. The SQL produced is also parameterized, allowing the database to reuse the same query plan.
**entity-framework/core/performance/efficient-querying.md** (9 additions, 63 deletions)
Querying efficiently is a vast subject that covers a wide range of topics, several of which are discussed below.
The main deciding factor in whether a query runs fast or not is whether it properly utilizes indexes where appropriate: databases are typically used to hold large amounts of data, and queries which traverse entire tables are a common source of serious performance issues. Indexing issues aren't easy to spot, because it isn't immediately obvious whether a given query will use an index or not. For example:
```csharp
_ = ctx.Blogs.Where(b => b.Name.StartsWith("A")).ToList(); // Uses an index defined on Name on SQL Server
_ = ctx.Blogs.Where(b => b.Name.EndsWith("B")).ToList();   // Does not use the index
```
A good way to spot indexing issues is to first pinpoint a slow query, and then examine its query plan via your favorite database tool; see the [performance diagnosis](xref:core/performance/performance-diagnosis) page for more information on how to do that. The query plan displays whether the query traverses the entire table, or uses an index.
As a general rule, there isn't any special EF knowledge involved in using indexes or diagnosing performance issues related to them; general database knowledge about indexes is just as relevant to EF applications as to applications not using EF. A few general guidelines to keep in mind: indexes speed up queries but slow down updates, so avoid defining indexes that aren't needed, and make sure your indexes match the columns your queries actually filter and order by.
EF Core makes it very easy to query out entity instances, and then use those instances in code. However, querying entity instances can frequently pull back more data than necessary from your database. Consider the following:
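For example, a sketch of such a query (the `ctx` context and `Blogs` set are assumed, as above):

```csharp
// Loads complete Blog entities, even though only the Url is used afterwards.
// The generated SQL selects every mapped column of Blog, e.g.
//   SELECT [b].[Id], [b].[Name], [b].[Url], ... FROM [Blogs] AS [b]
foreach (var blog in ctx.Blogs.ToList())
{
    Console.WriteLine("Blog: " + blog.Url);
}
```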
Although this code only needs each Blog's `Url` property, the entire Blog entity is fetched, and unneeded columns are transferred from the database.
This can be optimized by using `Select` to tell EF which columns to project out:
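A minimal sketch of that projection:

```csharp
// Only the Url column is queried and transferred from the database:
foreach (var url in ctx.Blogs.Select(b => b.Url).ToList())
{
    Console.WriteLine("Blog: " + url);
}
```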
Beyond limiting which columns are fetched, it's also worth thinking about how many rows a query returns. Since the number of rows returned depends on the actual data in your database, it's impossible to know in advance how much data will be loaded from the database, how much memory will be taken up by the results, and how much additional load will be generated when processing these results (e.g. by sending them to a user's browser over the network). Crucially, test databases frequently contain little data, so that everything works well while testing, but performance problems suddenly appear when the query starts running on real-world data and many rows are returned.
As a result, it's usually worth giving thought to limiting the number of results:
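For example, a sketch using `Take` (the limit of 25 is an arbitrary choice for illustration):

```csharp
// Fetch at most 25 blogs, no matter how many rows match the query:
var blogs25 = ctx.Blogs.Take(25).ToList();
```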
At a minimum, your UI could show a message indicating that more rows may exist in the database (and allow retrieving them in some other manner). A full-blown solution would implement *paging*, where your UI only shows a certain number of rows at a time and allows users to advance to the next page as needed; this typically combines the <xref:System.Linq.Enumerable.Take%2A> and <xref:System.Linq.Enumerable.Skip%2A> operators to select a specific range in the resultset each time.
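A sketch of such paging (page size, page number and ordering are illustrative assumptions):

```csharp
var pageSize = 25;
var pageNumber = 3;

// Always order before paging, so that pages are stable between requests:
var page = ctx.Blogs
    .OrderBy(b => b.Name)
    .Skip((pageNumber - 1) * pageSize)
    .Take(pageSize)
    .ToList();
```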
In other scenarios, we may not know which related entity we're going to need before we get its principal entity.
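For example, consider a loop like the following sketch, which relies on lazy loading of each blog's posts (the `Blog`/`Post` shapes are illustrative assumptions):

```csharp
foreach (var blog in ctx.Blogs.ToList())
{
    foreach (var post in blog.Posts) // Lazily loads this blog's posts in a separate query
    {
        Console.WriteLine($"Blog {blog.Url}, Post: {post.Title}");
    }
}
```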
This seemingly innocent piece of code iterates through all the blogs and their posts, printing them out. Turning on EF Core's [statement logging](xref:core/logging-events-diagnostics/index) reveals that a separate query is executed to load the posts of each blog.
What's going on here? Why are all these queries being sent for the simple loops above? With lazy loading, a blog's posts are only loaded when its `Posts` property is accessed, so each iteration of the inner loop triggers an additional database query in its own roundtrip; this is sometimes called the *N+1 problem*.
Assuming we're going to need all of the blogs' posts, it makes sense to use eager loading here instead. We can use the [Include](xref:core/querying/related-data/eager#eager-loading) operator to perform the loading, but since we only need the blogs' URLs (and we should only [load what's needed](xref:core/performance/efficient-updating#project-only-properties-you-need)), we'll use a projection instead:
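A sketch of that projection (the anonymous type shape is an illustrative assumption):

```csharp
// A single query loads each blog's URL together with its posts' titles:
var blogs = ctx.Blogs
    .Select(b => new
    {
        b.Url,
        PostTitles = b.Posts.Select(p => p.Title).ToList()
    })
    .ToList();
```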
This will make EF Core fetch all the Blogs - along with their Posts - in a single query. In some cases, it may also be useful to avoid cartesian explosion effects by using [split queries](xref:core/querying/single-split-queries).
Buffering refers to loading all your query results into memory, whereas streaming means that EF hands the application a single result each time, never holding the entire resultset in memory at once.
Whether a query buffers or streams depends on how it is evaluated:
```csharp
// ToList and ToArray cause the entire resultset to be buffered:
var blogsList = ctx.Blogs.Where(b => b.Name.StartsWith("A")).ToList();
var blogsArray = ctx.Blogs.Where(b => b.Name.StartsWith("A")).ToArray();
```
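For the streaming side, a sketch using the same assumed query shape:

```csharp
// foreach streams, processing one row at a time without buffering the whole resultset:
foreach (var blog in ctx.Blogs.Where(b => b.Name.StartsWith("A")))
{
    // Each Blog instance is handed to the application as it's read from the database.
}

// AsEnumerable also streams, allowing later LINQ operators to run on the client:
var streamed = ctx.Blogs
    .Where(b => b.Name.StartsWith("A")) // Translated to SQL and executed in the database
    .AsEnumerable();
```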
If your queries return just a few results, then you probably don't have to worry about this. However, if your query might return large numbers of rows, it's worth giving thought to streaming instead of buffering.
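Turning to updates, the next paragraph analyzes a piece of code along these lines (a sketch; the entity and property names are assumptions):

```csharp
var blog = ctx.Blogs.Single(b => b.Name == "Some blog");
blog.Name = "Some other name";

ctx.Add(new Blog { Name = "New blog 1" });
ctx.Add(new Blog { Name = "New blog 2" });

ctx.SaveChanges(); // One UPDATE and two INSERTs, sent in a single roundtrip
```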
The above loads a blog from the database, changes its name, and then adds two new blogs; to apply these changes, two SQL INSERT statements and one UPDATE statement are sent to the database. Rather than sending them one by one as the changes are made, EF Core tracks the changes internally and executes them in a single roundtrip when <xref:Microsoft.EntityFrameworkCore.DbContext.SaveChanges%2A> is called.
The number of statements that EF batches in a single roundtrip depends on the database provider being used. For example, performance analysis has shown batching to be generally less efficient for SQL Server when fewer than 4 statements are involved. Similarly, the benefits of batching degrade after around 40 statements for SQL Server, so EF Core will by default only execute up to 42 statements in a single batch, and execute additional statements in separate roundtrips.
Users can also tweak these thresholds to achieve potentially higher performance, but benchmark carefully before modifying them:
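For SQL Server, the thresholds can be configured when setting up the provider; a sketch (the values shown are arbitrary examples):

```csharp
protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
{
    optionsBuilder.UseSqlServer(
        "<connection string>",
        o => o
            .MinBatchSize(1)      // Batch even a single pending statement
            .MaxBatchSize(100));  // Allow up to 100 statements per batch
}
```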
Let's assume you want to give all your employees a raise. A typical implementation for this in EF Core would look like the following:
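A sketch of that implementation (assuming an `Employees` set with a `Salary` property):

```csharp
foreach (var employee in ctx.Employees)
{
    employee.Salary += 1000;
}

ctx.SaveChanges();
```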
While this is perfectly valid code, let's analyze what it does from a performance perspective: a database roundtrip is performed to load all the relevant employees into memory, EF Core's change tracking takes a snapshot of each loaded instance, and then another roundtrip sends an UPDATE statement for every modified row.
Relational databases also support *bulk updates*, so the above could be rewritten as the following single SQL statement:
```sql
UPDATE [Employees] SET [Salary] = [Salary] + 1000;
```
This performs the entire operation in a single roundtrip, without loading or sending any actual data to the database, and without making use of EF's change tracking machinery, which imposes an additional overhead.
Unfortunately, EF doesn't currently provide APIs for performing bulk updates. Until these are introduced, you can use raw SQL to perform the operation where performance is sensitive:
```csharp
context.Database.ExecuteSqlRaw("UPDATE [Employees] SET [Salary] = [Salary] + 1000");
```
When command execution logging is enabled, EF reports how long each command took; the command above took 4 milliseconds. If a certain command takes longer than expected, you've found a possible culprit for a performance issue, and can focus on it to understand why it's running slowly.
One problem with command execution logging is that it's sometimes difficult to correlate SQL queries and LINQ queries: the SQL commands executed by EF can look very different from the LINQ queries from which they were generated. To help with this difficulty, you may want to use EF's [query tags](xref:core/querying/tags) feature, which allows you to inject a small, identifying comment into the SQL query:
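A sketch of tagging a query; the `People` set, the spatial `Location` property, the `myLocation` value and the tag text are illustrative assumptions chosen to match the SQL fragment below:

```csharp
var furthestPeople = ctx.People
    .TagWith("This is my spatial query!")
    .OrderByDescending(p => p.Location.Distance(myLocation))
    .Take(5)
    .ToList();
```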
```sql
-- This is my spatial query!

...
ORDER BY [p].[Location].STDistance(@__myLocation_0) DESC
```
It's often worth tagging the major queries of an application in this way, to make the command execution logs more immediately readable.
As a simple benchmark scenario, let's compare the following different methods of calculating the average ranking of all Blogs in the database:

* Naively load all the Blog entities into the context, and calculate the average over them in .NET.
* Do the same, but with change tracking disabled via `AsNoTracking`.
* Avoid loading the Blog entity instances at all, by projecting out only the ranking. This saves us from transferring the other, unneeded columns of the Blog entity type.
* Calculate the average in the database by making it part of the query. This should be the fastest way, since everything is calculated in the database and only the result is transferred back to the client.
With BenchmarkDotNet, you write the code to be benchmarked as a simple method - just like a unit test - and BenchmarkDotNet automatically runs each method for a sufficient number of iterations, reliably measuring how long it takes and how much memory is allocated. Here are the different methods ([the full benchmark code can be seen here](https://github.com/dotnet/EntityFramework.Docs/tree/master/samples/core/Benchmarks/AverageBlogRanking.cs)):
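A sketch of what such benchmark methods can look like; the `BloggingContext`, the `Blog.Rating` property and the data seeding are assumptions, and the linked sample above is the authoritative version:

```csharp
using System.Linq;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

[MemoryDiagnoser]
public class AverageBlogRanking
{
    [Benchmark]
    public double LoadEntities()
    {
        using var ctx = new BloggingContext();
        // Load full entities and average on the client:
        return ctx.Blogs.ToList().Average(b => b.Rating);
    }

    [Benchmark]
    public double ProjectOnlyRanking()
    {
        using var ctx = new BloggingContext();
        // Only the ranking column is transferred:
        return ctx.Blogs.Select(b => b.Rating).ToList().Average();
    }

    [Benchmark(Baseline = true)]
    public double CalculateInDatabase()
    {
        using var ctx = new BloggingContext();
        // The average is translated to SQL and computed in the database:
        return ctx.Blogs.Average(b => b.Rating);
    }
}

public class Program
{
    public static void Main() => BenchmarkRunner.Run<AverageBlogRanking>();
}
```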
| Method                 | Mean       | Error    | StdDev   | Median     | Ratio | RatioSD | Gen 0    | Gen 1   | Gen 2 | Allocated  |
|------------------------|-----------:|---------:|---------:|-----------:|------:|--------:|---------:|--------:|------:|-----------:|
| LoadEntities           | 2,860.4 us | 54.31 us | 93.68 us | 2,844.5 us |  4.55 |    0.33 | 210.9375 | 70.3125 |     - | 1309.56 KB |
| LoadEntitiesNoTracking | 1,353.0 us | 21.26 us | 18.85 us | 1,355.6 us |  2.10 |    0.14 |  87.8906 |  3.9063 |     - |  540.09 KB |
| ProjectOnlyRanking     |   910.9 us | 20.91 us | 61.65 us |   892.9 us |  1.46 |    0.14 |  41.0156 |  0.9766 |     - |  252.08 KB |
| CalculateInDatabase    |   627.1 us | 14.58 us | 42.54 us |   626.4 us |  1.00 |    0.00 |   4.8828 |       - |     - |   33.27 KB |
> [!NOTE]
> As the methods instantiate and dispose the context within the method, these operations are counted for the benchmark, although strictly speaking they are not part of the querying process. This should not matter if the goal is to compare two alternatives to one another (since the context instantiation and disposal are the same), and gives a more holistic measurement for the entire operation.