Move source snippets to samples
roji committed Dec 14, 2020
1 parent 3b7dfb5 commit 84c08ae
Showing 13 changed files with 433 additions and 146 deletions.
12 changes: 2 additions & 10 deletions entity-framework/core/performance/advanced-performance-topics.md
@@ -46,10 +46,7 @@ When EF receives a LINQ query tree for execution, it must first "compile" that tree

Consider the following two queries:

```csharp
var blog1 = ctx.Blogs.FirstOrDefault(b => b.Name == "blog1");
var blog2 = ctx.Blogs.FirstOrDefault(b => b.Name == "blog2");
```
[!code-csharp[Main](../../../samples/core/Performance/Program.cs#QueriesWithConstants)]

Since the expression trees contain different constants, the trees differ, and each of these queries will be compiled separately by EF Core. In addition, each query produces a slightly different SQL command:

@@ -67,12 +64,7 @@ Because the SQL differs, your database server will likely also need to produce a

A small modification to your queries can change things considerably:

```csharp
var blogName = "blog1";
var blog1 = ctx.Blogs.FirstOrDefault(b => b.Name == blogName);
blogName = "blog2";
var blog2 = ctx.Blogs.FirstOrDefault(b => b.Name == blogName);
```
[!code-csharp[Main](../../../samples/core/Performance/Program.cs#QueriesWithParameterization)]

Since the blog name is now *parameterized*, both queries have the same tree shape, and EF only needs to be compiled once. The SQL produced is also parameterized, allowing the database to reuse the same query plan:

72 changes: 9 additions & 63 deletions entity-framework/core/performance/efficient-querying.md
@@ -13,12 +13,9 @@ Querying efficiently is a vast subject that covers topics as wide-ranging as

The main deciding factor in whether a query runs fast or not is whether it will properly utilize indexes where appropriate: databases are typically used to hold large amounts of data, and queries which traverse entire tables are typically sources of serious performance issues. Indexing issues aren't easy to spot, because it isn't immediately obvious whether a given query will use an index or not. For example:

```csharp
_ = ctx.Blogs.Where(b => b.Name.StartsWith("A")).ToList(); // Uses an index defined on Name on SQL Server
_ = ctx.Blogs.Where(b => b.Name.EndsWith("B")).ToList(); // Does not use the index
```
[!code-csharp[Main](../../../samples/core/Performance/Program.cs#Indexes)]
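
Whether such an index exists at all is determined by model configuration rather than by the query. As a rough sketch (assuming the `Blog` entity used in the snippets above), an index on `Name` could be defined via the Fluent API as follows:

```csharp
public class BloggingContext : DbContext
{
    public DbSet<Blog> Blogs { get; set; }

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        // Define an index on Name, so that filters such as
        // StartsWith("A") can be satisfied without a full table scan.
        modelBuilder.Entity<Blog>()
            .HasIndex(b => b.Name);
    }
}
```

On most relational providers this becomes an ordinary non-unique index when the corresponding migration is applied.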

-The main way the spot indexing issues is to first pinpoint a slow query, and then examine its query plan via your database's favorite tool; see the [performance diagnosis](xref:core/performance/performance-diagnosis) page for more information on how to do that. The query plan displays whether the query traverses the entire table, or uses an index.
+A good way to spot indexing issues is to first pinpoint a slow query, and then examine its query plan via your database's favorite tool; see the [performance diagnosis](xref:core/performance/performance-diagnosis) page for more information on how to do that. The query plan displays whether the query traverses the entire table, or uses an index.

As a general rule, there isn't any special EF knowledge to using indexes or diagnosing performance issues related to them; general database knowledge related to indexes is just as relevant to EF applications as to applications not using EF. The following lists some general guidelines to keep in mind when using indexes:

@@ -31,12 +28,7 @@

EF Core makes it very easy to query out entity instances, and then use those instances in code. However, querying entity instances can frequently pull back more data than necessary from your database. Consider the following:

```csharp
foreach (var blog in ctx.Blogs)
{
Console.WriteLine("Blog: " + blog.Url);
}
```
[!code-csharp[Main](../../../samples/core/Performance/Program.cs#ProjectEntities)]

Although this code only actually needs each Blog's `Url` property, the entire Blog entity is fetched, and unneeded columns are transferred from the database:

@@ -47,12 +39,7 @@ FROM [Blogs] AS [b]

This can be optimized by using `Select` to tell EF which columns to project out:

```csharp
foreach (var blogName in ctx.Blogs.Select(b => b.Url))
{
Console.WriteLine("Blog: " + blogName);
}
```
[!code-csharp[Main](../../../samples/core/Performance/Program.cs#ProjectSingleProperty)]

The resulting SQL pulls back only the needed columns:

@@ -69,22 +56,13 @@ Note that this technique is very useful for read-only queries, but things get more

By default, a query returns all rows that match its filters:

```csharp
var blogs = ctx.Blogs
.Where(b => b.Name.StartsWith("A"))
.ToList();
```
[!code-csharp[Main](../../../samples/core/Performance/Program.cs#NoLimit)]

Since the number of rows returned depends on actual data in your database, it's impossible to know how much data will be loaded from the database, how much memory will be taken up by the results, and how much additional load will be generated when processing these results (e.g. by sending them to a user browser over the network). Crucially, test databases frequently contain little data, so that everything works well while testing, but performance problems suddenly appear when the query starts running on real-world data and many rows are returned.

As a result, it's usually worth giving thought to limiting the number of results:

```csharp
var blogs = ctx.Blogs
.Where(b => b.Name.StartsWith("A"))
.Take(25)
.ToList();
```
[!code-csharp[Main](../../../samples/core/Performance/Program.cs#Limit25)]

At a minimum, your UI could show a message indicating that more rows may exist in the database (and allow retrieving them in some other manner). A full-blown solution would implement *paging*, where your UI only shows a certain number of rows at a time, and allow users to advance to the next page as needed; this typically combines the <xref:System.Linq.Enumerable.Take%2A> and <xref:System.Linq.Enumerable.Skip%2A> operators to select a specific range in the resultset each time.
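
For illustration, one page of results could be fetched roughly as follows (a sketch; `page` and `pageSize` are hypothetical values supplied by the UI, and a stable ordering is needed for paging to be meaningful):

```csharp
var page = 3;       // hypothetical 1-based page number coming from the UI
var pageSize = 25;  // hypothetical page size

var pagedBlogs = ctx.Blogs
    .Where(b => b.Name.StartsWith("A"))
    .OrderBy(b => b.Id)            // stable ordering, so pages don't shift between requests
    .Skip((page - 1) * pageSize)   // skip the rows belonging to earlier pages
    .Take(pageSize)                // fetch a single page of rows
    .ToList();
```

Note that offset-based paging degrades as the offset grows, since the database still has to scan past all skipped rows; keyset ("cursor") paging is a common alternative for deep pages.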

@@ -122,15 +100,7 @@ In other scenarios, we may not know which related entity we're going to need before

Consider the following:

```csharp
foreach (var blog in ctx.Blogs.ToList())
{
foreach (var post in blog.Posts)
{
Console.WriteLine($"Blog {blog.Url}, Post: {post.Title}");
}
}
```
[!code-csharp[Main](../../../samples/core/Performance/Program.cs#NPlusOne)]

This seemingly innocent piece of code iterates through all the blogs and their posts, printing them out. Turning on EF Core's [statement logging](xref:core/logging-events-diagnostics/index) reveals the following:

@@ -162,15 +132,7 @@ What's going on here? Why are all these queries being sent for the simple loops

Assuming we're going to need all of the blogs' posts, it makes sense to use eager loading here instead. We could use the [Include](xref:core/querying/related-data/eager#eager-loading) operator to perform the loading, but since we only need the Blogs' URLs (and we should only [load what's needed](xref:core/performance/efficient-updating#project-only-properties-you-need)), we'll use a projection instead:

```csharp
foreach (var blog in ctx.Blogs.Select(b => new { b.Url, b.Posts }).ToList())
{
foreach (var post in blog.Posts)
{
Console.WriteLine($"Blog {blog.Url}, Post: {post.Title}");
}
}
```
[!code-csharp[Main](../../../samples/core/Performance/Program.cs#EagerlyLoadRelatedAndProject)]

This will make EF Core fetch all the Blogs - along with their Posts - in a single query. In some cases, it may also be useful to avoid cartesian explosion effects by using [split queries](xref:core/querying/single-split-queries).
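
As a sketch, opting into a split query is a matter of adding one operator to the eager-loading version (`AsSplitQuery` is available from EF Core 5.0):

```csharp
var blogs = ctx.Blogs
    .Include(b => b.Posts)
    .AsSplitQuery() // fetch Blogs and Posts with separate SQL queries,
                    // avoiding the row duplication of a single JOIN
    .ToList();
```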

@@ -183,23 +145,7 @@ Buffering refers to loading all your query results into memory, whereas streaming

Whether a query buffers or streams depends on how it is evaluated:

```csharp
// ToList and ToArray cause the entire resultset to be buffered:
var blogsList = context.Blogs.Where(b => b.Name.StartsWith("A")).ToList();
var blogsArray = context.Blogs.Where(b => b.Name.StartsWith("A")).ToArray();

// Foreach streams, processing one row at a time:
foreach (var blog in context.Blogs.Where(b => b.Name.StartsWith("A")))
{
// ...
}

// AsEnumerable also streams, allowing you to execute LINQ operators on the client-side:
var groupedBlogs = context.Blogs
.Where(b => b.Name.StartsWith("A"))
.AsEnumerable()
.Where(b => SomeDotNetMethod(b));
```
[!code-csharp[Main](../../../samples/core/Performance/Program.cs#BufferingAndStreaming)]

If your queries return just a few results, then you probably don't have to worry about this. However, if your query might return large numbers of rows, it's worth giving thought to streaming instead of buffering.
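
As a rough sketch, streaming can also be combined with async enumeration, so rows are processed as they arrive without ever materializing the whole resultset (this assumes the same `Blogs` model as the snippets above):

```csharp
// AsAsyncEnumerable streams results asynchronously; each row can be
// discarded once processed, keeping memory usage flat.
await foreach (var blog in context.Blogs
                   .Where(b => b.Name.StartsWith("A"))
                   .AsAsyncEnumerable())
{
    Console.WriteLine(blog.Url);
}
```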

33 changes: 7 additions & 26 deletions entity-framework/core/performance/efficient-updating.md
Expand Up @@ -11,38 +11,21 @@ uid: core/performance/efficient-updating

EF Core helps minimize roundtrips by automatically batching together all updates in a single roundtrip. Consider the following:

```csharp
var blog = context.Blogs.Single(b => b.Name == "EF Core Blog");
blog.Url = "http://some.new.website";
context.Add(new Blog { Name = "Another blog"});
context.Add(new Blog { Name = "Yet another blog"});
context.SaveChanges();
```
[!code-csharp[Main](../../../samples/core/Performance/Program.cs#SaveChangesBatching)]

The above loads a blog from the database, changes its URL, and then adds two new blogs; to apply this, two SQL INSERT statements and one UPDATE statement are sent to the database. Rather than sending them one by one as the changes are made, EF Core tracks these changes internally, and executes them in a single roundtrip when <xref:Microsoft.EntityFrameworkCore.DbContext.SaveChanges%2A> is called.

The number of statements that EF batches in a single roundtrip depends on the database provider being used. For example, performance analysis has shown batching to be generally less efficient for SQL Server when less than 4 statements are involved. Similarly, the benefits of batching degrade after around 40 statements for SQL Server, so EF Core will by default only execute up to 42 statements in a single batch, and execute additional statements in separate roundtrips.

Users can also tweak these thresholds to achieve potentially higher performance - but benchmark carefully before modifying these:

```csharp
protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
    => optionsBuilder.UseSqlServer(@"...", o => o
        .MinBatchSize(1)
        .MaxBatchSize(100));
```
[!code-csharp[Main](../../../samples/core/Performance/BatchTweakingContext.cs#BatchTweaking)]

## Bulk updates

-Let's assume you want to give all Employees of a certain department a raise. A typical implementation for this in EF Core would look like the following
+Let's assume you want to give all your employees a raise. A typical implementation for this in EF Core would look like the following:

```csharp
foreach (var employee in context.Employees.Where(e => e.Department.Id == 10))
{
employee.Salary += 1000;
}
context.SaveChanges();
```
[!code-csharp[Main](../../../samples/core/Performance/BatchTweakingContext.cs#UpdateWithoutBulk)]

While this is perfectly valid code, let's analyze what it does from a performance perspective:

@@ -53,13 +36,11 @@
Relational databases also support *bulk updates*, so the above could be rewritten as the following single SQL statement:

```sql
-UPDATE [Employees] SET [Salary] = [Salary] + 1000 WHERE [DepartmentId] = 10;
+UPDATE [Employees] SET [Salary] = [Salary] + 1000;
```

-This performs the entire operation in a single roundtrip, without loading or sending any actual data to the database, and without making use of EF's change tracking machinery, which does have an overhead cost.
+This performs the entire operation in a single roundtrip, without loading or sending any actual data to the database, and without making use of EF's change tracking machinery, which imposes an additional overhead.

Unfortunately, EF doesn't currently provide APIs for performing bulk updates. Until these are introduced, you can use raw SQL to perform the operation where performance is sensitive:

```csharp
context.Database.ExecuteSqlRaw("UPDATE [Employees] SET [Salary] = [Salary] + 1000 WHERE [DepartmentId] = {0}", departmentId);
```
[!code-csharp[Main](../../../samples/core/Performance/Program.cs#UpdateWithBulk)]
76 changes: 32 additions & 44 deletions entity-framework/core/performance/performance-diagnosis.md
@@ -17,32 +17,11 @@ EF makes it very easy to capture command execution times, via either [simple log

### [Simple logging](#tab/simple-logging)

```csharp
class MyDbContext
{
protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
{
optionsBuilder
.UseSqlServer(@"Server=(localdb)\mssqllocaldb;Database=Blogging;Integrated Security=True")
.LogTo(Console.WriteLine, LogLevel.Information);
}
}
```
[!code-csharp[Main](../../../samples/core/Performance/BloggingContext.cs#SimpleLogging)]

### [Microsoft.Extensions.Logging](#tab/microsoft-extensions-logging)

```csharp
class MyDbContext
{
static ILoggerFactory ContextLoggerFactory
=> LoggerFactory.Create(b => b.AddConsole().AddFilter("", LogLevel.Information));

protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
=> optionsBuilder
.UseSqlServer(@"Server=(localdb)\mssqllocaldb;Database=Blogging;Integrated Security=True")
.UseLoggerFactory(ContextLoggerFactory);
}
```
[!code-csharp[Main](../../../samples/core/Performance/ExtensionsLoggingContext.cs#ExtensionsLogging)]

***

@@ -65,23 +44,16 @@ The above command took 4 milliseconds. If a certain command takes more than expected

One problem with command execution logging is that it's sometimes difficult to correlate SQL queries and LINQ queries: the SQL commands executed by EF can look very different from the LINQ queries from which they were generated. To help with this difficulty, you may want to use EF's [query tags](xref:core/querying/tags) feature, which allows you to inject a small, identifying comment into the SQL query:

```csharp
var blogs = ctx.Blogs
.TagWith("GetBlogByName")
.Where(b => b.Name == "foo")
.ToList();
```
[!code-csharp[Main](../../../samples/core/Querying/Tags/Program.cs#BasicQueryTag)]

The tag shows up in the logs:

```csharp
info: 06/12/2020 09:25:42.951 RelationalEventId.CommandExecuted[20101] (Microsoft.EntityFrameworkCore.Database.Command)
Executed DbCommand (4ms) [Parameters=[], CommandType='Text', CommandTimeout='30']
-- GetBlogByName

SELECT [b].[Id], [b].[Name]
FROM [Blogs] AS [b]
WHERE [b].[Name] = N'foo'
```

```sql
-- This is my spatial query!

SELECT TOP(@__p_1) [p].[Id], [p].[Location]
FROM [People] AS [p]
ORDER BY [p].[Location].STDistance(@__myLocation_0) DESC
```

It's often worth tagging the major queries of an application in this way, to make the command execution logs more immediately readable.
@@ -129,18 +101,34 @@ As a simple benchmark scenario, let's compare the following different methods of
* Avoid loading the Blog entity instances at all, by projecting out the ranking only. This saves us from transferring the other, unneeded columns of the Blog entity type.
* Calculate the average in the database by making it part of the query. This should be the fastest way, since everything is calculated in the database and only the result is transferred back to the client.

-With BenchmarkDotNet, you write the code to be benchmarked as a simple method - just like a unit test - and BenchmarkDotNet automatically runs each method for a sufficient number of iterations, reliably measuring how long it takes and how much memory is allocated. Here's the benchmark code:
+With BenchmarkDotNet, you write the code to be benchmarked as a simple method - just like a unit test - and BenchmarkDotNet automatically runs each method for a sufficient number of iterations, reliably measuring how long it takes and how much memory is allocated. Here are the different methods ([the full benchmark code can be seen here](https://github.com/dotnet/EntityFramework.Docs/tree/master/samples/core/Benchmarks/AverageBlogRanking.cs)):

### [Load entities](#tab/load-entities)

-[!code-csharp[Main](../../../samples/core/Benchmarks/AverageBlogRanking.cs?name=Benchmarks)]
+[!code-csharp[Main](../../../samples/core/Benchmarks/AverageBlogRanking.cs?name=LoadEntities)]

### [Load entities, no tracking](#tab/load-entities-no-tracking)

[!code-csharp[Main](../../../samples/core/Benchmarks/AverageBlogRanking.cs?name=LoadEntitiesNoTracking)]

### [Project only ranking](#tab/project-only-ranking)

[!code-csharp[Main](../../../samples/core/Benchmarks/AverageBlogRanking.cs?name=ProjectOnlyRanking)]

### [Calculate in database](#tab/calculate-in-database)

[!code-csharp[Main](../../../samples/core/Benchmarks/AverageBlogRanking.cs?name=CalculateInDatabase)]

***

The results are below, as printed by BenchmarkDotNet:

-| Method                  | Mean       | Error    | StdDev   | Median     | Ratio | RatioSD | Gen 0    | Gen 1   | Gen 2 | Allocated  |
-|------------------------ |-----------:|---------:|---------:|-----------:|------:|--------:|---------:|--------:|------:|-----------:|
-| LoadEntities            | 2,860.4 us | 54.31 us | 93.68 us | 2,844.5 us |  4.55 |    0.33 | 210.9375 | 70.3125 |     - | 1309.56 KB |
-| LoadEntitiesNonTracking | 1,353.0 us | 21.26 us | 18.85 us | 1,355.6 us |  2.10 |    0.14 |  87.8906 |  3.9063 |     - |  540.09 KB |
-| ProjectOnlyRanking      |   910.9 us | 20.91 us | 61.65 us |   892.9 us |  1.46 |    0.14 |  41.0156 |  0.9766 |     - |  252.08 KB |
-| CalculateInDatabase     |   627.1 us | 14.58 us | 42.54 us |   626.4 us |  1.00 |    0.00 |   4.8828 |       - |     - |   33.27 KB |
+| Method                 | Mean       | Error    | StdDev   | Median     | Ratio | RatioSD | Gen 0    | Gen 1   | Gen 2 | Allocated  |
+|----------------------- |-----------:|---------:|---------:|-----------:|------:|--------:|---------:|--------:|------:|-----------:|
+| LoadEntities           | 2,860.4 us | 54.31 us | 93.68 us | 2,844.5 us |  4.55 |    0.33 | 210.9375 | 70.3125 |     - | 1309.56 KB |
+| LoadEntitiesNoTracking | 1,353.0 us | 21.26 us | 18.85 us | 1,355.6 us |  2.10 |    0.14 |  87.8906 |  3.9063 |     - |  540.09 KB |
+| ProjectOnlyRanking     |   910.9 us | 20.91 us | 61.65 us |   892.9 us |  1.46 |    0.14 |  41.0156 |  0.9766 |     - |  252.08 KB |
+| CalculateInDatabase    |   627.1 us | 14.58 us | 42.54 us |   626.4 us |  1.00 |    0.00 |   4.8828 |       - |     - |   33.27 KB |

> [!NOTE]
> As the methods instantiate and dispose the context within the method, these operations are counted for the benchmark, although strictly speaking they are not part of the querying process. This should not matter if the goal is to compare two alternatives to one another (since the context instantiation and disposal are the same), and gives a more holistic measurement for the entire operation.
12 changes: 9 additions & 3 deletions samples/core/Benchmarks/AverageBlogRanking.cs
@@ -22,7 +22,7 @@ public void Setup()
context.SeedData();
}

-#region Benchmarks
+#region LoadEntities
[Benchmark]
public double LoadEntities()
{
@@ -37,9 +37,11 @@ public double LoadEntities()

return sum / count;
}
#endregion

#region LoadEntitiesNoTracking
[Benchmark]
-public double LoadEntitiesNonTracking()
+public double LoadEntitiesNoTracking()
{
var sum = 0;
var count = 0;
@@ -52,7 +54,9 @@ public double LoadEntitiesNonTracking()

return sum / count;
}
#endregion

#region ProjectOnlyRanking
[Benchmark]
public double ProjectOnlyRanking()
{
@@ -67,14 +71,16 @@ public double ProjectOnlyRanking()

return sum / count;
}
#endregion

#region CalculateInDatabase
[Benchmark(Baseline = true)]
public double CalculateInDatabase()
{
using var ctx = new BloggingContext();
return ctx.Blogs.Average(b => b.Rating);
}
-#endregion Benchmarks
+#endregion

public class BloggingContext : DbContext
{