Add docs on stateful context pool (#3709)
And rearrange the perf docs a little

Closes #3706
roji authored Mar 2, 2022
1 parent 0bf3ee3 commit a230306
Showing 34 changed files with 815 additions and 51 deletions.
76 changes: 47 additions & 29 deletions entity-framework/core/performance/advanced-performance-topics.md
@@ -1,36 +1,25 @@
---
title: Advanced Performance Topics
description: Advanced performance topics for Entity Framework Core
author: rick-anderson
ms.author: riande
ms.date: 10/21/2021
author: roji
ms.date: 1/31/2022
uid: core/performance/advanced-performance-topics
---
# Advanced Performance Topics

## DbContext pooling

A `DbContext` is generally a light object: creating and disposing one doesn't involve a database operation, and most applications can do so without any noticeable impact on performance. However, each `DbContext` does set up various internal services and objects necessary for performing its duties, and the overhead of continuously doing so may be significant in high-performance scenarios. For these cases, EF Core can *pool* your `DbContext` instances: when you dispose your `DbContext`, EF Core resets its state and stores it in an internal pool; when a new instance is next requested, that pooled instance is returned instead of setting up a new one. `DbContext` pooling allows you to pay `DbContext` setup costs only once at program startup, rather than continuously.
A `DbContext` is generally a light object: creating and disposing one doesn't involve a database operation, and most applications can do so without any noticeable impact on performance. However, each context instance does set up various internal services and objects necessary for performing its duties, and the overhead of continuously doing so may be significant in high-performance scenarios. For these cases, EF Core can *pool* your context instances: when you dispose your context, EF Core resets its state and stores it in an internal pool; when a new instance is next requested, that pooled instance is returned instead of setting up a new one. Context pooling allows you to pay context setup costs only once at program startup, rather than continuously.

Following are the benchmark results for fetching a single row from a SQL Server database running locally on the same machine, with and without `DbContext` pooling. As always, results will change with the number of rows, the latency to your database server and other factors. Importantly, this benchmarks single-threaded pooling performance, while a real-world contended scenario may have different results; benchmark on your platform before making any decisions. [The source code is available here](https://github.com/dotnet/EntityFramework.Docs/tree/main/samples/core/Benchmarks/ContextPooling.cs), feel free to use it as a basis for your own measurements.

| Method | NumBlogs | Mean | Error | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---------------------- |--------- |---------:|---------:|---------:|--------:|------:|------:|----------:|
| WithoutContextPooling | 1 | 701.6 us | 26.62 us | 78.48 us | 11.7188 | - | - | 50.38 KB |
| WithContextPooling | 1 | 350.1 us | 6.80 us | 14.64 us | 0.9766 | - | - | 4.63 KB |

Note that `DbContext` pooling is orthogonal to database connection pooling, which is managed at a lower level in the database driver.
Note that context pooling is orthogonal to database connection pooling, which is managed at a lower level in the database driver.

### [With dependency injection](#tab/with-di)

The typical pattern in an ASP.NET Core app using EF Core involves registering a custom <xref:Microsoft.EntityFrameworkCore.DbContext> type into the [dependency injection](/aspnet/core/fundamentals/dependency-injection) container via <xref:Microsoft.Extensions.DependencyInjection.EntityFrameworkServiceCollectionExtensions.AddDbContext%2A>. Then, instances of that type are obtained through constructor parameters in controllers or Razor Pages.

To enable `DbContext` pooling, simply replace `AddDbContext` with <xref:Microsoft.Extensions.DependencyInjection.EntityFrameworkServiceCollectionExtensions.AddDbContextPool%2A>:
To enable context pooling, simply replace `AddDbContext` with <xref:Microsoft.Extensions.DependencyInjection.EntityFrameworkServiceCollectionExtensions.AddDbContextPool%2A>:

```csharp
services.AddDbContextPool<BloggingContext>(
options => options.UseSqlServer(connectionString));
```
[!code-csharp[Main](../../../samples/core/Performance/AspNetContextPooling/Program.cs#AddDbContextPool)]

The `poolSize` parameter of <xref:Microsoft.Extensions.DependencyInjection.EntityFrameworkServiceCollectionExtensions.AddDbContextPool%2A> sets the maximum number of instances retained by the pool (defaults to 1024 in EF Core 6.0, and to 128 in previous versions). Once `poolSize` is exceeded, new context instances are not cached and EF falls back to the non-pooling behavior of creating instances on demand.
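
As a minimal sketch of the registration described above, the following console-style program registers a pooled context with an explicit `poolSize` and resolves it from a DI scope. The connection string, the value `256`, and the `BloggingContext`/`Blog` types are assumptions made for illustration, not part of the docs sample.

```csharp
using Microsoft.EntityFrameworkCore;
using Microsoft.Extensions.DependencyInjection;

var services = new ServiceCollection();

// Hypothetical connection string; replace it with your own.
const string connectionString =
    "Server=(localdb)\\mssqllocaldb;Database=Blogging;Trusted_Connection=True";

// Register the context with pooling, retaining at most 256 instances (illustrative value).
services.AddDbContextPool<BloggingContext>(
    options => options.UseSqlServer(connectionString),
    poolSize: 256);

using var provider = services.BuildServiceProvider();
using var scope = provider.CreateScope();

// The scope rents a context from the pool; disposing the scope hands it back.
var context = scope.ServiceProvider.GetRequiredService<BloggingContext>();

// Illustrative context type. Pooling requires a single public constructor
// that accepts DbContextOptions.
public class BloggingContext : DbContext
{
    public BloggingContext(DbContextOptions<BloggingContext> options)
        : base(options)
    {
    }

    public DbSet<Blog> Blogs => Set<Blog>();
}

public class Blog
{
    public int Id { get; set; }
    public string Url { get; set; } = "";
}
```

Because configuration is supplied through the options passed to the constructor, nothing in this sketch depends on per-request state, which is the case pooling is designed for.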

@@ -39,24 +28,53 @@ The `poolSize` parameter of <xref:Microsoft.Extensions.DependencyInjection.Entit
> [!NOTE]
> Pooling without dependency injection was introduced in EF Core 6.0.
To use `DbContext` pooling without dependency injection, initialize a `PooledDbContextFactory` and request context instances from it:
To use context pooling without dependency injection, initialize a `PooledDbContextFactory` and request context instances from it:

[!code-csharp[Main](../../../samples/core/Performance/Program.cs#DbContextPoolingWithoutDI)]
[!code-csharp[Main](../../../samples/core/Performance/Other/Program.cs#DbContextPoolingWithoutDI)]

The `poolSize` parameter of the `PooledDbContextFactory` constructor sets the maximum number of instances retained by the pool (defaults to 1024 in EF Core 6.0, and to 128 in previous versions). Once `poolSize` is exceeded, new context instances are not cached and EF falls back to the non-pooling behavior of creating instances on demand.
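
A rough sketch of the non-DI pattern, assuming EF Core 6.0 or later: build the options once, create a `PooledDbContextFactory`, and request short-lived contexts from it. The connection string, the pool size of `256`, and the `BloggingContext`/`Blog` types are illustrative assumptions.

```csharp
using System.Linq;
using Microsoft.EntityFrameworkCore;
using Microsoft.EntityFrameworkCore.Infrastructure;

// Hypothetical connection string; replace it with your own.
var options = new DbContextOptionsBuilder<BloggingContext>()
    .UseSqlServer(@"Server=(localdb)\mssqllocaldb;Database=Blogging;Trusted_Connection=True")
    .Options;

// Create the factory once and keep it for the lifetime of the application.
var factory = new PooledDbContextFactory<BloggingContext>(options, poolSize: 256);

// Each unit of work rents a context; disposing it returns the instance to the pool.
using (var context = factory.CreateDbContext())
{
    var blogCount = context.Blogs.Count();
}

// Illustrative context type with the single options constructor that pooling requires.
public class BloggingContext : DbContext
{
    public BloggingContext(DbContextOptions<BloggingContext> options)
        : base(options)
    {
    }

    public DbSet<Blog> Blogs => Set<Blog>();
}

public class Blog
{
    public int Id { get; set; }
    public string Url { get; set; } = "";
}
```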

***

### Limitations
### Benchmarks

`DbContext` pooling has a few limitations on what can be done in the `OnConfiguring` method of the context.
Following are the benchmark results for fetching a single row from a SQL Server database running locally on the same machine, with and without context pooling. As always, results will change with the number of rows, the latency to your database server and other factors. Importantly, this benchmarks single-threaded pooling performance, while a real-world contended scenario may have different results; benchmark on your platform before making any decisions. [The source code is available here](https://github.com/dotnet/EntityFramework.Docs/tree/main/samples/core/Benchmarks/ContextPooling.cs); feel free to use it as a basis for your own measurements.

| Method | NumBlogs | Mean | Error | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---------------------- |--------- |---------:|---------:|---------:|--------:|------:|------:|----------:|
| WithoutContextPooling | 1 | 701.6 us | 26.62 us | 78.48 us | 11.7188 | - | - | 50.38 KB |
| WithContextPooling | 1 | 350.1 us | 6.80 us | 14.64 us | 0.9766 | - | - | 4.63 KB |

> [!WARNING]
> Avoid using context pooling in apps that maintain state. For example, private fields in the context that shouldn't be shared across requests. EF Core only resets the state that it is aware of before adding a context instance to the pool.
### Managing state in pooled contexts

Context pooling works by reusing the same context instance across requests. This means that it's effectively registered as a [Singleton](/aspnet/core/fundamentals/dependency-injection#service-lifetimes) in terms of the instance itself so that it's able to persist.
Context pooling works by reusing the same context instance across requests; the context is effectively registered as a [Singleton](/aspnet/core/fundamentals/dependency-injection#service-lifetimes), and the same instance is reused across multiple requests (or DI scopes). Special care must therefore be taken when the context involves any state that may change between requests. Crucially, the context's `OnConfiguring` is only invoked once - when the context instance is first created - and so cannot be used to set state which needs to vary (e.g. a tenant ID).

Context pooling is intended for scenarios where the context configuration, which includes services resolved, is fixed between requests. For cases where [Scoped](/aspnet/core/fundamentals/dependency-injection#service-lifetimes) services are required, or configuration needs to be changed, don't use pooling.
A typical scenario involving context state would be a multi-tenant ASP.NET Core application, where the context instance has a *tenant ID* which is taken into account by queries (see [Global Query Filters](xref:core/querying/filters) for more details). Since the tenant ID needs to change with each web request, we need to go through some extra steps to make it all work with context pooling.

First, register a pooling context factory as a Singleton service, as usual:

[!code-csharp[Main](../../../samples/core/Performance/AspNetContextPoolingWithState/Program.cs#RegisterSingletonContextFactory)]

Next, write a custom context factory which gets a pooled context from the Singleton factory we registered, finds the tenant ID in the web request's `HttpContext`, and injects the ID into the context:

[!code-csharp[Main](../../../samples/core/Performance/AspNetContextPoolingWithState/WeatherForecastScopedFactory.cs#WeatherForecastScopedFactory)]

As written above, pay special attention to where you get the tenant ID from: this is an important aspect of your application's security.

Once we have our custom context factory, register it as a Scoped service:

[!code-csharp[Main](../../../samples/core/Performance/AspNetContextPoolingWithState/Program.cs#RegisterScopedContextFactory)]

Finally, arrange for a context to get injected from our Scoped factory:

[!code-csharp[Main](../../../samples/core/Performance/AspNetContextPoolingWithState/Program.cs#RegisterDbContext)]

At this point, your controllers automatically get injected with a context instance that has the right tenant ID, without having to know anything about it.

The full source code for this sample is available [here](https://github.com/dotnet/EntityFramework.Docs/tree/main/samples/core/Performance/AspNetContextPoolingWithState).
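
The steps above can be sketched end to end as follows. This is an approximation under stated assumptions rather than the linked sample: the `tenant_id` claim name, the `-1` fallback, the connection string name, and the shape of `WeatherForecastContext` and `WeatherForecast` are all hypothetical.

```csharp
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Http;
using Microsoft.EntityFrameworkCore;
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.DependencyInjection;

var builder = WebApplication.CreateBuilder(args);
builder.Services.AddHttpContextAccessor();

// The pooling factory itself is a Singleton.
builder.Services.AddPooledDbContextFactory<WeatherForecastContext>(
    o => o.UseSqlServer(builder.Configuration.GetConnectionString("WeatherForecastContext")));

// The tenant-aware wrapper is Scoped, so one is created per request.
builder.Services.AddScoped<WeatherForecastScopedFactory>();

// Anything that asks for WeatherForecastContext gets one from the Scoped factory.
builder.Services.AddScoped(
    sp => sp.GetRequiredService<WeatherForecastScopedFactory>().CreateDbContext());

var app = builder.Build();
app.Run();

public class WeatherForecastScopedFactory : IDbContextFactory<WeatherForecastContext>
{
    private const int DefaultTenantId = -1; // hypothetical "no tenant" value

    private readonly IDbContextFactory<WeatherForecastContext> _pooledFactory;
    private readonly int _tenantId;

    public WeatherForecastScopedFactory(
        IDbContextFactory<WeatherForecastContext> pooledFactory,
        IHttpContextAccessor httpContextAccessor)
    {
        _pooledFactory = pooledFactory;

        // Assumption: the tenant ID comes from an authenticated claim named "tenant_id".
        var claim = httpContextAccessor.HttpContext?.User.FindFirst("tenant_id")?.Value;
        _tenantId = int.TryParse(claim, out var id) ? id : DefaultTenantId;
    }

    public WeatherForecastContext CreateDbContext()
    {
        // Rent a pooled context, then set the per-request state on it.
        var context = _pooledFactory.CreateDbContext();
        context.TenantId = _tenantId;
        return context;
    }
}

public class WeatherForecastContext : DbContext
{
    public WeatherForecastContext(DbContextOptions<WeatherForecastContext> options)
        : base(options)
    {
    }

    // Per-request state, assigned by the Scoped factory after renting.
    public int TenantId { get; set; }

    public DbSet<WeatherForecast> Forecasts => Set<WeatherForecast>();

    // The global query filter reads TenantId from the current context instance.
    protected override void OnModelCreating(ModelBuilder modelBuilder)
        => modelBuilder.Entity<WeatherForecast>().HasQueryFilter(f => f.TenantId == TenantId);
}

public class WeatherForecast
{
    public int Id { get; set; }
    public int TenantId { get; set; }
    public string Summary { get; set; } = "";
}
```

The state is set on every rental rather than in `OnConfiguring`, since the latter runs only when the instance is first created. Reading the tenant ID from an authenticated claim instead of user-controlled input is one way to address the security concern mentioned above.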

> [!NOTE]
> Although EF Core takes care of resetting internal state for `DbContext` and its related services, it generally does not reset state in the underlying database driver, which is outside of EF. For example, if you manually open and use a `DbConnection` or otherwise manipulate ADO.NET state, it's up to you to restore that state before returning the context instance to the pool, e.g. by closing the connection. Failure to do so may cause state to get leaked across unrelated requests.
## Compiled queries

@@ -75,11 +93,11 @@ EF supports *compiled queries*, which allow the explicit compilation of a LINQ q

To use compiled queries, first compile a query with <xref:Microsoft.EntityFrameworkCore.EF.CompileAsyncQuery%2A?displayProperty=nameWithType> as follows (use <xref:Microsoft.EntityFrameworkCore.EF.CompileQuery%2A?displayProperty=nameWithType> for synchronous queries):

[!code-csharp[Main](../../../samples/core/Performance/Program.cs#CompiledQueryCompile)]
[!code-csharp[Main](../../../samples/core/Performance/Other/Program.cs#CompiledQueryCompile)]

In this code sample, we provide EF with a lambda accepting a `DbContext` instance, and an arbitrary parameter to be passed to the query. You can now invoke that delegate whenever you wish to execute the query:

[!code-csharp[Main](../../../samples/core/Performance/Program.cs#CompiledQueryExecute)]
[!code-csharp[Main](../../../samples/core/Performance/Other/Program.cs#CompiledQueryExecute)]

Note that the delegate is thread-safe, and can be invoked concurrently on different context instances.
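
A small sketch of compiling a query once and invoking the resulting delegate. The `BloggingContext` and `Blog` types, the connection string, the URL prefix and the length value are assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.EntityFrameworkCore;

await using var context = new BloggingContext();

// Invoke the compiled delegate with a context instance and the query parameter.
await foreach (var blog in CompiledQueries.BlogsByUrlLength(context, 30))
{
    Console.WriteLine(blog.Url);
}

public static class CompiledQueries
{
    // Compiled once, when the static field is initialized, and reused on every call.
    public static readonly Func<BloggingContext, int, IAsyncEnumerable<Blog>> BlogsByUrlLength =
        EF.CompileAsyncQuery(
            (BloggingContext context, int length) =>
                context.Blogs.Where(b => b.Url.StartsWith("http://") && b.Url.Length == length));
}

public class BloggingContext : DbContext
{
    public DbSet<Blog> Blogs => Set<Blog>();

    // Hypothetical connection string; replace it with your own.
    protected override void OnConfiguring(DbContextOptionsBuilder options)
        => options.UseSqlServer(@"Server=(localdb)\mssqllocaldb;Database=Blogging;Trusted_Connection=True");
}

public class Blog
{
    public int Id { get; set; }
    public string Url { get; set; } = "";
}
```

Storing the delegate in a static field is a design choice of this sketch: it keeps compilation out of the hot path, and a single delegate can serve many context instances.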

@@ -94,7 +112,7 @@ When EF receives a LINQ query tree for execution, it must first "compile" that t

Consider the following two queries:

[!code-csharp[Main](../../../samples/core/Performance/Program.cs#QueriesWithConstants)]
[!code-csharp[Main](../../../samples/core/Performance/Other/Program.cs#QueriesWithConstants)]

Since the expression trees contain different constants, the trees differ and each of these queries will be compiled separately by EF Core. In addition, each query produces a slightly different SQL command:

@@ -112,7 +130,7 @@ Because the SQL differs, your database server will likely also need to produce a

A small modification to your queries can change things considerably:

[!code-csharp[Main](../../../samples/core/Performance/Program.cs#QueriesWithParameterization)]
[!code-csharp[Main](../../../samples/core/Performance/Other/Program.cs#QueriesWithParameterization)]
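
To make the contrast concrete, here is a hedged sketch that places the constant form and the parameterized form side by side. The `Blog.Name` property, the blog names and the connection string are assumptions for illustration.

```csharp
using System.Linq;
using Microsoft.EntityFrameworkCore;

using var context = new BloggingContext();

// Constants embedded in the lambda: each expression tree is different,
// so each query is compiled separately.
var constant1 = context.Blogs.FirstOrDefault(b => b.Name == "blog1");
var constant2 = context.Blogs.FirstOrDefault(b => b.Name == "blog2");

// A captured variable is sent as a parameter: both executions share one tree shape.
var blogName = "blog1";
var parameterized1 = context.Blogs.FirstOrDefault(b => b.Name == blogName);

blogName = "blog2";
var parameterized2 = context.Blogs.FirstOrDefault(b => b.Name == blogName);

public class BloggingContext : DbContext
{
    public DbSet<Blog> Blogs => Set<Blog>();

    // Hypothetical connection string; replace it with your own.
    protected override void OnConfiguring(DbContextOptionsBuilder options)
        => options.UseSqlServer(@"Server=(localdb)\mssqllocaldb;Database=Blogging;Trusted_Connection=True");
}

public class Blog
{
    public int Id { get; set; }
    public string Name { get; set; } = "";
}
```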

Since the blog name is now *parameterized*, both queries have the same tree shape, and EF only needs to compile the query once. The SQL produced is also parameterized, allowing the database to reuse the same query plan:

16 changes: 8 additions & 8 deletions entity-framework/core/performance/efficient-querying.md
@@ -13,7 +13,7 @@ Querying efficiently is a vast subject, that covers subjects as wide-ranging as

The main deciding factor in whether a query runs fast or not is whether it will properly utilize indexes where appropriate: databases are typically used to hold large amounts of data, and queries which traverse entire tables are typically sources of serious performance issues. Indexing issues aren't easy to spot, because it isn't immediately obvious whether a given query will use an index or not. For example:

[!code-csharp[Main](../../../samples/core/Performance/Program.cs#Indexes)]
[!code-csharp[Main](../../../samples/core/Performance/Other/Program.cs#Indexes)]
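
As an illustration of how two similar-looking queries can differ in index usage, consider the sketch below. It assumes an index on `Post.Title` and a hypothetical connection string; whether an index is actually used depends on your database and should be confirmed from the query plan.

```csharp
using System.Linq;
using Microsoft.EntityFrameworkCore;

using var context = new BloggingContext();

// A prefix match can typically be served by an index on Title.
var startsWithA = context.Posts.Where(p => p.Title.StartsWith("A")).ToList();

// A suffix match usually cannot use that index and may scan the whole table.
var endsWithA = context.Posts.Where(p => p.Title.EndsWith("A")).ToList();

public class BloggingContext : DbContext
{
    public DbSet<Post> Posts => Set<Post>();

    // Hypothetical connection string; replace it with your own.
    protected override void OnConfiguring(DbContextOptionsBuilder options)
        => options.UseSqlServer(@"Server=(localdb)\mssqllocaldb;Database=Blogging;Trusted_Connection=True");

    // Assumption for this sketch: Title is indexed.
    protected override void OnModelCreating(ModelBuilder modelBuilder)
        => modelBuilder.Entity<Post>().HasIndex(p => p.Title);
}

public class Post
{
    public int Id { get; set; }
    public string Title { get; set; } = "";
}
```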

A good way to spot indexing issues is to first pinpoint a slow query, and then examine its query plan via your database's favorite tool; see the [performance diagnosis](xref:core/performance/performance-diagnosis) page for more information on how to do that. The query plan displays whether the query traverses the entire table, or uses an index.

@@ -28,7 +28,7 @@ As a general rule, there isn't any special EF knowledge to using indexes or diag

EF Core makes it very easy to query out entity instances, and then use those instances in code. However, querying entity instances can frequently pull back more data than necessary from your database. Consider the following:

[!code-csharp[Main](../../../samples/core/Performance/Program.cs#ProjectEntities)]
[!code-csharp[Main](../../../samples/core/Performance/Other/Program.cs#ProjectEntities)]

Although this code only actually needs each Blog's `Url` property, the entire Blog entity is fetched, and unneeded columns are transferred from the database:

@@ -39,7 +39,7 @@ FROM [Blogs] AS [b]

This can be optimized by using `Select` to tell EF which columns to project out:

[!code-csharp[Main](../../../samples/core/Performance/Program.cs#ProjectSingleProperty)]
[!code-csharp[Main](../../../samples/core/Performance/Other/Program.cs#ProjectSingleProperty)]
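
The difference between loading whole entities and projecting a single property might look like the following sketch. The entity shape and connection string are assumptions for illustration.

```csharp
using System;
using System.Linq;
using Microsoft.EntityFrameworkCore;

using var context = new BloggingContext();

// Loads every column of every Blog, although only Url is used.
foreach (var blog in context.Blogs)
{
    Console.WriteLine("Blog: " + blog.Url);
}

// Projects only the Url column, so less data is transferred from the database.
foreach (var blogUrl in context.Blogs.Select(b => b.Url))
{
    Console.WriteLine("Blog: " + blogUrl);
}

public class BloggingContext : DbContext
{
    public DbSet<Blog> Blogs => Set<Blog>();

    // Hypothetical connection string; replace it with your own.
    protected override void OnConfiguring(DbContextOptionsBuilder options)
        => options.UseSqlServer(@"Server=(localdb)\mssqllocaldb;Database=Blogging;Trusted_Connection=True");
}

public class Blog
{
    public int Id { get; set; }
    public string Url { get; set; } = "";
    public string Name { get; set; } = "";
}
```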

The resulting SQL pulls back only the needed columns:

@@ -56,13 +56,13 @@ Note that this technique is very useful for read-only queries, but things get mo

By default, a query returns all rows that match its filters:

[!code-csharp[Main](../../../samples/core/Performance/Program.cs#NoLimit)]
[!code-csharp[Main](../../../samples/core/Performance/Other/Program.cs#NoLimit)]

Since the number of rows returned depends on actual data in your database, it's impossible to know how much data will be loaded from the database, how much memory will be taken up by the results, and how much additional load will be generated when processing these results (e.g. by sending them to a user browser over the network). Crucially, test databases frequently contain little data, so that everything works well while testing, but performance problems suddenly appear when the query starts running on real-world data and many rows are returned.

As a result, it's usually worth giving thought to limiting the number of results:

[!code-csharp[Main](../../../samples/core/Performance/Program.cs#Limit25)]
[!code-csharp[Main](../../../samples/core/Performance/Other/Program.cs#Limit25)]
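
A minimal sketch of an unbounded query next to a bounded one. The filter, the ordering column and the connection string are illustrative assumptions; the limit of 25 mirrors the sample name.

```csharp
using System.Linq;
using Microsoft.EntityFrameworkCore;

using var context = new BloggingContext();

// Unbounded: the number of rows loaded depends entirely on the data.
var allMatches = context.Posts
    .Where(p => p.Title.StartsWith("A"))
    .ToList();

// Bounded: at most 25 rows are loaded. Ordering first keeps the result deterministic.
var first25 = context.Posts
    .Where(p => p.Title.StartsWith("A"))
    .OrderBy(p => p.Id)
    .Take(25)
    .ToList();

public class BloggingContext : DbContext
{
    public DbSet<Post> Posts => Set<Post>();

    // Hypothetical connection string; replace it with your own.
    protected override void OnConfiguring(DbContextOptionsBuilder options)
        => options.UseSqlServer(@"Server=(localdb)\mssqllocaldb;Database=Blogging;Trusted_Connection=True");
}

public class Post
{
    public int Id { get; set; }
    public string Title { get; set; } = "";
}
```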

At a minimum, your UI could show a message indicating that more rows may exist in the database (and allow retrieving them in some other manner). A full-blown solution would implement *pagination*, where your UI only shows a certain number of rows at a time, and allows users to advance to the next page as needed; see the next section for more details on how to implement this efficiently.

@@ -106,7 +106,7 @@ In other scenarios, we may not know which related entity we're going to need bef

Consider the following:

[!code-csharp[Main](../../../samples/core/Performance/Program.cs#NPlusOne)]
[!code-csharp[Main](../../../samples/core/Performance/Other/Program.cs#NPlusOne)]

This seemingly innocent piece of code iterates through all the blogs and their posts, printing them out. Turning on EF Core's [statement logging](xref:core/logging-events-diagnostics/index) reveals the following:

@@ -138,7 +138,7 @@ What's going on here? Why are all these queries being sent for the simple loops

Assuming we're going to need all of the blogs' posts, it makes sense to use eager loading here instead. We can use the [Include](xref:core/querying/related-data/eager#eager-loading) operator to perform the loading, but since we only need the Blogs' URLs (and we should only [load what's needed](xref:core/performance/efficient-querying#project-only-properties-you-need)), we'll use a projection instead:

[!code-csharp[Main](../../../samples/core/Performance/Program.cs#EagerlyLoadRelatedAndProject)]
[!code-csharp[Main](../../../samples/core/Performance/Other/Program.cs#EagerlyLoadRelatedAndProject)]
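
A sketch of the projection-based approach, assuming a `Blog` with a `Posts` collection navigation and a hypothetical connection string. The projection pulls each blog's URL together with its posts, so no per-blog follow-up queries are issued.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.EntityFrameworkCore;

using var context = new BloggingContext();

// Project the URL and the related posts together instead of lazily loading posts per blog.
foreach (var blog in context.Blogs.Select(b => new { b.Url, b.Posts }).ToList())
{
    foreach (var post in blog.Posts)
    {
        Console.WriteLine($"Blog {blog.Url}, Post: {post.Title}");
    }
}

public class BloggingContext : DbContext
{
    public DbSet<Blog> Blogs => Set<Blog>();
    public DbSet<Post> Posts => Set<Post>();

    // Hypothetical connection string; replace it with your own.
    protected override void OnConfiguring(DbContextOptionsBuilder options)
        => options.UseSqlServer(@"Server=(localdb)\mssqllocaldb;Database=Blogging;Trusted_Connection=True");
}

public class Blog
{
    public int Id { get; set; }
    public string Url { get; set; } = "";
    public List<Post> Posts { get; } = new();
}

public class Post
{
    public int Id { get; set; }
    public string Title { get; set; } = "";
}
```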

This will make EF Core fetch all the Blogs - along with their Posts - in a single query. In some cases, it may also be useful to avoid cartesian explosion effects by using [split queries](xref:core/querying/single-split-queries).

@@ -151,7 +151,7 @@ Buffering refers to loading all your query results into memory, whereas streamin

Whether a query buffers or streams depends on how it is evaluated:

[!code-csharp[Main](../../../samples/core/Performance/Program.cs#BufferingAndStreaming)]
[!code-csharp[Main](../../../samples/core/Performance/Other/Program.cs#BufferingAndStreaming)]
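
The following sketch contrasts the evaluation styles. `IsInteresting` is a hypothetical client-side helper, and the entity shape and connection string are assumptions for illustration.

```csharp
using System;
using System.Linq;
using Microsoft.EntityFrameworkCore;

using var context = new BloggingContext();

// ToList and ToArray buffer: the entire result set is held in memory at once.
var postsList = context.Posts.Where(p => p.Title.StartsWith("A")).ToList();
var postsArray = context.Posts.Where(p => p.Title.StartsWith("A")).ToArray();

// foreach streams: rows are processed one at a time as they arrive.
foreach (var post in context.Posts.Where(p => p.Title.StartsWith("A")))
{
    Console.WriteLine(post.Title);
}

// AsEnumerable also streams, and lets later operators run on the client.
var doubleFiltered = context.Posts
    .Where(p => p.Title.StartsWith("A")) // translated to SQL and run in the database
    .AsEnumerable()
    .Where(p => IsInteresting(p));       // run in .NET as each row is streamed

foreach (var post in doubleFiltered)
{
    Console.WriteLine(post.Title);
}

// Hypothetical client-side predicate that cannot be translated to SQL.
static bool IsInteresting(Post post) => post.Title.Length % 2 == 0;

public class BloggingContext : DbContext
{
    public DbSet<Post> Posts => Set<Post>();

    // Hypothetical connection string; replace it with your own.
    protected override void OnConfiguring(DbContextOptionsBuilder options)
        => options.UseSqlServer(@"Server=(localdb)\mssqllocaldb;Database=Blogging;Trusted_Connection=True");
}

public class Post
{
    public int Id { get; set; }
    public string Title { get; set; } = "";
}
```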

If your queries return just a few results, then you probably don't have to worry about this. However, if your query might return large numbers of rows, it's worth giving thought to streaming instead of buffering.
