Add docs on stateful context pool (#3709)

And rearrange the perf docs a little.

Closes #3706
dotnet · Mar 2, 2022 · a230306
1 parent 0bf3ee3
commit a230306

Showing 34 changed files with 815 additions and 51 deletions.
diff --git a/entity-framework/core/performance/advanced-performance-topics.md b/entity-framework/core/performance/advanced-performance-topics.md
@@ -1,36 +1,25 @@
 ---
 title: Advanced Performance Topics
 description: Advanced performance topics for Entity Framework Core
-author: rick-anderson
-ms.author: riande
-ms.date: 10/21/2021
+author: roji
+ms.date: 1/31/2022
 uid: core/performance/advanced-performance-topics
 ---
 # Advanced Performance Topics
 
 ## DbContext pooling
 
-A `DbContext` is generally a light object: creating and disposing one doesn't involve a database operation, and most applications can do so without any noticeable impact on performance. However, each `DbContext` does set up a various internal services and objects necessary for performing its duties, and the overhead of continuously doing so may be significant in high-performance scenarios. For these cases, EF Core can *pool* your `DbContext` instances: when you dispose your `DbContext`, EF Core resets its state and stores it in an internal pool; when a new instance is next requested, that pooled instance is returned instead of setting up a new one. `DbContext` pooling allows you to pay `DbContext` setup costs only once at program startup, rather than continuously.
+A `DbContext` is generally a light object: creating and disposing one doesn't involve a database operation, and most applications can do so without any noticeable impact on performance. However, each context instance does set up various internal services and objects necessary for performing its duties, and the overhead of continuously doing so may be significant in high-performance scenarios. For these cases, EF Core can *pool* your context instances: when you dispose your context, EF Core resets its state and stores it in an internal pool; when a new instance is next requested, that pooled instance is returned instead of setting up a new one. Context pooling allows you to pay context setup costs only once at program startup, rather than continuously.
 
-Following are the benchmark results for fetching a single row from a SQL Server database running locally on the same machine, with and without `DbContext` pooling. As always, results will change with the number of rows, the latency to your database server and other factors. Importantly, this benchmarks single-threaded pooling performance, while a real-world contended scenario may have different results; benchmark on your platform before making any decisions. [The source code is available here](https://github.com/dotnet/EntityFramework.Docs/tree/main/samples/core/Benchmarks/ContextPooling.cs), feel free to use it as a basis for your own measurements.
-
-|                Method | NumBlogs |     Mean |    Error |   StdDev |   Gen 0 | Gen 1 | Gen 2 | Allocated |
-|---------------------- |--------- |---------:|---------:|---------:|--------:|------:|------:|----------:|
-| WithoutContextPooling |        1 | 701.6 us | 26.62 us | 78.48 us | 11.7188 |     - |     - |  50.38 KB |
-|    WithContextPooling |        1 | 350.1 us |  6.80 us | 14.64 us |  0.9766 |     - |     - |   4.63 KB |
-
-Note that `DbContext` pooling is orthogonal to database connection pooling, which is managed at a lower level in the database driver.
+Note that context pooling is orthogonal to database connection pooling, which is managed at a lower level in the database driver.
 
 ### [With dependency injection](#tab/with-di)
 
 The typical pattern in an ASP.NET Core app using EF Core involves registering a custom <xref:Microsoft.EntityFrameworkCore.DbContext> type into the [dependency injection](/aspnet/core/fundamentals/dependency-injection) container via <xref:Microsoft.Extensions.DependencyInjection.EntityFrameworkServiceCollectionExtensions.AddDbContext%2A>. Then, instances of that type are obtained through constructor parameters in controllers or Razor Pages.
 
-To enable `DbContext` pooling, simply replace `AddDbContext` with <xref:Microsoft.Extensions.DependencyInjection.EntityFrameworkServiceCollectionExtensions.AddDbContextPool%2A>:
+To enable context pooling, simply replace `AddDbContext` with <xref:Microsoft.Extensions.DependencyInjection.EntityFrameworkServiceCollectionExtensions.AddDbContextPool%2A>:
 
-```csharp
-services.AddDbContextPool<BloggingContext>(
-    options => options.UseSqlServer(connectionString));
-```
+[!code-csharp[Main](../../../samples/core/Performance/AspNetContextPooling/Program.cs#AddDbContextPool)]
 
 The `poolSize` parameter of <xref:Microsoft.Extensions.DependencyInjection.EntityFrameworkServiceCollectionExtensions.AddDbContextPool%2A> sets the maximum number of instances retained by the pool (defaults to 1024 in EF Core 6.0, and to 128 in previous versions). Once `poolSize` is exceeded, new context instances are not cached and EF falls back to the non-pooling behavior of creating instances on demand.
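
The reset-and-reuse behavior and the `poolSize` cap described above can be pictured with a minimal sketch. It is not part of this commit and not EF Core's actual implementation: `SketchContext` and `SketchContextPool` are hypothetical stand-ins written in plain C#, assuming only that a returned instance is reset and that instances beyond the cap are simply dropped.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Hypothetical stand-in for a context; NOT an EF Core type.
public sealed class SketchContext
{
    public int TrackedEntities { get; set; }

    // Clears per-use state, mirroring the reset step described in the text.
    public void ResetState() => TrackedEntities = 0;
}

// Hypothetical pool: retains at most poolSize instances; extras are dropped.
public sealed class SketchContextPool
{
    private readonly ConcurrentQueue<SketchContext> _pool = new ConcurrentQueue<SketchContext>();
    private readonly int _poolSize;
    private int _created;

    public SketchContextPool(int poolSize) => _poolSize = poolSize;

    // Number of instances that had to be set up from scratch.
    public int Created => _created;

    public SketchContext Rent()
    {
        // Reuse a pooled instance when one is available.
        if (_pool.TryDequeue(out var pooled))
        {
            return pooled;
        }

        // Otherwise fall back to creating an instance on demand.
        Interlocked.Increment(ref _created);
        return new SketchContext();
    }

    public void Return(SketchContext context)
    {
        context.ResetState();

        // The cap is only approximate under contention; good enough for a sketch.
        if (_pool.Count < _poolSize)
        {
            _pool.Enqueue(context);
        }
    }
}
```

With a pool size of 1, renting, returning and renting again yields the same instance with its state cleared, while a second instance returned to a full pool is discarded and has to be created again later.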
 
@@ -39,24 +28,53 @@ The `poolSize` parameter of <xref:Microsoft.Extensions.DependencyInjection.Entit
 > [!NOTE]
 > Pooling without dependency injection was introduced in EF Core 6.0.
 
-To use `DbContext` pooling without dependency injection, initialize a `PooledDbContextFactory` and request context instances from it:
+To use context pooling without dependency injection, initialize a `PooledDbContextFactory` and request context instances from it:
 
-[!code-csharp[Main](../../../samples/core/Performance/Program.cs#DbContextPoolingWithoutDI)]
+[!code-csharp[Main](../../../samples/core/Performance/Other/Program.cs#DbContextPoolingWithoutDI)]
 
 The `poolSize` parameter of the `PooledDbContextFactory` constructor sets the maximum number of instances retained by the pool (defaults to 1024 in EF Core 6.0, and to 128 in previous versions). Once `poolSize` is exceeded, new context instances are not cached and EF falls back to the non-pooling behavior of creating instances on demand.
 
 ***
 
-### Limitations
+### Benchmarks
 
-`DbContext` pooling has a few limitations on what can be done in the `OnConfiguring` method of the context.
+Following are the benchmark results for fetching a single row from a SQL Server database running locally on the same machine, with and without context pooling. As always, results will change with the number of rows, the latency to your database server and other factors. Importantly, this benchmarks single-threaded pooling performance, while a real-world contended scenario may have different results; benchmark on your platform before making any decisions. [The source code is available here](https://github.com/dotnet/EntityFramework.Docs/tree/main/samples/core/Benchmarks/ContextPooling.cs); feel free to use it as a basis for your own measurements.
+
+|                Method | NumBlogs |     Mean |    Error |   StdDev |   Gen 0 | Gen 1 | Gen 2 | Allocated |
+|---------------------- |--------- |---------:|---------:|---------:|--------:|------:|------:|----------:|
+| WithoutContextPooling |        1 | 701.6 us | 26.62 us | 78.48 us | 11.7188 |     - |     - |  50.38 KB |
+|    WithContextPooling |        1 | 350.1 us |  6.80 us | 14.64 us |  0.9766 |     - |     - |   4.63 KB |
 
-> [!WARNING]
-> Avoid using context pooling in apps that maintain state. For example, private fields in the context that shouldn't be shared across requests. EF Core only resets the state that it is aware of before adding a context instance to the pool.
+### Managing state in pooled contexts
 
-Context pooling works by reusing the same context instance across requests. This means that it's effectively registered as a [Singleton](/aspnet/core/fundamentals/dependency-injection#service-lifetimes) in terms of the instance itself so that it's able to persist.
+Context pooling works by reusing the same context instance across requests (or DI scopes); the context is effectively registered as a [Singleton](/aspnet/core/fundamentals/dependency-injection#service-lifetimes). As a result, special care must be taken when the context involves any state that may change between requests. Crucially, the context's `OnConfiguring` is only invoked once, when the context instance is first created, and so cannot be used to set state which needs to vary (e.g. a tenant ID).
 
-Context pooling is intended for scenarios where the context configuration, which includes services resolved, is fixed between requests. For cases where [Scoped](/aspnet/core/fundamentals/dependency-injection#service-lifetimes) services are required, or configuration needs to be changed, don't use pooling.
+A typical scenario involving context state would be a multi-tenant ASP.NET Core application, where the context instance has a *tenant ID* which is taken into account by queries (see [Global Query Filters](xref:core/querying/filters) for more details). Since the tenant ID needs to change with each web request, we need to go through some extra steps to make it all work with context pooling.
+
+First, register a pooling context factory as a Singleton service, as usual:
+
+[!code-csharp[Main](../../../samples/core/Performance/AspNetContextPoolingWithState/Program.cs#RegisterSingletonContextFactory)]
+
+Next, write a custom context factory which gets a pooled context from the Singleton factory we registered, finds the tenant ID in the web request's `HttpContext`, and injects the ID into the context:
+
+[!code-csharp[Main](../../../samples/core/Performance/AspNetContextPoolingWithState/WeatherForecastScopedFactory.cs#WeatherForecastScopedFactory)]
+
+As written above, pay special attention to where you get the tenant ID from: this is an important aspect of your application's security.
+
+Once we have our custom context factory, register it as a Scoped service:
+
+[!code-csharp[Main](../../../samples/core/Performance/AspNetContextPoolingWithState/Program.cs#RegisterScopedContextFactory)]
+
+Finally, arrange for a context to get injected from our Scoped factory:
+
+[!code-csharp[Main](../../../samples/core/Performance/AspNetContextPoolingWithState/Program.cs#RegisterDbContext)]
+
+At this point, your controllers automatically get injected with a context instance that has the right tenant ID, without having to know anything about it.
+
+The full source code for this sample is available [here](https://github.com/dotnet/EntityFramework.Docs/tree/main/samples/core/Performance/AspNetContextPoolingWithState).
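
The shape of the steps above (a Singleton pooling factory wrapped by a Scoped factory that stamps the tenant ID) can be sketched without any framework dependencies. This is not the sample's code: `TenantContext`, `SingletonPooledFactory` and `ScopedTenantFactory` are hypothetical names, and the tenant ID is passed in directly here rather than being read from an `HttpContext`.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical stand-ins; none of these are EF Core or ASP.NET Core types.
public sealed class TenantContext
{
    public int TenantId { get; set; }
}

// Plays the role of the Singleton pooling factory: hands out reused instances.
public sealed class SingletonPooledFactory
{
    private readonly ConcurrentBag<TenantContext> _pool = new ConcurrentBag<TenantContext>();

    public TenantContext Create() =>
        _pool.TryTake(out var context) ? context : new TenantContext();

    public void Return(TenantContext context) => _pool.Add(context);
}

// Plays the role of the Scoped factory: one per request, aware of that request's tenant.
public sealed class ScopedTenantFactory
{
    private readonly SingletonPooledFactory _pooledFactory;
    private readonly int _tenantId;

    public ScopedTenantFactory(SingletonPooledFactory pooledFactory, int tenantId)
    {
        _pooledFactory = pooledFactory;
        _tenantId = tenantId;
    }

    public TenantContext Create()
    {
        var context = _pooledFactory.Create();

        // Overwrite whatever tenant ID the previous request left on the reused instance.
        context.TenantId = _tenantId;
        return context;
    }
}
```

The key design point is that the varying state is written every time an instance leaves the pool, so a reused instance never carries the previous request's tenant ID into the next one.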
+
+> [!NOTE]
+> Although EF Core takes care of resetting internal state for `DbContext` and its related services, it generally does not reset state in the underlying database driver, which is outside of EF. For example, if you manually open and use a `DbConnection` or otherwise manipulate ADO.NET state, it's up to you to restore that state before returning the context instance to the pool, e.g. by closing the connection. Failure to do so may cause state to get leaked across unrelated requests.
 
 ## Compiled queries
 
@@ -75,11 +93,11 @@ EF supports *compiled queries*, which allow the explicit compilation of a LINQ q
 
 To use compiled queries, first compile a query with <xref:Microsoft.EntityFrameworkCore.EF.CompileAsyncQuery%2A?displayProperty=nameWithType> as follows (use <xref:Microsoft.EntityFrameworkCore.EF.CompileQuery%2A?displayProperty=nameWithType> for synchronous queries):
 
-[!code-csharp[Main](../../../samples/core/Performance/Program.cs#CompiledQueryCompile)]
+[!code-csharp[Main](../../../samples/core/Performance/Other/Program.cs#CompiledQueryCompile)]
 
 In this code sample, we provide EF with a lambda accepting a `DbContext` instance, and an arbitrary parameter to be passed to the query. You can now invoke that delegate whenever you wish to execute the query:
 
-[!code-csharp[Main](../../../samples/core/Performance/Program.cs#CompiledQueryExecute)]
+[!code-csharp[Main](../../../samples/core/Performance/Other/Program.cs#CompiledQueryExecute)]
 
 Note that the delegate is thread-safe, and can be invoked concurrently on different context instances.
 
@@ -94,7 +112,7 @@ When EF receives a LINQ query tree for execution, it must first "compile" that t
 
 Consider the following two queries:
 
-[!code-csharp[Main](../../../samples/core/Performance/Program.cs#QueriesWithConstants)]
+[!code-csharp[Main](../../../samples/core/Performance/Other/Program.cs#QueriesWithConstants)]
 
 Since the expression trees contain different constants, they differ, and each of these queries will be compiled separately by EF Core. In addition, each query produces a slightly different SQL command:
 
@@ -112,7 +130,7 @@ Because the SQL differs, your database server will likely also need to produce a
 
 A small modification to your queries can change things considerably:
 
-[!code-csharp[Main](../../../samples/core/Performance/Program.cs#QueriesWithParameterization)]
+[!code-csharp[Main](../../../samples/core/Performance/Other/Program.cs#QueriesWithParameterization)]
 
 Since the blog name is now *parameterized*, both queries have the same tree shape, and EF only needs to compile it once. The SQL produced is also parameterized, allowing the database to reuse the same query plan:
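
The difference in tree shape between an inline constant and a captured variable can be observed with plain `System.Linq.Expressions`, independently of EF Core. The sketch below is not part of this commit; `Blog` and `QueryShapes` are hypothetical names, and it only inspects the node that the C# compiler places on the right-hand side of the comparison.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical entity used only for this sketch.
public sealed class Blog
{
    public string Name { get; set; } = "";
}

public static class QueryShapes
{
    // The blog name is embedded in the tree as a constant node.
    public static Expression<Func<Blog, bool>> WithConstant() =>
        b => b.Name == "blog1";

    // The blog name is read from a captured variable, so the tree holds a
    // member access on a closure object rather than the string itself.
    public static Expression<Func<Blog, bool>> WithCapturedVariable(string blogName) =>
        b => b.Name == blogName;

    // Returns the kind of node on the right-hand side of the comparison.
    public static ExpressionType RightNodeType(Expression<Func<Blog, bool>> predicate) =>
        ((BinaryExpression)predicate.Body).Right.NodeType;
}
```

`RightNodeType(QueryShapes.WithConstant())` is `ExpressionType.Constant`, whereas `RightNodeType(QueryShapes.WithCapturedVariable("blog1"))` is `ExpressionType.MemberAccess` regardless of the value passed in, which is why two queries that differ only in the captured value share one tree shape.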
 

diff --git a/entity-framework/core/performance/efficient-querying.md b/entity-framework/core/performance/efficient-querying.md
@@ -13,7 +13,7 @@ Querying efficiently is a vast subject, that covers subjects as wide-ranging as
 
 The main deciding factor in whether a query runs fast or not is whether it will properly utilize indexes where appropriate: databases are typically used to hold large amounts of data, and queries which traverse entire tables are typically sources of serious performance issues. Indexing issues aren't easy to spot, because it isn't immediately obvious whether a given query will use an index or not. For example:
 
-[!code-csharp[Main](../../../samples/core/Performance/Program.cs#Indexes)]
+[!code-csharp[Main](../../../samples/core/Performance/Other/Program.cs#Indexes)]
 
 A good way to spot indexing issues is to first pinpoint a slow query, and then examine its query plan via your database's favorite tool; see the [performance diagnosis](xref:core/performance/performance-diagnosis) page for more information on how to do that. The query plan displays whether the query traverses the entire table, or uses an index.
 
@@ -28,7 +28,7 @@ As a general rule, there isn't any special EF knowledge to using indexes or diag
 
 EF Core makes it very easy to query out entity instances, and then use those instances in code. However, querying entity instances can frequently pull back more data than necessary from your database. Consider the following:
 
-[!code-csharp[Main](../../../samples/core/Performance/Program.cs#ProjectEntities)]
+[!code-csharp[Main](../../../samples/core/Performance/Other/Program.cs#ProjectEntities)]
 
 Although this code only actually needs each Blog's `Url` property, the entire Blog entity is fetched, and unneeded columns are transferred from the database:
 
@@ -39,7 +39,7 @@ FROM [Blogs] AS [b]
 
 This can be optimized by using `Select` to tell EF which columns to project out:
 
-[!code-csharp[Main](../../../samples/core/Performance/Program.cs#ProjectSingleProperty)]
+[!code-csharp[Main](../../../samples/core/Performance/Other/Program.cs#ProjectSingleProperty)]
 
 The resulting SQL pulls back only the needed columns:
 
@@ -56,13 +56,13 @@ Note that this technique is very useful for read-only queries, but things get mo
 
 By default, a query returns all rows that match its filters:
 
-[!code-csharp[Main](../../../samples/core/Performance/Program.cs#NoLimit)]
+[!code-csharp[Main](../../../samples/core/Performance/Other/Program.cs#NoLimit)]
 
 Since the number of rows returned depends on actual data in your database, it's impossible to know how much data will be loaded from the database, how much memory will be taken up by the results, and how much additional load will be generated when processing these results (e.g. by sending them to a user browser over the network). Crucially, test databases frequently contain little data, so that everything works well while testing, but performance problems suddenly appear when the query starts running on real-world data and many rows are returned.
 
 As a result, it's usually worth giving thought to limiting the number of results:
 
-[!code-csharp[Main](../../../samples/core/Performance/Program.cs#Limit25)]
+[!code-csharp[Main](../../../samples/core/Performance/Other/Program.cs#Limit25)]
 
 At a minimum, your UI could show a message indicating that more rows may exist in the database (and allow retrieving them in some other manner). A full-blown solution would implement *pagination*, where your UI only shows a certain number of rows at a time, and allows users to advance to the next page as needed; see the next section for more details on how to implement this efficiently.
 
@@ -106,7 +106,7 @@ In other scenarios, we may not know which related entity we're going to need bef
 
 Consider the following:
 
-[!code-csharp[Main](../../../samples/core/Performance/Program.cs#NPlusOne)]
+[!code-csharp[Main](../../../samples/core/Performance/Other/Program.cs#NPlusOne)]
 
 This seemingly innocent piece of code iterates through all the blogs and their posts, printing them out. Turning on EF Core's [statement logging](xref:core/logging-events-diagnostics/index) reveals the following:
 
@@ -138,7 +138,7 @@ What's going on here? Why are all these queries being sent for the simple loops
 
 Assuming we're going to need all of the blogs' posts, it makes sense to use eager loading here instead. We could use the [Include](xref:core/querying/related-data/eager#eager-loading) operator to perform the loading, but since we only need the Blogs' URLs (and we should only [load what's needed](xref:core/performance/efficient-querying#project-only-properties-you-need)), we'll use a projection instead:
 
-[!code-csharp[Main](../../../samples/core/Performance/Program.cs#EagerlyLoadRelatedAndProject)]
+[!code-csharp[Main](../../../samples/core/Performance/Other/Program.cs#EagerlyLoadRelatedAndProject)]
 
 This will make EF Core fetch all the Blogs - along with their Posts - in a single query. In some cases, it may also be useful to avoid cartesian explosion effects by using [split queries](xref:core/querying/single-split-queries).
 
@@ -151,7 +151,7 @@ Buffering refers to loading all your query results into memory, whereas streamin
 
 Whether a query buffers or streams depends on how it is evaluated:
 
-[!code-csharp[Main](../../../samples/core/Performance/Program.cs#BufferingAndStreaming)]
+[!code-csharp[Main](../../../samples/core/Performance/Other/Program.cs#BufferingAndStreaming)]
 
 If your queries return just a few results, then you probably don't have to worry about this. However, if your query might return large numbers of rows, it's worth giving thought to streaming instead of buffering.
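
The buffering versus streaming distinction can be illustrated with LINQ to Objects as an analogy. The sketch below is not part of this commit and does not touch a database: `RowSource` is a hypothetical class whose lazy iterator stands in for a query result, counting how many rows are actually produced.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row source; the iterator stands in for rows arriving from a database.
public static class RowSource
{
    public static int RowsProduced;

    // Lazily yields rows one at a time, counting each row as it is produced.
    public static IEnumerable<int> ReadRows(int total)
    {
        for (var i = 0; i < total; i++)
        {
            RowsProduced++;
            yield return i;
        }
    }

    // Buffering: ToList pulls every row into memory before returning.
    public static int Buffered(int total)
    {
        RowsProduced = 0;
        _ = ReadRows(total).ToList();
        return RowsProduced;
    }

    // Streaming: rows are consumed one by one, and we can stop early.
    public static int StreamedUntil(int total, int stopAfter)
    {
        RowsProduced = 0;
        var seen = 0;
        foreach (var row in ReadRows(total))
        {
            if (++seen == stopAfter)
            {
                break;
            }
        }

        return RowsProduced;
    }
}
```

`RowSource.Buffered(1000)` produces all 1000 rows up front, while `RowSource.StreamedUntil(1000, 3)` produces only 3, since nothing beyond the rows actually consumed is ever materialized.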