Consider changing the default handling of parameterized collections (back to constantization, RECOMPILE, inline collection of parameters...) #34347

Open
roji opened this issue Aug 3, 2024 · 4 comments

Comments

@roji
Member

roji commented Aug 3, 2024

In 8.0, we changed the default translation for parameterized lists from constantization to parameterization (e.g. with OPENJSON); this is better for plan/query caching (single SQL), but can produce bad query plans (#32394). For 9, we're adding extensive means to control the parameterization vs. constantization behavior, both at the global context options level (#34344) and at the per-collection level (EF.Constant, EF.Parameter (#34345)).

For 10, we plan to reexamine what the default should be. The problem with the current full parameterization is that the bad plans it occasionally causes can create a significant perf regression, with queries that previously executed near-instantaneously (with constants) now taking seconds or more. In contrast, the performance issues caused by constantization are much more constant and limited (continuous re-planning, and query cache bloat leading to possibly less buffer cache memory and premature eviction of other query plans).

In addition, the main problem with constantization is caused by SQL Server's behavior of caching query plans on the very first execution. Other databases do not typically do this (e.g. in PostgreSQL, Npgsql can be configured to prepare/cache only after X executions), and SQL Server itself has the RECOMPILE query hint for not caching, as well as an ad-hoc workload mode (see also this blog post), which makes SQL Server behave better around single-use queries (but this is a global server option). So the plan cache bloat seems like it can be solved for SQL Server (and may not exist for other databases); the remaining issue is only the constant replanning caused by different SQLs, but this is likely to be minor compared to the better plans produced thanks to the database's visibility into the cardinality.

So a comprehensive cross-database investigation into both plan caching/bloat and the impact of parameterization/constantization is needed here.

Note that instead of returning to full constantization by default (pre-8.0), we could choose to switch to inline collection of parameters.

To summarize, here are our options:

-- Full parameterization (current translation):
SELECT * FROM foo WHERE x IN (SELECT v FROM OPENJSON(@values))

-- Simple constantization (pre-8.0 translation):
SELECT * FROM foo WHERE x IN (1, 2, 3);

-- Simple constantization with RECOMPILE on SQL Server (no plan cache bloat):
SELECT * FROM foo WHERE x IN (1, 2, 3) OPTION (RECOMPILE);

-- Inline collection with parameters:
SELECT * FROM foo WHERE x IN (@p1, @p2, @p3);

-- Inline collection with parameters, with bucketization (optimization for Contains only):
SELECT * FROM foo WHERE x IN (@p1, @p2, @p3, @p3, @p3);
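
To make the bucketization option more concrete, here is a rough Python sketch of how a client could pad the parameter list so that list lengths falling in the same bucket produce identical SQL text. This is not EF's implementation. The function name `bucketed_in_clause`, the bucket size of 5, and the choice to give each padding slot its own parameter name carrying a repeated last value are all assumptions made for illustration.

```python
def bucketed_in_clause(column, values, bucket_size=5):
    """Build an IN (...) predicate padded up to a multiple of bucket_size.

    Hypothetical helper: padding repeats the last value, which does not
    change the result of an IN/Contains check.
    """
    if not values:
        raise ValueError("values must be non-empty")
    if bucket_size < 1:
        raise ValueError("bucket_size must be at least 1")

    # Round the parameter count up to the next multiple of bucket_size.
    padded_len = -(-len(values) // bucket_size) * bucket_size

    # Repeat the last value to fill the remaining slots.
    padded = list(values) + [values[-1]] * (padded_len - len(values))

    # Parameter names depend only on padded_len, so every list length
    # in the same bucket yields the same SQL string.
    placeholders = ", ".join(f"@p{i}" for i in range(1, padded_len + 1))
    return f"{column} IN ({placeholders})", padded


sql, params = bucketed_in_clause("x", [1, 2, 3])
print(sql)
print(params)
```

Under these assumptions, lists of 1 to 5 values share one SQL text, lists of 6 to 10 share another, and so on, which bounds the number of distinct cached plans at the cost of sending a few redundant parameters.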
@roji roji added this to the Backlog milestone Aug 3, 2024
@roji roji changed the title Consider changing the default handilng of parameterized collections back to constantization (or inline collection of parameters) Consider changing the default handling of parameterized collections back to constantization (or inline collection of parameters) Aug 3, 2024
@roji roji changed the title Consider changing the default handling of parameterized collections back to constantization (or inline collection of parameters) Consider changing the default handling of parameterized collections (back to constantization, RECOMPILE, inline collection of parameters...) Aug 4, 2024
@IanKemp

IanKemp commented Aug 5, 2024

> In addition, the main problem with constantization is caused by SQL Server's behavior to cache query plans on the very first execution.

Is this not mitigated via Parameter Sensitive Plan optimization?

@roji
Member Author

roji commented Aug 5, 2024

@IanKemp how so? Constantization (which you're quoting) means you don't have parameters at all in the query, only constants; so different collection values mean different SQLs, and therefore different plans (so plan bloat).
