IN() list queries are not parameterized, causing increased SQL Server CPU usage #13617

NickCraver · 2018-10-13T12:36:43Z

Currently, EF Core is not parameterizing IN(...) queries created from .Contains() (and maybe other cases). This has a very detrimental impact on SQL Server itself because:

The query hash is different for each distinct set of IDs passed in.
SQL Server needs to calculate a plan for each of these distinct combinations (CPU hit).
The query plans created occupy an entry in SQL Server's plan cache (unless ad-hoc behavior is on), fighting for resources on other plans and cascading CPU issues by re-calculations due to evictions.

Note: when SQL Server has memory pressure, plan cache is the first thing to empty. So this has a profound impact at scale, doubly so when things have gone sideways.

Steps to reproduce

Here's a reduced down version of the problem:

var ids = Enumerable.Range(1, 1000).ToList();
var throwAwary = context.Users.Where(u => ids.Contains(u.Id)).ToList();

This results in the following:

SELECT [u].[Id], [u].[AcceptRateAccepted], [u].[AcceptRateAsked], [u].[AccountId], [u].[AnswerCount], [u].[BronzeBadges], [u].[CreationDate], [u].[DaysVisitedConsecutive], [u].[DaysVisitedTotal], [u].[DisplayName], [u].[Email], [u].[Flags], [u].[GoldBadges], [u].[HasAboutMeExcerpt], [u].[HasReplies], [u].[IsVeteran], [u].[JobSearchStatus], [u].[LastAccessDate], [u].[LastDailySiteAccessDate], [u].[LastEmailDate], [u].[LastLoginDate], [u].[LastLoginIP], [u].[LastModifiedDate], [u].[Location], [u].[OptInEmail], [u].[PreferencesRaw], [u].[ProfileImageUrl], [u].[QuestionCount], [u].[RealName], [u].[Reputation], [u].[ReputationMonth], [u].[ReputationQuarter], [u].[ReputationSinceLastCheck], [u].[ReputationToday], [u].[ReputationWeek], [u].[ReputationYear], [u].[SignupStarted], [u].[SilverBadges], [u].[TeamId], [u].[TeamName], [u].[TimedPenaltyDate], [u].[Title], [u].[UserTypeId], [u].[WebsiteUrl]
FROM [Users] AS [u]
WHERE [u].[Id] IN (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246, 247, 248, 249, 250, 251, 252, 253, 254, 255, 256, 257, 258, 259, 260, 261, 262, 263, 264, 265, 266, 267, 268, 269, 270, 271, 272, 273, 274, 275, 276, 277, 278, 279, 280, 281, 282, 283, 284, 285, 286, 287, 288, 289, 290, 291, 292, 293, 294, 295, 296, 297, 298, 299, 300, 301, 302, 303, 304, 305, 306, 307, 308, 309, 310, 311, 312, 313, 314, 315, 316, 317, 318, 319, 320, 321, 322, 323, 324, 325, 326, 327, 328, 329, 330, 331, 332, 333, 334, 335, 336, 337, 338, 339, 340, 341, 342, 343, 344, 345, 346, 347, 348, 349, 350, 351, 352, 353, 354, 355, 356, 357, 358, 359, 360, 361, 362, 363, 364, 365, 366, 367, 368, 369, 370, 371, 372, 373, 374, 375, 376, 377, 378, 379, 380, 381, 382, 383, 384, 385, 386, 387, 388, 389, 390, 391, 392, 393, 394, 395, 396, 397, 398, 399, 400, 401, 402, 403, 404, 405, 406, 407, 408, 409, 410, 411, 412, 413, 414, 415, 416, 417, 418, 419, 420, 421, 422, 423, 424, 425, 426, 427, 428, 429, 430, 431, 432, 433, 434, 435, 436, 437, 438, 439, 440, 441, 442, 443, 444, 445, 446, 447, 448, 449, 450, 451, 452, 453, 454, 455, 456, 457, 458, 459, 460, 461, 462, 463, 464, 465, 466, 467, 468, 469, 470, 471, 472, 473, 474, 475, 476, 477, 478, 479, 480, 481, 482, 483, 484, 485, 486, 487, 488, 489, 490, 491, 492, 493, 494, 495, 496, 497, 498, 499, 500, 501, 502, 503, 504, 505, 506, 507, 508, 509, 510, 511, 512, 513, 514, 515, 516, 517, 518, 519, 520, 521, 522, 523, 524, 525, 526, 527, 528, 529, 530, 531, 532, 533, 534, 535, 536, 537, 538, 539, 540, 541, 542, 543, 544, 545, 546, 547, 548, 549, 550, 551, 552, 553, 554, 555, 556, 557, 558, 559, 560, 561, 562, 563, 564, 565, 566, 567, 568, 569, 570, 571, 572, 573, 574, 575, 576, 577, 578, 579, 580, 581, 582, 583, 584, 585, 586, 587, 588, 589, 590, 591, 592, 593, 594, 595, 596, 597, 598, 599, 600, 601, 602, 603, 604, 605, 606, 607, 608, 609, 610, 611, 612, 613, 614, 615, 616, 617, 618, 619, 620, 621, 622, 623, 624, 625, 626, 627, 628, 629, 630, 631, 632, 633, 634, 635, 636, 637, 638, 639, 640, 641, 642, 643, 644, 645, 646, 647, 648, 649, 650, 651, 652, 653, 654, 655, 656, 657, 658, 659, 660, 661, 662, 663, 664, 665, 666, 667, 668, 669, 670, 671, 672, 673, 674, 675, 676, 677, 678, 679, 680, 681, 682, 683, 684, 685, 686, 687, 688, 689, 690, 691, 692, 693, 694, 695, 696, 697, 698, 699, 700, 701, 702, 703, 704, 705, 706, 707, 708, 709, 710, 711, 712, 713, 714, 715, 716, 717, 718, 719, 720, 721, 722, 723, 724, 725, 726, 727, 728, 729, 730, 731, 732, 733, 734, 735, 736, 737, 738, 739, 740, 741, 742, 743, 744, 745, 746, 747, 748, 749, 750, 751, 752, 753, 754, 755, 756, 757, 758, 759, 760, 761, 762, 763, 764, 765, 766, 767, 768, 769, 770, 771, 772, 773, 774, 775, 776, 777, 778, 779, 780, 781, 782, 783, 784, 785, 786, 787, 788, 789, 790, 791, 792, 793, 794, 795, 796, 797, 798, 799, 800, 801, 802, 803, 804, 805, 806, 807, 808, 809, 810, 811, 812, 813, 814, 815, 816, 817, 818, 819, 820, 821, 822, 823, 824, 825, 826, 827, 828, 829, 830, 831, 832, 833, 834, 835, 836, 837, 838, 839, 840, 841, 842, 843, 844, 845, 846, 847, 848, 849, 850, 851, 852, 853, 854, 855, 856, 857, 858, 859, 860, 861, 862, 863, 864, 865, 866, 867, 868, 869, 870, 871, 872, 873, 874, 875, 876, 877, 878, 879, 880, 881, 882, 883, 884, 885, 886, 887, 888, 889, 890, 891, 892, 893, 894, 895, 896, 897, 898, 899, 900, 901, 902, 903, 904, 905, 906, 907, 908, 909, 910, 911, 912, 913, 914, 915, 916, 917, 918, 919, 920, 921, 922, 923, 924, 925, 926, 927, 928, 929, 930, 931, 932, 933, 934, 935, 936, 937, 938, 939, 940, 941, 942, 943, 944, 945, 946, 947, 948, 949, 950, 951, 952, 953, 954, 955, 956, 957, 958, 959, 960, 961, 962, 963, 964, 965, 966, 967, 968, 969, 970, 971, 972, 973, 974, 975, 976, 977, 978, 979, 980, 981, 982, 983, 984, 985, 986, 987, 988, 989, 990, 991, 992, 993, 994, 995, 996, 997, 998, 999, 1000);

Further technical details

EF Core version: 2.1.4
Database Provider: Microsoft.EntityFrameworkCore.SqlServer
Operating system: Windows 2016/Windows 10
IDE: Visual Studio 2017 15.9

Proposal

EF Core should parameterize here. Instead of IN (1, 2, 3, ...) we should see IN (@__ids1, @__ids2, @__ids3, ...) or similar. This would allow query plan cache to be shared. For example if we ran this 1,000 times to fetch 1,000,000 users in batches, we'd have 1 plan in cache, whereas today we have 1,000 plans. Let's say a user gets removed (or added!) on page 2 of 1,000...today we'd calculate and cache another 999 plans next run.

To further address the cardinality problem, an approach similar to what Dapper does would be at least a starting point. We generate only lists of certain sizes, let's just say 5, 10, 50, 100, 500, 1000. Here's an example with 3 parameters:

SELECT [u].[Id], [u].[AcceptRateAccepted], [u].[AcceptRateAsked], [u].[AccountId], [u].[AnswerCount], [u].[BronzeBadges], [u].[CreationDate], [u].[DaysVisitedConsecutive], [u].[DaysVisitedTotal], [u].[DisplayName], [u].[Email], [u].[Flags], [u].[GoldBadges], [u].[HasAboutMeExcerpt], [u].[HasReplies], [u].[IsVeteran], [u].[JobSearchStatus], [u].[LastAccessDate], [u].[LastDailySiteAccessDate], [u].[LastEmailDate], [u].[LastLoginDate], [u].[LastLoginIP], [u].[LastModifiedDate], [u].[Location], [u].[OptInEmail], [u].[PreferencesRaw], [u].[ProfileImageUrl], [u].[QuestionCount], [u].[RealName], [u].[Reputation], [u].[ReputationMonth], [u].[ReputationQuarter], [u].[ReputationSinceLastCheck], [u].[ReputationToday], [u].[ReputationWeek], [u].[ReputationYear], [u].[SignupStarted], [u].[SilverBadges], [u].[TeamId], [u].[TeamName], [u].[TimedPenaltyDate], [u].[Title], [u].[UserTypeId], [u].[WebsiteUrl]
FROM [Users] AS [u]
WHERE [u].[Id] IN (@ids1, @ids2, @ids3, @ids4, @ids5);

The 5 Ids are so that 1-5 values all use the same plan. Anything < n in length repeats the last value (don't use null!). In this case let's say our values are 1, 2, and 3. Our parameterization would be:

@ids1 = 1;
@ids2 = 2;
@ids3 = 3;
@ids4 = 3;
@ids5 = 3;

This fetches the users we want, is friendly to the cache, and lessens the amount of generated permutations for all layers.

To put this in perspective, at Stack Overflow scale we're generating millions of one-time-use plans needlessly in EF Core (the parameterization in Linq2Sql lowered this to only cardinality permutations). We alleviate the cache issue by enabling ad-hoc query mode on our SQL Servers, but that doesn't lessen the CPU use from all the query hashes involved except in the (very rare) reuse case of the exact same list.

This problem is dangerous because it's also hard to see. If you're looking at the plan cache, you're not going to see it by any sort of impact query analysis. Each query, hash, and plan is different. There's no sane way to group them. It's death by a thousand cuts you can't see. I'm currently finding and killing as many of these in our app as I can, but we should fix this at the source for everyone.

The text was updated successfully, but these errors were encountered:

ErikEJ · 2018-10-13T16:19:00Z

Related issue: #12777

kroymann · 2018-10-13T17:32:38Z

I wrote an extension to EF6 to handle this exact problem for our product and apparently it is extremely similar to what Nick is proposing above. By replacing this where clause: .Where(entity => myEnumerable.Contains(entity.prop)) with this extension method: .BatchedWhereKeyInValues(entity => entity.prop, myEnumerables, batchSize: 100), my solution will break the list of values into batches, and then produce a properly parameterized query that generates the desired SQL.

For example, this expression:
query.BatchedWhereKeyInValues(q => q.Id, values: {1,2,5,7,8,9}, batchSize: 3)

would effectively become the following two queryables, where the numbers are supplied as variables that allow EF to parameterize the query rather than embed them as constants:
query.Where(q => q.Id == 1 || q.Id == 2 || q.Id == 5)
query.Where(q => q.Id == 7 || q.Id == 8 || q.Id == 9)

EF then converts those LINQ expressions into this SQL expression:
WHERE Id IN (@p__linq__0, @p__linq__1, @p__linq__2)

Check out the source here: https://gist.github.com/kroymann/e57b3b4f30e6056a3465dbf118e5f13d

StasPerekrestov · 2018-10-13T20:02:00Z

@kroymann Works like a charm! Big Thanks!

public static class QueryableExtension<TQuery>
    {
        internal static IEnumerable<IQueryable<TQuery>> WhereIn<TKey>(IQueryable<TQuery> queryable,
            Expression<Func<TQuery, TKey>> keySelector, IEnumerable<TKey> values, int batchSize)
        {
            List<TKey> distinctValues = values.Distinct().ToList();
            int lastBatchSize = distinctValues.Count % batchSize;
            if (lastBatchSize != 0)
            {
                distinctValues.AddRange(Enumerable.Repeat(distinctValues.Last(), batchSize - lastBatchSize));
            }

            int count = distinctValues.Count;
            for (int i = 0; i < count; i += batchSize)
            {
                var body = distinctValues
                    .Skip(i)
                    .Take(batchSize)
                    .Select(v =>
                    {
                        // Create an expression that captures the variable so EF can turn this into a parameterized SQL query
                        Expression<Func<TKey>> valueAsExpression = () => v;
                        return Expression.Equal(keySelector.Body, valueAsExpression.Body);
                    })
                    .Aggregate((a, b) => Expression.OrElse(a, b));
                if (body == null)
                {
                    yield break;
                }

                var whereClause = Expression.Lambda<Func<TQuery, bool>>(body, keySelector.Parameters);
                yield return queryable.Where(whereClause);
            }
        }

// doesn't use batching
        internal static IQueryable<TQuery> WhereIn<TKey>(IQueryable<TQuery> queryable,
            Expression<Func<TQuery, TKey>> keySelector, IEnumerable<TKey> values)
        {
            TKey[] distinctValues = values.Distinct().ToArray();


            int count = distinctValues.Length;
            for (int i = 0; i < count; ++i)
            {
                var body = distinctValues
                    .Select(v =>
                    {
                        // Create an expression that captures the variable so EF can turn this into a parameterized SQL query
                        Expression<Func<TKey>> valueAsExpression = () => v;
                        return Expression.Equal(keySelector.Body, valueAsExpression.Body);
                    })
                    .Aggregate((a, b) => Expression.OrElse(a, b));

                var whereClause = Expression.Lambda<Func<TQuery, bool>>(body, keySelector.Parameters);
                return queryable.Where(whereClause);
            }

            return Enumerable.Empty<TQuery>().AsQueryable();
        }
    }

Usage:

int[] a = Enumerable.Range(1, 10).ToArray();
var queries = QueryableExtension<User>.WhereIn(dbContext.Users, t => t.Id, a, 5);
foreach (var queryable in queries)
{
    _ = queryable.ToArray();
}

var everything = QueryableExtension<User>.WhereIn(dbContext.Users, t => t.Id, a);
_ = everything.ToArray();

SonicGD · 2018-11-02T03:23:38Z

Also such queries don't aggregate well in logging systems like Application Insights, cause every query has unique sql.

divega · 2018-12-15T00:08:56Z

I just wanted to share here some lexperiments I did some time ago with different alternative approaches to Enumerable.Contains using existing EF Core APIs:

https://github.com/divega/ContainsOptimization/

The goal was both to find possible workarounds, and to explore how we could deal with Contains in the future.

I timed the initial construction and first execution of the query with different approaches. This isn't a representative benchmark because there is no data and it doesn't measure the impact of caching, but I think it is still interesting to see the perf sensitivity to the number of parameters.

The code tests 4 approaches:

Test("standard Contains", context.People.Where(p => (ids.Contains(p.Id))));

Test("Parameter rewrite", context.People.In(ids, p => p.Id));

Test("Split function", 
    context.People.FromSql(
        $"select * from People p where p.Id in(select value from string_split({string.Join(",", ids)}, ','))"));

Test("table-valued parameter", 
    context.People.FromSql(
        $"select * from People p where p.Id in(select * from {context.CreateTableValuedParameter(ids, "@p0")})"));

Standard Contains

This is included as a baseline and is what happens today when you call Contains with a list: the elements of the list get in-lined as constants in the SQL as literals and the SQL cannot be reused.

Parameter rewrite

This approaches tries to rewrite the expression to mimic what the compiler would produce for a query like this:

var p1 = 1;
var p2 = 2; 
var p3 = 3;
context.People.Where(p => (new List<int> {p1, p2, p3}.Contains(p.Id))).ToList();

It also "bucketizes" the list, meaning that it only produces lists of specific lengths (powers of 2), repeating the last element as necessary, with the goal of favoring caching.

This approach only works within the limits of 2100 parameters in SQL Server, and sp_executesql takes a couple of parameters, so the size of the last possible bucket is 2098.

Overall this seems to be the most expensive approach using EF Core, at least on initial creation and execution.

Split function

This approach was mentioned by @NickCraver and @mgravell as something that Dapper can leverage. I implemented it using FromSql. It is interesting because it leads to just one (potentially very long) string parameter and it seems to perform very well (at least in the first query), but the STRING_SPLIT function
is only available in SQL Server since version 2016.

Table-valued parameter

For this I took advantage of the ability to create table-valued parameters that contain an IEnumerable<DbDataRecord> (I used a list rather than a streaming IEnumerable, but not sure that matters in this case) and pass them as a DbParameter in FromSql. I also added a hack to define the table type for the parameter if it isn't defined. Overall this seems to be the second most lightweight approach after the split function.

ErikEJ · 2018-12-16T13:16:21Z

@divega Just stumbled upon this: https://stackoverflow.com/questions/25228362/how-to-avoid-query-plan-re-compilation-when-using-ienumerable-contains-in-entity

ErikEJ · 2018-12-17T08:06:12Z

And this: https://vladmihalcea.com/improve-statement-caching-efficiency-in-clause-parameter-padding/

IGx89 · 2019-05-10T19:09:30Z

Here's an alternative solution I discovered thanks to a semi-related SO post by @davidbaxterbrowne:

var records = await _context.Items
	.AsNoTracking()
	.FromSql(
		"SELECT * FROM dbo.Items WHERE Id IN (SELECT value FROM OPENJSON({0}))",
		JsonConvert.SerializeObject(itemIds.ToList()))
	.ToListAsync();

It's specific to SQL Server 2017+ (or Azure SQL Server), and in my testing with a list of 50,000 IDs against a remote Azure SQL database it was 5x faster than a standard batching solution (dividing into lists of 1,000 IDs).

Part of dotnet#13617

michaelmesser · 2023-07-06T20:42:17Z

@roji Have you had a chance to revisit the WithUnorderedPrefetch issue? As far as I understand the issue is caused by OPENJSON having a low cardinality estimate. I do not think EXISTS vs IN would likely affect it.

roji · 2023-07-06T23:24:31Z

@michaelmesser I haven't, and unfortunately there's little chance I'd be able to for EF Core 8.0 as there are many other highly-prioritized work items.

Can you please open a new issue for this, so that it's tracked? It would be most helpful if you could attach the full repro given the latest SQL currently generated by EF (best to use a daily build). If there's a very clear, reproducible improvement given the latest SQL and it's not too hard, I may try to slip it in...

clement911 · 2023-10-24T01:19:54Z

Interesting discussion.

It would be interesting to see benchmark results with the upcoming JSON native type which "will allow you to store JSON documents in a native binary format that is optimized for storage and query performance".
It's probably more relevant for column data type, but one can also declare JSON parameters/variables.

DECLARE @json JSON = N'
[
    {
        "OrderNumber": "S043659",
        "Date":"2022-05-24T08:01:00",
        "AccountNumber":"AW29825",
        "Price":59.99,
        "Quantity":1
    },
    {
        "OrderNumber": "S043661",
        "Date":"2022-05-20T12:20:00",
        "AccountNumber":"AW73565",
        "Price":24.99,
        "Quantity":3
    }
]';

Although I'm not sure if it's possible to use this type via c# code.

roji · 2023-10-24T09:29:36Z

@clement911 note that the above IN/Contains discussion is likely somewhat orthogonal to the new JSON native type, since with IN/Contains there's no column containing JSON data - only a parameter (though it's possible parameter transfer may be affected as well).

But it absolutely could affect cases where on searches for a value inside a JSON column (rather than a parameter). The continuing improvement of JSON support across databases (e.g. this new SQL Server JSON type) was one of the factors that led us to go in this direction here.

Regardless, I've opened #32150 to track EF support for the new JSON native type, when it's available.

drdamour · 2024-01-05T06:24:25Z

does this now leveraging OPENJSON allow us to leverage Composite key in Contains? eg:

var compositeKeys = new(){ 
  (1,2,3),
  (4,5,6)
 }
dbset.Where(
  i => compositeKeys.Contains(  (i.X, i.Y, i.Z) )
 )

roji · 2024-01-05T07:10:59Z

@drdamour unfortunately not - the support in 8.0 is for primitive collections only, not tuples. What you're asking for is covered by #11799.

yv989c · 2024-01-06T11:02:29Z

@drdamour, you can give https://github.com/yv989c/BlazarTech.QueryableValues a try. With it, you can compose in-memory collections of complex objects in your EF queries.

An example can be seen here:
https://github.com/yv989c/BlazarTech.QueryableValues?tab=readme-ov-file#complex-type-example

drdamour · 2024-01-06T18:23:32Z

@drdamour, you can give https://github.com/yv989c/BlazarTech.QueryableValues a try. With it, you can compose in-memory collections of complex objects in your EF queries.

An example can be seen here: https://github.com/yv989c/BlazarTech.QueryableValues?tab=readme-ov-file#complex-type-example

yeah thats what i use today...was hoping it made its way to ef core :(

ibrahim33 · 2024-01-10T07:58:11Z

Hello everyone

Ich just tested the feature and found out that OPENJSON ist not used when the collection is a HashSet. I am using EF Core Version 8.0.1.

LINQ with HashSet:

var productSets= await dbContext.ProductSets()
            .Where(ps => ids.ToHashSet().Contains(ps.ProductId))
            .ToListAsync(cancellationToken)

QUERY with HashSet:

SELECT [p].[Id], [p].[ProductId], [p].[ReactivateIfSubProductsReappear], [p].[Version], [p0].[Id], [p0].[IsMainProduct], [p0].[ProductId], [p0].[ProductSetId], [p0].[Quantity]
FROM [externaldata].[ProductSet] AS [p]
LEFT JOIN [externaldata].[ProductSetProduct] AS [p0] ON [p].[Id] = [p0].[ProductSetId]
WHERE [p].[ProductId] IN (CAST(213171 AS bigint), CAST(234107 AS bigint), CAST(254039 AS bigint), CAST(252018 AS bigint), CAST(257030 AS bigint), CAST(282759 AS bigint), CAST(290215 AS bigint), CAST(352284 AS bigint), CAST(354318 AS bigint), CAST(354179 AS bigint))
ORDER BY [p].[Id]

LINQ with List:

var productSets= await dbContext.ProductSets()
            .Where(ps => ids.ToList().Contains(ps.ProductId))
            .ToListAsync(cancellationToken)

QUERY with List:

SELECT [p].[Id], [p].[ProductId], [p].[ReactivateIfSubProductsReappear], [p].[Version], [p0].[Id], [p0].[IsMainProduct], [p0].[ProductId], [p0].[ProductSetId], [p0].[Quantity]
FROM [externaldata].[ProductSet] AS [p]
LEFT JOIN [externaldata].[ProductSetProduct] AS [p0] ON [p].[Id] = [p0].[ProductSetId]
WHERE [p].[ProductId] IN (
    SELECT [i].[value]
    FROM OPENJSON(@__ids_0) WITH ([value] bigint '$') AS [i]
)
ORDER BY [p].[Id]

Would it be possible to add support for HashSets?

ibrahim33 · 2024-01-11T09:34:11Z

For everyone interessted, the issue can be tracked here #32365

mscirri · 2024-09-05T15:04:54Z

I wrote an extension to EF6 to handle this exact problem for our product and apparently it is extremely similar to what Nick is proposing above. By replacing this where clause: .Where(entity => myEnumerable.Contains(entity.prop)) with this extension method: .BatchedWhereKeyInValues(entity => entity.prop, myEnumerables, batchSize: 100), my solution will break the list of values into batches, and then produce a properly parameterized query that generates the desired SQL.

For example, this expression: query.BatchedWhereKeyInValues(q => q.Id, values: {1,2,5,7,8,9}, batchSize: 3)

would effectively become the following two queryables, where the numbers are supplied as variables that allow EF to parameterize the query rather than embed them as constants: query.Where(q => q.Id == 1 || q.Id == 2 || q.Id == 5) query.Where(q => q.Id == 7 || q.Id == 8 || q.Id == 9)

EF then converts those LINQ expressions into this SQL expression: WHERE Id IN (@p__linq__0, @p__linq__1, @p__linq__2)

Check out the source here: https://gist.github.com/kroymann/e57b3b4f30e6056a3465dbf118e5f13d

Is this extension still good to go?

xt0rted mentioned this issue Oct 14, 2018

Where clause with boolean check isn't always parameterized as expected #12956

Closed

ajcvickers added the customer-reported label Oct 15, 2018

ajcvickers added this to the 3.0.0 milestone Oct 15, 2018

ajcvickers added the type-enhancement label Oct 15, 2018

ajcvickers modified the milestones: 3.0.0, Backlog Oct 15, 2018

ajcvickers assigned smitpatel Oct 15, 2018

ajcvickers mentioned this issue Oct 15, 2018

Query: SelectExpression Cache hit for InExpression coming from parameter #12777

Closed

smitpatel added the propose-punt label Oct 15, 2018

divega mentioned this issue Oct 16, 2018

SQL Server: Possible off-by one in batch splitting logic #13639

Closed

ajcvickers modified the milestones: 3.0.0, Backlog Jan 23, 2019

ajcvickers removed the propose-punt label Jan 23, 2019

smitpatel removed their assignment Jan 29, 2019

ajcvickers mentioned this issue Mar 29, 2019

ToListAsync finished with exeptions #14737

Closed

jeradrose mentioned this issue May 3, 2019

Add support for STRING_SPLIT when using .Contains() #15593

Closed

divega mentioned this issue Jul 26, 2019

Query: Add caching of generated RelationalCommand based on nullability of parameters #15892

Closed

smitpatel mentioned this issue Sep 3, 2019

General QueryContext-based mechanism for caching of generated RelationalCommand #17598

Open

smitpatel mentioned this issue Sep 12, 2019

Feature Request: - For SqlServer generate join on table variable instead of generating WHERE IN clause when .Contains is used in the EF query #17790

Closed

punppis mentioned this issue Sep 14, 2019

Cached value sometimes invalid VahidN/EFSecondLevelCache.Core#52

Closed

roji added a commit to roji/efcore that referenced this issue Jun 5, 2023

Generate OPENJSON with WITH unless ordering is required

be1f02d

Part of dotnet#13617

roji added a commit to roji/efcore that referenced this issue Jun 5, 2023

Generate OPENJSON with WITH unless ordering is required

8c8cb6e

Part of dotnet#13617

This comment was marked as resolved.

Sign in to view

roji added a commit to roji/efcore that referenced this issue Jun 6, 2023

Generate OPENJSON with WITH unless ordering is required

a310f2a

Part of dotnet#13617

roji added a commit to roji/efcore that referenced this issue Jun 7, 2023

Generate OPENJSON with WITH unless ordering is required

9eb8e67

Part of dotnet#13617

yv989c mentioned this issue Jun 27, 2023

Possibility of using implicit version of OpenJson? yv989c/BlazarTech.QueryableValues#38

Closed

roji added the breaking-change label Jul 27, 2023

KLuuKer mentioned this issue Sep 22, 2023

Adding EF.Constant to prevent parameterization in query #31552

Closed

AndriySvyryd mentioned this issue Oct 18, 2023

Improve data access performance #24110

Open

9 tasks

roji mentioned this issue Oct 23, 2023

Dynamic OR predicate on multiple columns #32092

Closed

ajcvickers modified the milestones: 8.0.0-preview4, 8.0.0 Nov 14, 2023

stevendarby mentioned this issue Nov 23, 2023

Contains to OPENJSON translation regresses performance #32394

Closed

GertArnold mentioned this issue Feb 6, 2024

SqlServer Error 8623 The query processor ran out of internal resources and could not produce a query plan #11006

Closed

roji mentioned this issue Mar 28, 2024

For "Contains" queries of numeric values, just use the value in an "IN" statement instead of casting, converting, or using JSON. #33423

Closed

maliming mentioned this issue Apr 1, 2024

Error from TokenCleanupService.CleanAsync abpframework/abp#19460

Closed

roji mentioned this issue Aug 3, 2024

Introduce context options to control constantization/parameterization of parameterized collections #34344

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

IN() list queries are not parameterized, causing increased SQL Server CPU usage #13617

IN() list queries are not parameterized, causing increased SQL Server CPU usage #13617

NickCraver commented Oct 13, 2018

ErikEJ commented Oct 13, 2018

kroymann commented Oct 13, 2018 •

edited

Loading

StasPerekrestov commented Oct 13, 2018 •

edited

Loading

SonicGD commented Nov 2, 2018

divega commented Dec 15, 2018 •

edited

Loading

ErikEJ commented Dec 16, 2018

ErikEJ commented Dec 17, 2018

IGx89 commented May 10, 2019 •

edited

Loading

This comment was marked as resolved.

This comment was marked as resolved.

michaelmesser commented Jul 6, 2023

roji commented Jul 6, 2023

clement911 commented Oct 24, 2023

roji commented Oct 24, 2023

drdamour commented Jan 5, 2024 •

edited

Loading

roji commented Jan 5, 2024

yv989c commented Jan 6, 2024

drdamour commented Jan 6, 2024

ibrahim33 commented Jan 10, 2024 •

edited

Loading

ibrahim33 commented Jan 11, 2024

mscirri commented Sep 5, 2024

IN() list queries are not parameterized, causing increased SQL Server CPU usage #13617

IN() list queries are not parameterized, causing increased SQL Server CPU usage #13617

Comments

NickCraver commented Oct 13, 2018

Steps to reproduce

Further technical details

Proposal

ErikEJ commented Oct 13, 2018

kroymann commented Oct 13, 2018 • edited Loading

StasPerekrestov commented Oct 13, 2018 • edited Loading

SonicGD commented Nov 2, 2018

divega commented Dec 15, 2018 • edited Loading

Standard Contains

Parameter rewrite

Split function

Table-valued parameter

ErikEJ commented Dec 16, 2018

ErikEJ commented Dec 17, 2018

IGx89 commented May 10, 2019 • edited Loading

This comment was marked as resolved.

This comment was marked as resolved.

michaelmesser commented Jul 6, 2023

roji commented Jul 6, 2023

clement911 commented Oct 24, 2023

roji commented Oct 24, 2023

drdamour commented Jan 5, 2024 • edited Loading

roji commented Jan 5, 2024

yv989c commented Jan 6, 2024

drdamour commented Jan 6, 2024

ibrahim33 commented Jan 10, 2024 • edited Loading

ibrahim33 commented Jan 11, 2024

mscirri commented Sep 5, 2024

kroymann commented Oct 13, 2018 •

edited

Loading

StasPerekrestov commented Oct 13, 2018 •

edited

Loading

divega commented Dec 15, 2018 •

edited

Loading

IGx89 commented May 10, 2019 •

edited

Loading

drdamour commented Jan 5, 2024 •

edited

Loading

ibrahim33 commented Jan 10, 2024 •

edited

Loading