-
Notifications
You must be signed in to change notification settings - Fork 3.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add MinBatchSize option #10091
Add MinBatchSize option #10091
Conversation
Part of #9270
bc8e814
to
a22e406
Compare
I'm synchronizing some old EF Core changes to Npgsql, and came across this issue. I can see that MinBatchSize defaults to a hardcoded 4 in Concretely I don't have reason to believe it's not a good idea to start batching from 2 statements (and no time to actually benchmark). Any chance you move the constant 4 to some provider-specific place where it would be easier to change? Otherwise I have to set it in NpgsqlOptionsExtension which isn't very nice. |
* Switch to PoolableDbContext in various tests (dotnet/efcore#11311) * Test for MinBatchSize (dotnet/efcore#10091)
@roji Why isn't it nice to do in NpgsqlOptionsExtension? Where would you want to do it? |
My point was that I'd like to leave MinBatchSize null on NpgsqlOptionsExtension, to signify that the user hasn't specified it. If I set it to some value such as 2, we can no longer distinguish between a user-set value and a default value... Or maybe I misunderstood you suggestion... Regarding 2 vs 3 vs 4, I should benchmark. However, on the wire, sending two messages as a batch or as two separate messages involved the exact same protocol messages, the only difference is in whether you wait for the first statement to complete before sending the second. So there really is no reason to think that batching would be bad for performance for any batch size... I understand this works different for SQL Server, which is why I think it's better to manage the default separately in each provider. |
I just can't think of a scenario where it would matter that the value was set by the user. There is some overhead in EF when creating batches. |
For example, when logging we could check whether the value was set by the user, and only only log if so. It's just a theoretical idea that the distinction is important, if we really don't care I can set it.
Are you sure the slowdown you're seeing in the benchmarks comes from EF Core rather than the ADO.NET driver itself? Because in my experience the network round-trip you add by not batching dominates most things happening inside your application (especially as latency to your database server increases...) |
Yes, that's why this has to be configurable. For the default I think we should consider the co-located scenario (<10ms latency) |
I agree, but the point is that in PostgreSQL I don't think there's any real latency low enough where not batching would make sense, even against localhost. I know it's bad to talk about perf in theory, but I have no real environment here to benchmark. Is there any chance you guys can run a quick benchmark with varying batch sizes against PostgreSQL? If not I'll at least run a localhost benchmark. |
* Switch to PoolableDbContext in various tests (dotnet/efcore#11311) * Test for MinBatchSize (dotnet/efcore#10091)
* Switch to PoolableDbContext in various tests (dotnet/efcore#11311) * Test for MinBatchSize (dotnet/efcore#10091)
* Switch to PoolableDbContext in various tests (dotnet/efcore#11311) * Test for MinBatchSize (dotnet/efcore#10091)
* Switch to PoolableDbContext in various tests (dotnet/efcore#11311) * Test for MinBatchSize (dotnet/efcore#10091)
* Switch to PoolableDbContext in various tests (dotnet/efcore#11311) * Test for MinBatchSize (dotnet/efcore#10091)
* Switch to PoolableDbContext in various tests (dotnet/efcore#11311) * Test for MinBatchSize (dotnet/efcore#10091)
Part of #9270