
IConnection.CreateModel() is significantly slower in 6.1.0. #860

Closed · makercob opened this issue Jun 9, 2020 · 20 comments

@makercob

makercob commented Jun 9, 2020

Updated this client from 6.0.0 to 6.1.0, and noticed that IConnection.CreateModel() becomes significantly slower.

Environments and preconditions:

  • .NET Core 3.1 (on Windows)
  • RabbitMQ 3.8.4
  • ConnectionFactory.DispatchConsumersAsync is set to true
  • WAN connection

Symptoms:
When creating an IModel, the first 2 or 3 invocations are fairly quick, each taking about 40ms (over the WAN connection); however, subsequent attempts become much slower, taking up to 1,000ms each.

Also, wrapping IConnection.CreateModel() in lock(_lockObject) { ... } only makes it slower.

@makercob makercob changed the title from "IConnection.CreateModel() is significantly slower in 6.2.0." to "IConnection.CreateModel() is significantly slower in 6.1.0." Jun 9, 2020
@michaelklishin
Member

Please share a benchmark snippet we can use to reproduce. We try to minimize guessing and implicit assumptions when reasoning about library behavior.
Opening a channel involves a network round trip. There can be any number of things involved.

Acquiring and releasing a lock around an operation will not make it more efficient.

@michaelklishin
Member

Ideally, a traffic capture should be taken and shared along with an executable way to run the benchmark. The latency of the method in question is highly dependent on the network.

@lukebakken
Contributor

lukebakken commented Jun 9, 2020

Please share a traffic capture and a complete set of code we can compile and run to observe this behavior. If we find a regression, I will re-open this issue.

@makercob
Author

Snippet as follows:

using System;
using System.Linq;
using System.Threading.Tasks;
using RabbitMQ.Client;

namespace ModelCreationTest {
    class Program {
        static void Main(string[] args) {
            // connects to RabbitMQ
            var factory = new ConnectionFactory {
                AutomaticRecoveryEnabled = true,
                TopologyRecoveryEnabled = true,
                HostName = "127.0.0.1",
                Port = 5672,
                UserName = "user",
                Password = "password",
                VirtualHost = "/",
                RequestedConnectionTimeout = new TimeSpan(0, 0, 2),
                UseBackgroundThreadsForIO = false,
                DispatchConsumersAsync = true
            };

            var connection = factory.CreateConnection();

            // creates channels
            Console.WriteLine("Creating channels...");
            var totalTicksCounter = Environment.TickCount;

            /*Creating channels synchronously works fine with 6.0.0 and 6.1.0.
             *Creating channels asynchronously is very slow with 6.1.0. */
            var tasks = Enumerable.Range(0, 20).Select(
                i => Task.Factory.StartNew(
                    obj => {
                        var ticksCounter = Environment.TickCount;
                        var channel = connection.CreateModel();
                        var cost = Environment.TickCount - ticksCounter;
                        Console.WriteLine($"ID:{(int)obj}, Cost:{cost:N0}ms.");
                    },
                    i));

            // waits for completion
            Task.WhenAll(tasks).Wait();
            var totalCost = Environment.TickCount - totalTicksCounter;
            Console.WriteLine($"Done, Total Cost:{totalCost:N0}ms.");

            Console.WriteLine("Press any key to exit...");
            Console.ReadKey();
        }
    }
}

Results with 6.0.0 as follows:
Screen Shot 2020-06-10 at 12 59 30

Results with 6.1.0 as follows:
Screen Shot 2020-06-10 at 13 01 52

@stebet
Contributor

stebet commented Jun 10, 2020

What is the core count on the machine you are using?

If you add ThreadPool.SetMinThreads(16, 16); to the start of your program, does that change anything?
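For reference, that would look something like this at the top of Main (the 16/16 values are just the ones mentioned above, not a recommendation):

using System.Threading;

static void Main(string[] args) {
    // Raise the worker/IOCP thread minimums before queueing any work so the
    // pool does not have to ramp up while the benchmark runs.
    ThreadPool.SetMinThreads(16, 16);

    // ... rest of the program as posted above ...
}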

@lukebakken lukebakken reopened this Jun 10, 2020
@lukebakken
Contributor

Re-opening since we received code to try to reproduce the issue.

@makercob
Author

@stebet

  1. What is the core count on the machine you are using?
    The core count is 4.

  2. If you add ThreadPool.SetMinThreads(16, 16); to the start of your program, does that change anything?
    When ThreadPool.SetMinThreads(16, 16); is added, a roughly 50% performance gain is observed; however, it is still much slower than with 6.0.0.

Screen Shot 2020-06-11 at 09 24 29

@lukebakken
Contributor

lukebakken commented Jun 12, 2020

@stebet @makercob this version consistently performs well using both 6.0.0 and 6.1.0:

https://github.com/lukebakken/rabbitmq-dotnet-client-860/blob/main/Program.cs

Main differences:

  • The thread pool minimum thread count is set higher than the number of channels created
  • LINQ is removed, since a plain for loop suffices. Removing Enumerable.Range also fixes the issue, so there must be an interaction between lazy evaluation of the range and the changes in 6.1.0. We can investigate this, but I don't consider it a show-stopper. (A rough sketch of that variant follows below.)
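
For reference, a rough sketch of that variant (this approximates the approach in the linked Program.cs, not its exact contents; connection settings are abbreviated):

using System;
using System.Threading;
using System.Threading.Tasks;
using RabbitMQ.Client;

namespace ModelCreationTest {
    class Program {
        static void Main(string[] args) {
            // Keep the pool minimum above the number of channels created below.
            ThreadPool.SetMinThreads(32, 32);

            var factory = new ConnectionFactory {
                HostName = "127.0.0.1",
                UserName = "user",
                Password = "password",
                DispatchConsumersAsync = true
            };
            var connection = factory.CreateConnection();

            // Plain for loop instead of Enumerable.Range + LINQ, so all tasks
            // are created and scheduled up front rather than lazily.
            var tasks = new Task[20];
            for (int i = 0; i < tasks.Length; i++) {
                int id = i;
                tasks[i] = Task.Run(() => {
                    var ticks = Environment.TickCount;
                    connection.CreateModel();
                    Console.WriteLine($"ID:{id}, Cost:{Environment.TickCount - ticks:N0}ms.");
                });
            }
            Task.WhenAll(tasks).Wait();
        }
    }
}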

@lukebakken lukebakken added this to the 6.1.1 milestone Jun 12, 2020
@lukebakken
Contributor

lukebakken commented Jun 12, 2020

Here's another interesting thing I found. If you apply #863 (which removes Task.Yield) then this code runs quickly:

https://github.com/lukebakken/rabbitmq-dotnet-client-860/blob/main/Program.cs

Note that the above adds ThreadPool.SetMinThreads but keeps the LINQ statement.

I hope that @stebet or @danielmarbach can explain this 😂

@stebet
Contributor

stebet commented Jun 12, 2020

Try calling .ToArray() at the end of the LINQ statement in the current code and see if that changes anything. It's possible that Task.WhenAll only pulls a few tasks from the enumeration at a time, so a new task isn't started until a previous one finishes, whereas ToArray evaluates the entire enumeration before Task.WhenAll is called, hopefully scheduling all the tasks immediately.
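
In terms of the snippet above, that would look roughly like this (sketch):

// Materialize the tasks up front with ToArray() so they are all scheduled
// immediately, instead of being produced lazily while Task.WhenAll enumerates.
var tasks = Enumerable.Range(0, 20).Select(
    i => Task.Factory.StartNew(
        obj => { /* create a channel and time it, as in the original snippet */ },
        i)).ToArray();

Task.WhenAll(tasks).Wait();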

@lukebakken
Contributor

lukebakken commented Jun 15, 2020

@stebet as you suspected, calling ToArray addresses the issue as well. You still have to call SetMinThreads, however. I assume we should add guidance to the docs now that we're using the thread pool.

I won't close this issue (#860) because it does demonstrate a problem that is resolved by #863.

@makercob please take the time to test the workarounds discussed in the recent comments. Thanks.

@danielmarbach
Collaborator

@lukebakken I doubt it is a good idea to add guidance around setting min threads. How would you ever come up with good guidance for users to pick appropriate values?

As a library, you can't really make assumptions about the threading behavior of the code that calls you. The problem I see here is that even though we have an IO-bound path, we can't really leverage async because the API surface is synchronous. Because the API is sync, anyone who calls it and wants to do so in parallel has to offload the calls explicitly to the worker thread pool. Explicit offloading to the worker thread pool increases the pressure on the pool and can cause it to ramp up its capacity (also known as hill climbing), which can cause significant stalls depending on the workload. On top of that, the library yielding internally increases the pressure even more. That's why you see an improvement when setting the min threads: the thread pool is likely already able to handle the workload demanded by the sample, and removing the yield causes less stress.
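
A hypothetical stand-alone demo of the ramp-up cost described above (not library code; the Thread.Sleep stands in for a blocking, network-bound call like CreateModel()):

using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

class ThreadPoolPressureDemo {
    static void Main() {
        // With the default minimums (typically the core count), these 20
        // blocking work items queue up behind a handful of threads and the
        // pool only ramps up gradually ("hill climbing"). Uncommenting
        // SetMinThreads lets them all run in parallel, and the batch then
        // completes in roughly one sleep interval.
        // ThreadPool.SetMinThreads(32, 32);

        var sw = Stopwatch.StartNew();
        var tasks = new Task[20];
        for (int i = 0; i < tasks.Length; i++) {
            int id = i;
            tasks[i] = Task.Run(() => {
                Thread.Sleep(50); // stand-in for a blocking network round trip
                Console.WriteLine($"Task {id} finished at {sw.ElapsedMilliseconds}ms");
            });
        }
        Task.WaitAll(tasks);
        Console.WriteLine($"Total: {sw.ElapsedMilliseconds}ms");
    }
}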

So removing the yield didn't make it into 6.1?

@lukebakken
Contributor

Thanks for the input.

There were enough caveats in your comment here that I didn't remove the yield call for 6.1.0. It can be done for 6.1.1 however.

@makercob
Author

makercob commented Jun 16, 2020

@lukebakken Here are my findings, on a 4-core machine:

LINQ is used, not a for loop.

with 6.1.0

  • Adding ThreadPool.SetMinThreads(32, 32) did bring some performance gain, with or without .ToArray().

Screen Shot 2020-06-16 at 11 28 40

with 6.0.0

  • Adding ThreadPool.SetMinThreads(32, 32) slows it down by about 10x, with or without .ToArray().

With ThreadPool.SetMinThreads(32, 32)
Screen Shot 2020-06-16 at 11 31 31

Without ThreadPool.SetMinThreads(32, 32)
Screen Shot 2020-06-16 at 11 31 52

@lukebakken
Contributor

@makercob please test the changes in #863 if you can. Thanks.

@lukebakken
Contributor

My tests show that this issue can be fixed via the code changes suggested in the comments as well as #863. I will produce 6.1.1-rc.1 today.

@lukebakken
Contributor

@makercob
Author

@lukebakken I updated to 6.1.1-rc.1; however, the issue still persists.

@michaelklishin
Member

@makercob this is open source software. You are welcome to profile your workload with both versions and share both results.

@lukebakken
Contributor

@makercob in my environment where I could reproduce the issue, the issue is fixed. "the issue still persists" does not give us actionable information. At the very least, provide an exact copy of the code you are running. Note that calling ThreadPool.SetMinThreads(A, B) is necessary.
