Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure DNS changes are detected by client #3658

Closed
ealsur opened this issue Jan 17, 2023 · 16 comments · Fixed by #3762
Closed

Ensure DNS changes are detected by client #3658

ealsur opened this issue Jan 17, 2023 · 16 comments · Fixed by #3762
Assignees

Comments

@ealsur
Copy link
Member

ealsur commented Jan 17, 2023

Background: https://azure.status.microsoft/en-us/status/history/ Tracking ID: VN11-JD8

Reference: https://www.meziantou.net/avoid-dns-issues-with-httpclient-in-dotnet.htm

Azure SDK reference: Azure/azure-sdk-for-net#19457

Azure SDK has followed this article and added the default configuration to HttpClient on Azure/azure-sdk-for-net#19524

More code pointers:

We should evaluate if we want to follow this approach, the Azure SDK went through these changes so we can assume they are ok, but it's worth investigating.

There is the alternative that none of these APIs are available for NETStandard 2.0

@ealsur
Copy link
Member Author

ealsur commented Feb 16, 2023

Our SDK is a netstandard2.0 library, so we can only leverage the APIs that are available.

We use HttpClientHandler internally when creating the HttpClient:

public static HttpMessageHandler CreateHttpClientHandler(int gatewayModeMaxConnectionLimit, IWebProxy webProxy, Func<X509Certificate2, X509Chain, SslPolicyErrors, bool> serverCertificateCustomValidationCallback)
{
HttpClientHandler httpClientHandler = new HttpClientHandler();

HttpClient httpClient = new HttpClient(httpMessageHandler);

In netstandard2.0, the HttpClientHandler (which inherits from HttpMessageHandler) does not include the PooledConnectionLifetime property to be set.

When looking at the Azure SDK PR, they are multi-targetting Azure.Core and using compilation flags to set this option only when NETCOREAPP is the target and it uses the SocketHttpHandler that does not exist in netstandard2.0:

https://github.com/Azure/azure-sdk-for-net/blob/b983a7055f67d486679e5e66d2fc7d079b73cd93/sdk/core/Azure.Core/src/Pipeline/ServicePointHelpers.cs#L79-L90

#if NETCOREAPP
                case SocketsHttpHandler socketsHttpHandler:
                    if (socketsHttpHandler.MaxConnectionsPerServer == RuntimeDefaultConnectionLimit)
                    {
                        socketsHttpHandler.MaxConnectionsPerServer = IncreasedConnectionLimit;
                    }
                    if (socketsHttpHandler.PooledConnectionLifetime == DefaultConnectionLeaseTimeoutTimeSpan)
                    {
                        socketsHttpHandler.PooledConnectionLifetime = IncreasedConnectionLeaseTimeoutTimeSpan;
                    }
                    break;
#endif

However, because we expose a hook to HttpClientFactory customers can customize the HttpClient themselves, this means that if the customer application is on .NET 6 for example, they can do:

public void ConfigureServices(IServiceCollection services)
{
	SocketsHttpHandler socketsHttpHandler = new SocketsHttpHandler();
	socketsHttpHandler.PooledConnectionLifetime = TimeSpan.FromMinutes(10); // Customize this value based on desired DNS refresh timer
	services.AddSingleton<SocketsHttpHandler>(socketsHttpHandler);

	services.AddSingleton<CosmosClient>(serviceProvider =>
	{
		SocketsHttpHandler socketsHttpHandler = serviceProvider.GetRequiredService<SocketsHttpHandler>();
		CosmosClientOptions cosmosClientOptions = new CosmosClientOptions()
		{
                        // Pass your customized SocketHttpHandler to be used by the CosmosClient
                        // Make sure `disposeHandler` is `false`
			HttpClientFactory = () => new HttpClient(socketsHttpHandler, disposeHandler: false)
		};

		return new CosmosClient("<connection-string>", cosmosClientOptions);
	});
}

If they use Dependency Injection, or if they don't:

// Assuming this is all Singleton
SocketsHttpHandler socketsHttpHandler = new SocketsHttpHandler();
socketsHttpHandler.PooledConnectionLifetime = TimeSpan.FromMinutes(10); // Customize this value based on desired DNS refresh timer

CosmosClientOptions cosmosClientOptions = new CosmosClientOptions()
{
         // Pass your customized SocketHttpHandler to be used by the CosmosClient
        // Make sure `disposeHandler` is `false`
	HttpClientFactory = () => new HttpClient(socketsHttpHandler, disposeHandler: false)
};

return new CosmosClient("<connection-string>", cosmosClientOptions);

Reference: https://devblogs.microsoft.com/cosmosdb/httpclientfactory-cosmos-db-net-sdk/#sharing-httpclients

The goal of this code is that you create a SocketHttpHandler instance and you use that to provide an HttpClient to the SDK client that it will use instead of creating an HttpClient internally

Unless we multi-target this library to NET6/NETCOREAPP we cannot set these configurations internally.

@kirankumarkolli
Copy link
Member

Agreed on multi-targeting.

One hypothesis is that custom handler is only used in rare cases => default multi-targeting will benefit majority of CX.

@ealsur
Copy link
Member Author

ealsur commented Feb 16, 2023

If we multi-target, we can definitively do it internally. Yes, my assumption would be that customers rarely customize the HttpClientFactory

@kirankumarkolli
Copy link
Member

Few options (from offline conversation)

  • Reflection per platform: Security aspects to be explored
  • Account refresh: Isolated client and re-create every time (forcing new connection)
  • DNS monitoring and re-create HttpClient OnChange

@philipthomas-MSFT
Copy link
Contributor

reflection to try to do it we happen to be running on .NET6 - Kevin P.

@ealsur ealsur changed the title Investigate if it's possible to set PooledConnectionLifetime on default HttpClient Ensure DNS changes are detected by client Feb 21, 2023
@ealsur
Copy link
Member Author

ealsur commented Mar 3, 2023

Potential solutions

  1. Multi-target SDK to net6
  2. Create an independent HttpClient only for GlobalEndpointManager and re-create it each time the account information is fetched.
  3. Use Reflection to test if the runtime can create an instance of SocketsHttpHandler, if so, then use that to create the base HttpClient with PooledConnectionLifetime.

Solution 1

Pros Cons
Strongly tiped APIs when available Major time investment to validate functionality across, not a short term solution
We can leverage new APIs across other .NET types that can bring performance improvements

Solution 2

Pros Cons
Disposing/Recreating the HttpClient will force the DNS clearing Code complexity is elevated, requires un-wiring the single HttpClient we use for the whole SDK and maintain a new one inside GlobalEndpointManager
Existing strongly typed APIs (HttpClient.Dispose) When we eventually multi-target (Solution 1), it will require un-doing the work because it can be done through the APIs on SocketHttpHandler
Does not require extra validations on how the APIs behave on different frameworks Creates (even though they might be few) extra connections

Solution 3

Pros Cons
Focused code change on the creation of the single HttpClient instance Using Reflection is not trivial and requires try/catch statements to handle the case where the desired API is not available
Easier migration to multi-targeting than Solution 2 because it will require to just remove Reflection and use the existing APIs Requires validation on at least .NET Framework 4.8 and .NET 6 to see the behavior (.NET 6 would benefit, .NET 4.8 would not)
Would work only on implementations that have SocketHttpHandler

@NaluTripician
Copy link
Contributor

NaluTripician commented Mar 7, 2023

Findings on .NET Detection of DNS Changes

Here are my findings on the current behavior of .NET 6.0 and 4.8 in regard to how DNS changes are detected. For all the following test, first an instance of a HttpClient is created. Then the console app enters the following loop:

while (true)
{
    HttpRequestMessage request = new HttpRequestMessage
    {
        Method = HttpMethod.Get,
        RequestUri = new Uri("http://fakeURL.com")
    };

    try
    {
        HttpResponseMessage res = await client.SendAsync(request,
        HttpCompletionOption.ResponseHeadersRead);
        Console.WriteLine(res);
        res.Dispose();
    }
    catch (TaskCanceledException) 
    {
        Console.WriteLine("Task Canceled");
    }
    
    await Task.Delay(TimeSpan.FromMinutes(1));
}

While in the loop, changes are made to the hosts file on my local machine that will point the fake url to different IP addresses. It is also important to note that all of the HttpClient have a Timeout value set to 65 seconds to mirror the behavior of the SDK (the default is 100 seconds).

.NET 6.0 - Default Behavior

In .NET 6.0, changes to the IP address are only detected if a new Http Connection is made. The connection will be dropped after a timeout period if the connection is a keep-alive connection, or after ~15 seconds if the connection is not a keep-alive connection. The backend gateway uses a keep-alive connection.

This is the expected behavior based on the customer reported issues.

.NET 4.8 - Default Behavior

dotnet-counters does not work for this version so the behavior with creating new Http Connection is a guess based off of the .NET 6.0 behavior.

.NET 4.8 behaves the same as .NET 6.0.

Using Reflection to create an Instance of SocketsHttpHandler

To test to see if using reflection is a viable option, I created a separate library that both the .NET 4.8 and 6.0 console apps call to create the HttpClient. Here is what that library looks like:

public static (HttpClient, string) CreateHttpClient()
{
    try
    {
        Type socketHandlerType = Type.GetType("System.Net.Http.SocketsHttpHandler, 
            System.Net.Http");

        object socketHttpHandler = Activator.CreateInstance(socketHandlerType);

        PropertyInfo pooledConnectionLifetimeInfo = socketHandlerType.GetProperty("PooledConnectionLifetime");
        pooledConnectionLifetimeInfo.SetValue(socketHttpHandler, TimeSpan.FromMinutes(10), null);

        return (new HttpClient((HttpMessageHandler)socketHttpHandler, disposeHandler: false), ".NET 6.0");
    }
    catch
    {
        Console.WriteLine("\n\n\nFailed to make SocketHttpHandler ");
    }    
    return (new HttpClient(), ".NET 4.8 or Other");
}

.NET 6.0 Behavior

The IP address changes are detected after the given TimeSpan. This means that using a SocketsHttpHandler is giving us the desired outcome.

.NET 4.8 Behavior

Behaves the same as the default .NET 4.8 as expected. When using the created library, it attempts to create a SocketsHttpHandler, fails (as SocketsHttpHandler does not exist in .NET 4.8), then return a normal HttpHandler.

Unanswered questions and Conclusions.

What should the TimeSpan value be if this solution is used in the SDK? See earlier comments in the .NET 6.0 Behavior with Reflection Section.

@Pilchie
Copy link
Member

Pilchie commented Mar 7, 2023

For the second scenario, if after the first SendAsync call you change the IP address in the hosts file, the change will be detected by the client, as it is opening a new connection.

Why is it opening a new connection for the second request, but not any subsequent requests?

@Pilchie
Copy link
Member

Pilchie commented Mar 7, 2023

@karelz - Wondering if you can help us out with some DNS related queries. Part of Azure Cosmos DB's strategy for moving to different regions involves updating DNS, but we recently observed some cases where our clients didn't pick up the changes, and we're trying to make the SDK work better in these sort of cases.

It's complicated by the fact that we're a .NET Standard 2.0 library, and still have plenty of use on .NET Framework.

@ealsur
Copy link
Member Author

ealsur commented Mar 7, 2023

For the second scenario, if after the first SendAsync call you change the IP address in the hosts file, the change will be detected by the client, as it is opening a new connection.

Why is it opening a new connection for the second request, but not any subsequent requests?

Good question, @NaluTripician can you confirm this? With the default HttpClient, iIf the Timeout is 65 seconds and requests are happening faster, if you change the hosts file, is it picking up the changes or is it not? My understanding was that it was not, but the sentence seems to imply the contrary?

@NaluTripician
Copy link
Contributor

For the second scenario, if after the first SendAsync call you change the IP address in the hosts file, the change will be detected by the client, as it is opening a new connection.

Why is it opening a new connection for the second request, but not any subsequent requests?

Good question, @NaluTripician can you confirm this? With the default HttpClient, iIf the Timeout is 65 seconds and requests are happening faster, if you change the hosts file, is it picking up the changes or is it not? My understanding was that it was not, but the sentence seems to imply the contrary?

I think I might know what’s going on here, there is a 10 second delay I added before the loop in the test to give me time to start the dot-net counters which is longer than the Timeout value. I will update the original comment where this is not an issue.

@NaluTripician
Copy link
Contributor

After some more testing I do not think that this is the issue. I am still seeing the connection drop after the first send, even when setting the PooledConnectionLifetime / Timeout values to large amounts of time.

@NaluTripician
Copy link
Contributor

Another possible reason that the HTTP connections were being dropped in the test was that the target IP address used had no keepalive,

@NaluTripician
Copy link
Contributor

Updated original comment with results from fix to tests.

@karelz
Copy link

karelz commented Mar 21, 2023

@Pilchie sure, we can try. However note that DNS is done by OS on both .NET Framework and .NET Core. We as .NET do not interact with it, we just use it and leave all policy decisions and other fun stuff to OS.
Do you have any specific questions or suggestions?

BTW: We toyed with the idea to implement our own DNS - see dotnet/runtime#19443. It has upsides (TTL available, etc.) and also downsides (no cross-process sharing which OS does). As a result, we didn't see huge need given that it would be only an alternative without clear winner between the 2 implementations.

@Pilchie
Copy link
Member

Pilchie commented Mar 21, 2023

However note that DNS is done by OS on both .NET Framework and .NET Core.

Understood - this issue is really about connection lifetimes. As long as a new connection is established, the OS seems to do the right thing, but connection pooling means that we sometimes re-use a connection to an old host. That's what we're trying to address.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Done
6 participants