After Upgrade to 23.3.3 from 23.2.2 about every 20th - 30th request finished with status code 500 #2110

Closed
antikorol opened this issue Jun 22, 2024 · 43 comments · Fixed by #2111
Labels: bug (Identified as a potential bug), high (High priority), Kubernetes (Service discovery by Kubernetes), Load Balancer (Ocelot feature: Load Balancer), merged (Issue has been merged to dev and is waiting for the next release), Service Discovery (Ocelot feature: Service Discovery)

Comments

@antikorol
Contributor

antikorol commented Jun 22, 2024

Expected Behavior / New Feature

Resolving a service address to route an API request

Actual Behavior / Motivation for New Feature

About every 20th to 30th request finishes with status code 500.

requestId: 0HN4ISPBSIV4V:000037FF, previousRequestId: No PreviousRequestId, 
message: 'Exception caught in global error handler, 
exception message: Object reference not set to an instance of an object.,
exception stack:
   at Ocelot.LoadBalancer.LoadBalancers.RoundRobin.Lease(HttpContext httpContext)
   at Ocelot.LoadBalancer.Middleware.LoadBalancingMiddleware.Invoke(HttpContext httpContext)
   at Ocelot.DownstreamPathManipulation.Middleware.ClaimsToDownstreamPathMiddleware.Invoke(HttpContext httpContext)
   at Ocelot.QueryStrings.Middleware.ClaimsToQueryStringMiddleware.Invoke(HttpContext httpContext)
   at Ocelot.Headers.Middleware.ClaimsToHeadersMiddleware.Invoke(HttpContext httpContext)
   at Ocelot.Authorization.Middleware.AuthorizationMiddleware.Invoke(HttpContext httpContext)
   at Ocelot.Claims.Middleware.ClaimsToClaimsMiddleware.Invoke(HttpContext httpContext)
   at Ocelot.Authentication.Middleware.AuthenticationMiddleware.Invoke(HttpContext httpContext)
   at Ocelot.RequestId.Middleware.RequestIdMiddleware.Invoke(HttpContext httpContext)
   at Ocelot.RateLimiting.Middleware.RateLimitingMiddleware.Invoke(HttpContext httpContext)
   at Ocelot.Request.Middleware.DownstreamRequestInitialiserMiddleware.Invoke(HttpContext httpContext)
   at Ocelot.Headers.Middleware.HttpHeadersTransformationMiddleware.Invoke(HttpContext httpContext)
   at Ocelot.Security.Middleware.SecurityMiddleware.Invoke(HttpContext httpContext)
   at Ocelot.Multiplexer.MultiplexingMiddleware.Invoke(HttpContext httpContext)
   at Ocelot.DownstreamRouteFinder.Middleware.DownstreamRouteFinderMiddleware.Invoke(HttpContext httpContext)
   at Ocelot.Responder.Middleware.ResponderMiddleware.Invoke(HttpContext httpContext)
   at Ocelot.Errors.Middleware.ExceptionHandlerMiddleware.Invoke(HttpContext httpContext) RequestId: 0HN4ISPBSIV4V:000037FF'

Steps to Reproduce the Problem

{
  "Routes": [
    {
      "UpstreamPathTemplate": "/public/{everything}",
      "UpstreamHttpMethod": [
        "POST"
      ],
      "ServiceName": "serviceName",
      "DownstreamPathTemplate": "/public/{everything}",
      "DownstreamScheme": "http",
      "LoadBalancerOptions": {
        "Type": "RoundRobin"
      }
    }
  ],
  "GlobalConfiguration": {
    "ServiceDiscoveryProvider": {
      "Namespace": "namespace",
      "Type": "Kube"
    }
  }
}

Specifications

  • Version: 23.3.3 (NET6 and NET8)
  • Platform: Linux
  • Subsystem:
@raman-m
Member

raman-m commented Jun 24, 2024

Hello Roman!
Welcome to Ocelot world! 🐯

It's difficult to determine what happened. Could you please upload the entire solution to GitHub for review? Alternatively, you could attach the following artifacts to this thread:

  • The Kubernetes service configurations, particularly the serviceName configuration
  • The Ocelot application startup C# code, also known as the app services setup
  • The Kubernetes client JSON configuration, also referred to as appsettings.json

This information will likely assist in identifying the issue.


After Upgrade to 23.3.3 from 23.2.2 about every 20th - 30th request finished with status code 500

You didn't observe the same errors in the Ocelot logs for version 23.2.x, did you?
Are you now operating in two different environments, or did you only upgrade a single environment?

Since you're utilizing Kubernetes, could you provide more details about your K8s server deployment?
How do you run the server: locally, or remotely in a cloud?
And how long have you been using Kubernetes in conjunction with Ocelot service discovery?

@raman-m raman-m added Service Discovery Ocelot feature: Service Discovery Load Balancer Ocelot feature: Load Balancer Kubernetes Service discovery by Kubernetes waiting Waiting for answer to question or feedback from issue raiser labels Jun 24, 2024
@raman-m
Member

raman-m commented Jun 24, 2024

Regarding the exception details...

requestId: 0HN4ISPBSIV4V:000037FF, previousRequestId: No PreviousRequestId, 
message: 'Exception caught in global error handler, 
exception message: Object reference not set to an instance of an object.,
exception stack:
   at Ocelot.LoadBalancer.LoadBalancers.RoundRobin.Lease(HttpContext httpContext)
   at Ocelot.LoadBalancer.Middleware.LoadBalancingMiddleware.Invoke(HttpContext httpContext)

Are you a C# developer? Can you debug C# code? Do you use the official Ocelot NuGet package?

Deploying the Release version of the Ocelot package DLLs means there is no debug information for the called lines in the Call Stack trace. However, we can analyze the code of the RoundRobin.Lease method.

public async Task<Response<ServiceHostAndPort>> Lease(HttpContext httpContext)
{
    var services = await _servicesDelegate?.Invoke() ?? new List<Service>();
    if (services?.Count != 0)
    {
        lock (_lock)
        {
            if (_last >= services.Count)
            {
                _last = 0;
            }
            var next = services[_last++];
            return new OkResponse<ServiceHostAndPort>(next.HostAndPort);
        }
    }
    return new ErrorResponse<ServiceHostAndPort>(new ServicesAreEmptyError($"There were no services in {nameof(RoundRobin)} during {nameof(Lease)} operation."));
}

It appears the issue may lie here:

var next = services[_last++];
return new OkResponse<ServiceHostAndPort>(next.HostAndPort);

A null check is missing before line 33 of the source file (the return statement): next is dereferenced without verifying that it is not null. Theoretically, the service provider could return a null object, which should be accounted for.

To conclude, it seems that for some reason the Kube service provider and its service builder returned a null list item, which raised the NullReferenceException.
Therefore, further investigation into the problem is warranted...
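
As an illustration of the missing check, here is a minimal sketch of a round-robin selection that skips null items instead of dereferencing them. It is not Ocelot's implementation: HostAndPort, Service and NullSafeRoundRobin are hypothetical stand-ins for the real types, and the method is synchronous and takes the list as a parameter to keep the example self-contained. It assumes the caller passes a list that no other thread is mutating.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical stand-ins for Ocelot's ServiceHostAndPort and Service types.
public sealed record HostAndPort(string Host, int Port);
public sealed record Service(string Name, HostAndPort HostAndPort);

public sealed class NullSafeRoundRobin
{
    private readonly object _lock = new();
    private int _last;

    // Returns the next usable host, or null when the list has no usable item.
    public HostAndPort? Lease(IReadOnlyList<Service?>? services)
    {
        if (services is null || services.Count == 0)
        {
            return null;
        }

        lock (_lock)
        {
            // Visit each slot at most once so a list of only nulls terminates.
            for (var attempt = 0; attempt < services.Count; attempt++)
            {
                if (_last >= services.Count)
                {
                    _last = 0;
                }

                var next = services[_last++];
                if (next?.HostAndPort is not null)
                {
                    return next.HostAndPort;
                }
            }
        }

        return null;
    }
}
```

A guard like this only hides the symptom: it would keep a null list item from producing a 500, but it does not explain why the item is null in the first place.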

@raman-m
Member

raman-m commented Jun 24, 2024

@antikorol The next step in the investigation is to search Ocelot's log for the messages written by the service builder.
Here is the service builder logic:

public virtual IEnumerable<Service> BuildServices(KubeRegistryConfiguration configuration, EndpointsV1 endpoint)
{
    ArgumentNullException.ThrowIfNull(configuration);
    ArgumentNullException.ThrowIfNull(endpoint);
    var services = endpoint.Subsets
        .SelectMany(subset => _serviceCreator.Create(configuration, endpoint, subset))
        .ToArray();
    _logger.LogDebug(() => $"K8s '{Check(endpoint.Kind)}:{Check(endpoint.ApiVersion)}:{Check(endpoint.Metadata?.Name)}' endpoint: Total built {services.Length} services.");
    return services;
}

We need to search for the following message in the log:

_logger.LogDebug(() => $"K8s '{Check(endpoint.Kind)}:{Check(endpoint.ApiVersion)}:{Check(endpoint.Metadata?.Name)}' endpoint: Total built {services.Length} services.");

In the appsettings.json file, the Debug logging level for the application can be defined as follows:

{
  "Logging": {
    "LogLevel": {
      "Default": "Debug",
      "Microsoft.AspNetCore": "Warning"
    }
  }
}

Deploy and review the log for the message:

  • "K8s 'kind:version:name' endpoint: Total built N services."

If this message appears in the log, it indicates that the Kubernetes setup is correct, and we can proceed to investigate the issue more thoroughly.

@raman-m
Member

raman-m commented Jun 24, 2024

every 20th - 30th request finished with status code 500

It seems that requests to the Kubernetes endpoint were failing on roughly every 20th to 30th attempt, as indicated by the code in these lines:

var endpoint = await _kubeApi
    .ResourceClient(client => new EndPointClientV1(client))
    .GetAsync(_configuration.KeyOfServiceInK8s, _configuration.KubeNamespace);

This suggests that the Kubernetes client connectivity might be unstable. It would be advisable to switch from the Kube to the PollKube provider and observe the behavior again. The error should no longer occur if this change is made. Additionally, the PollingInterval should be set to less than half the average duration between the occurrences of the 500 error.

Let's figure out what's happening and possibly provide a hotfix for your user scenario to eliminate the Null Reference Error logging. We need to handle this 500 status case and process it accordingly.
Will you be opening a PR soon?

@antikorol
Contributor Author

antikorol commented Jun 24, 2024

  1. I am using the official NuGet package.
    There was no error in version 23.2.2, but I had my own logic for extending the config to specify the service port. Now I have started using your logic, which matches the scheme with the named port.

  2. I have cleaned the service code from corporate code, leaving the minimum on which the error definitely reproduces.
    https://github.com/antikorol/sample-of-error

  3. When I make a copy of the RoundRobin class and simply use it in the config as RoundRobin2, I can no longer reproduce this error. (I replaced it this way because I wanted to see in which line the null reference exception appears)

  4. When I use PollKube, I get an error in the logs "requestId: 0HN4ISPBSIV51:000023E9, previousRequestId: No PreviousRequestId, message: 'Error Code: ServicesAreEmptyError Message: There were no services in RoundRobin during Lease operation. errors found in ResponderMiddleware"

  5. When I changed log level to debug, I was only able to reproduce this exception once

Below is the log with debug information.


  | Jun 24, 2024 @ 15:13:45.018 | requestId: 0HN4K8QDBB2FT:00000016, previousRequestId: No PreviousRequestId, message: 'K8s 'Endpoints:v1:login-service' endpoint: Total built 1 services.'

  | Jun 24, 2024 @ 15:13:39.565 | requestId: 0HN4K8QDBB2FO:00000015, previousRequestId: No PreviousRequestId, message: 'Exception caught in global error handler, exception message: Object reference not set to an instance of an object., exception stack:    at Ocelot.LoadBalancer.LoadBalancers.RoundRobin.Lease(HttpContext httpContext)    at Ocelot.LoadBalancer.Middleware.LoadBalancingMiddleware.Invoke(HttpContext httpContext)    at Ocelot.DownstreamPathManipulation.Middleware.ClaimsToDownstreamPathMiddleware.Invoke(HttpContext httpContext)    at Ocelot.QueryStrings.Middleware.ClaimsToQueryStringMiddleware.Invoke(HttpContext httpContext)    at Ocelot.Headers.Middleware.ClaimsToHeadersMiddleware.Invoke(HttpContext httpContext)    at Ocelot.Authorization.Middleware.AuthorizationMiddleware.Invoke(HttpContext httpContext)    at Ocelot.Claims.Middleware.ClaimsToClaimsMiddleware.Invoke(HttpContext httpContext)    at Ocelot.Authentication.Middleware.AuthenticationMiddleware.Invoke(HttpContext httpContext)    at Ocelot.RequestId.Middleware.RequestIdMiddleware.Invoke(HttpContext httpContext)    at Ocelot.RateLimiting.Middleware.RateLimitingMiddleware.Invoke(HttpContext httpContext)    at Ocelot.Request.Middleware.DownstreamRequestInitialiserMiddleware.Invoke(HttpContext httpContext)    at Ocelot.Headers.Middleware.HttpHeadersTransformationMiddleware.Invoke(HttpContext httpContext)    at Ocelot.Security.Middleware.SecurityMiddleware.Invoke(HttpContext httpContext)    at Ocelot.Multiplexer.MultiplexingMiddleware.Invoke(HttpContext httpContext)    at Ocelot.DownstreamRouteFinder.Middleware.DownstreamRouteFinderMiddleware.Invoke(HttpContext httpContext)    at Ocelot.Responder.Middleware.ResponderMiddleware.Invoke(HttpContext httpContext)    at Ocelot.Errors.Middleware.ExceptionHandlerMiddleware.Invoke(HttpContext httpContext) RequestId: 0HN4K8QDBB2FO:00000015'

  | Jun 24, 2024 @ 15:13:39.515 | requestId: 0HN4K8QDBB2FU:00000015, previousRequestId: No PreviousRequestId, message: 'K8s 'Endpoints:v1:login-service' endpoint: Total built 1 services.'

  | Jun 24, 2024 @ 15:13:39.514 | requestId: 0HN4K8QDBB2FU:00000015, previousRequestId: No PreviousRequestId, message: 'K8s service with key 'login-service' and address 172.31.40.225; Detected port is http:80. Total 1 ports of [http].'

  | Jun 24, 2024 @ 15:13:39.514 | requestId: 0HN4K8QDBB2FQ:00000015, previousRequestId: No PreviousRequestId, message: 'K8s 'Endpoints:v1:login-service' endpoint: Total built 1 services.'

  | Jun 24, 2024 @ 15:13:39.512 | requestId: 0HN4K8QDBB2FP:00000015, previousRequestId: No PreviousRequestId, message: 'K8s service with key 'login-service' and address 172.31.40.225; Detected port is http:80. Total 1 ports of [http].'

  | Jun 24, 2024 @ 15:13:39.512 | requestId: 0HN4K8QDBB2FO:00000015, previousRequestId: No PreviousRequestId, message: 'K8s service with key 'login-service' and address 172.31.40.225; Detected port is http:80. Total 1 ports of [http].'

  | Jun 24, 2024 @ 15:13:39.512 | requestId: 0HN4K8QDBB2FO:00000015, previousRequestId: No PreviousRequestId, message: 'K8s 'Endpoints:v1:login-service' endpoint: Total built 1 services.'

  | Jun 24, 2024 @ 15:13:39.512 | requestId: 0HN4K8QDBB2FP:00000015, previousRequestId: No PreviousRequestId, message: 'K8s 'Endpoints:v1:login-service' endpoint: Total built 1 services.'

  | Jun 24, 2024 @ 15:13:39.511 | requestId: 0HN4K8QDBB2FQ:00000015, previousRequestId: No PreviousRequestId, message: 'K8s service with key 'login-service' and address 172.31.40.225; Detected port is http:80. Total 1 ports of [http].'

  | Jun 24, 2024 @ 15:13:39.506 | requestId: 0HN4K8QDBB2FR:00000015, previousRequestId: No PreviousRequestId, message: 'K8s service with key 'login-service' and address 172.31.40.225; Detected port is http:80. Total 1 ports of [http].'

  | Jun 24, 2024 @ 15:13:39.506 | requestId: 0HN4K8QDBB2FR:00000015, previousRequestId: No PreviousRequestId, message: 'K8s 'Endpoints:v1:login-service' endpoint: Total built 1 services.'

  | Jun 24, 2024 @ 15:13:39.505 | requestId: 0HN4K8QDBB2FT:00000015, previousRequestId: No PreviousRequestId, message: 'K8s service with key 'login-service' and address 172.31.40.225; Detected port is http:80. Total 1 ports of [http].'

  | Jun 24, 2024 @ 15:13:39.505 | requestId: 0HN4K8QDBB2FT:00000015, previousRequestId: No PreviousRequestId, message: 'K8s 'Endpoints:v1:login-service' endpoint: Total built 1 services.'

  | Jun 24, 2024 @ 15:13:39.505 | requestId: 0HN4K8QDBB2FN:00000015, previousRequestId: No PreviousRequestId, message: 'K8s service with key 'login-service' and address 172.31.40.225; Detected port is http:80. Total 1 ports of [http].'

  | Jun 24, 2024 @ 15:13:39.505 | requestId: 0HN4K8QDBB2FN:00000015, previousRequestId: No PreviousRequestId, message: 'K8s 'Endpoints:v1:login-service' endpoint: Total built 1 services.'

  | Jun 24, 2024 @ 15:13:39.504 | requestId: 0HN4K8QDBB2FS:00000015, previousRequestId: No PreviousRequestId, message: 'K8s service with key 'login-service' and address 172.31.40.225; Detected port is http:80. Total 1 ports of [http].'

  | Jun 24, 2024 @ 15:13:39.504 | requestId: 0HN4K8QDBB2FS:00000015, previousRequestId: No PreviousRequestId, message: 'K8s 'Endpoints:v1:login-service' endpoint: Total built 1 services.'

  | Jun 24, 2024 @ 15:13:39.209 | requestId: 0HN4K8QDBB2FR:00000014, previousRequestId: No PreviousRequestId, message: 'K8s 'Endpoints:v1:login-service' endpoint: Total built 1 services.'


@raman-m
Member

raman-m commented Jun 24, 2024

Dear Roman,
What's your full name?
What's your LinkedIn? We can continue discussion in LinkedIn chat...

@antikorol
Contributor Author

@raman-m https://www.linkedin.com/in/roman-shevchik/

@raman-m
Member

raman-m commented Jun 24, 2024

  1. I am using the official NuGet package.
    There was no error in version 23.2.2, but I had my own logic for extending the config to specify the service port. Now I have started using your logic, which matches the scheme with the named port.

Please be aware that any custom logic may compromise stability, and the Ocelot team cannot be held accountable for it. It appears your existing custom logic is only compatible with version 23.2.
I guess you might be facing a Kubernetes misconfiguration issue related to the Downstream Scheme vs Port Names feature. Are you using this feature?

It would be helpful to examine the service definition in your Kubernetes configuration. Could you provide it, please?

@antikorol
Contributor Author

No, I removed that logic because your changes already cover the case I needed with multiple ports.

@raman-m
Member

raman-m commented Jun 24, 2024

  1. I have cleaned the service code from corporate code, leaving the minimum on which the error definitely reproduces.
    https://github.com/antikorol/sample-of-error

Good! After a quick review:

  • ApiGateway.Service: Where is the ocelot.json file? In other words, where is the primary configuration file for Ocelot?
  • appsettings.json: I'm very sorry, but why have you defined routes inside appsettings.json? The Ocelot configuration should be in ocelot.json! For guidance, see the Configuration docs.

I'm surprised that your Ocelot app started at all!
Did you develop the Ocelot app on version 23.2 yourself, or did someone else?

Did you read the Configuration docs?
In particular, the Multiple Environments paragraph explains how to configure the app with .AddJsonFile("ocelot.json") // primary config file

So, I don't see the configuration part in lines L27-L42.
You have to set up the configuration:

    builder.WebHost
        .ConfigureAppConfiguration((context, config) =>
        {
            var env = context.HostingEnvironment;
            config.SetBasePath(env.ContentRootPath)
                .AddJsonFile("appsettings.json", true, true)
                .AddJsonFile($"appsettings.{env.EnvironmentName}.json", true, true)
                .AddOcelot(env) // happy path
                .AddEnvironmentVariables();
        })

So, it's very important to configure via the AddOcelot method:

 .ConfigureAppConfiguration((context, config) => config.AddOcelot())

This is the minimal required configuration code! But it is better to use the official template from the docs.

@raman-m
Member

raman-m commented Jun 24, 2024

@antikorol commented:

No, I removed that logic because your changes already cover the case I needed with multiple ports.

Good! So you want to use the Downstream Scheme vs Port Names feature.

Also you enabled the port finder via "DownstreamScheme": "http".
Now, are you sure your K8s definition is correct?
Should we review K8s config together?

@antikorol
Contributor Author

antikorol commented Jun 24, 2024

We updated from 18.0.0 to 23.2.2 a long time ago, and now to 23.3.3.
I had the configuration in a YAML file, but to rule out all other libraries, I converted it to JSON and added it to appsettings.json, and yes, everything worked.
Now I've moved it to a separate ocelot.json and fixed the registration.

According to the Microsoft documentation, it is not mandatory to explicitly specify appsettings.json or its variations depending on the environment.
https://learn.microsoft.com/en-us/aspnet/core/fundamentals/configuration/?view=aspnetcore-6.0

Should we review K8s config together?

Yes, what should I provide?

@raman-m
Member

raman-m commented Jun 24, 2024

  1. When I make a copy of the RoundRobin class and simply use it in the config as RoundRobin2, I can no longer reproduce this error.

I apologize, but I see absolutely no difference from the official code. There's no difference at all; every line is exactly the same. I don't understand what you're referring to.

(I replaced it this way because I wanted to see in which line the null reference exception appears)

I apologize, but uncommenting line L53 does not result in any replacement. You need to implement actual service injection using the official helpers: Custom Load Balancers. Have you consulted this documentation?

Even if you substitute the services with custom ones, it won't aid in identifying the failing line, as this necessitates the inclusion of Debug version DLLs in your Docker file, as seen here: L21. Therefore, you must build with the debug configuration -c Debug to ensure the log contains Call Stack code lines within messages for exceptions.

@raman-m
Member

raman-m commented Jun 24, 2024

@antikorol commented:

Yes, what should I provide?

Please provide the Kubernetes service definition in text, JSON format, or the Kubernetes client response JSON. From the log you've shared, it appears you have defined only one service; however, the number of ports in the definition is unclear.

Additionally, I'm curious as to why every twentieth Kubernetes client request fails or returns an invalid response, causing the Ocelot provider to throw NullReferenceException. We need to pinpoint this faulty request and obtain the response data in JSON format to replicate the user scenario through tests.

Lastly, I trust that you have the expertise to debug the provider, identify the issue, and potentially write tests or even submit a pull request with a fix.

@antikorol
Contributor Author

I apologize, but I see absolutely no difference from the official code. There's no difference at all; every line is exactly the same.

Sorry for the confusion. The files are identical. I didn't want to change the project build type. In the release build, I can see lines in the stack trace in my code. I wanted to understand in which exact line the error occurred so that I could add logs later. Surprisingly, the error didn't reproduce when I used just a copy

@raman-m
Member

raman-m commented Jun 24, 2024

I am awaiting a PR that will resolve the issue...
At present, I cannot replicate your user scenario as there are no reproduction steps provided. Writing these steps is challenging because everything appears to function correctly for the first 29 requests, and only the 30th request fails with an exception. This is an integration issue caused by invalid data from the integrated service, namely the K8s client.
Since I cannot reproduce the issue and lack access to your actual production environment, emulating your production setting is also difficult.
Please pinpoint the issue and return to us with a bugfix PR that better illustrates the problem.
Best of luck!

@antikorol
Contributor Author

antikorol commented Jun 24, 2024

@raman-m
I still haven't found a 100% reproduction case, but I dug deeper into the code and noticed one detail.


This class (Kube) is registered as a singleton for each ServiceProviderConfiguration, and the class that resolves it, RoundRobinCreator, is also registered as a singleton. At the same time, we call the Kube.GetAsync method in async code, where we can easily encounter a "Collection was modified" exception. Maybe somewhere deep in List.Clear or elsewhere, when the internal array is resized or cleared, we lose references and encounter an "Object reference not set to an instance..." exception.
What do you think about this part of code?

@raman-m
Member

raman-m commented Jun 25, 2024

This class is registered as a singleton for each ServiceProviderConfiguration, and the class that resolves it

Look at the actual registration method:

public static IOcelotBuilder AddKubernetes(this IOcelotBuilder builder, bool usePodServiceAccount = true)
{
    builder.Services
        .AddKubeClient(usePodServiceAccount)
        .AddSingleton(KubernetesProviderFactory.Get)
        .AddSingleton<IKubeServiceBuilder, KubeServiceBuilder>()
        .AddSingleton<IKubeServiceCreator, KubeServiceCreator>();
    return builder;
}

So, the Kube class is not registered at all. Only the KubernetesProviderFactory.Get delegate is registered:
.AddSingleton(KubernetesProviderFactory.Get)

Kube is instantiated on every request. The logic is quite complicated, and I think Kube is a scoped service.
Also, IKubeApiClient is registered as a scoped service.
The problem could be a bad or empty response from IKubeApiClient.


RoundRobinCreator is also registered as a singleton.

The issue seems to be with the load balancing logic, which involves a singleton RoundRobin service that is poorly synchronized. The RoundRobin is receiving an unsynchronized IServiceDiscoveryProvider object. Furthermore, the Lease method invokes an asynchronous method that may return lists of varying lengths, which I believe is problematic. Additionally, the exception call stack pinpoints the Lease method as the source of exceptions. It would be prudent to rewrite the Lease method to operate synchronously and stabilize the list length immediately.

Here is what I think is happening:
1st thread gets a list of 1 item (count == 1)
2nd thread gets a list of 1 null item (count == 1)
1st thread enters the if-block, considering the list non-empty
2nd thread finishes the async call await _servicesDelegate?.Invoke(), which returns a list of null items
2nd thread re-initializes the services variable with the list of null items
1st thread reaches line 32 and reads a null object
1st thread reads data from the null object on line 33, and the exception is thrown

I believe the problem is on line 33:

return new OkResponse<ServiceHostAndPort>(next.HostAndPort);

It must be rewritten!

@antikorol
Contributor Author

antikorol commented Jun 25, 2024

return new OkResponse<ServiceHostAndPort>(next.HostAndPort);

You are right that the problem surfaces on this line. But the main problem is the concurrency issue in the Kube class that I described earlier: the collection of services is declared as a class-level field.

I spent a lot of time confirming the issue; I hope this will be enough.
I added debug logs and can see that after the list is cleared, sometimes after inserting one service, the list ends up with a size of 2.

image

The second element of the list is null; I can see it in the RoundRobin class.

image

Example of logs:
Kube class with added logging, where I can see that the captured list and the _services list differ:

public virtual async Task<List<Ocelot.Values.Service>> GetAsync()
{
        var endpoint = await _kubeApi
           .ResourceClient(client => new EndPointClientV1(client))
           .GetAsync(_configuration.KeyOfServiceInK8s, _configuration.KubeNamespace);

        _services.Clear();
        if (endpoint?.Subsets.Count != 0)
        {
            var array = BuildServices(_configuration, endpoint!);

            _services.AddRange(array);

            var captured = new List<Ocelot.Values.Service>(array);

            _logger.LogWarning(() => $"RoundRobin2: Kube.GetAsync;"
                + $"_services.Count: {_services.Count}; captured.Count: {captured.Count};"
                + $" Services: {string.Join(Environment.NewLine, captured.Select(s => s is null ? "null" : JsonConvert.SerializeObject(s)))}");
        }
        else
        {
            _logger.LogWarning(() => $"K8s Namespace:{_configuration.KubeNamespace}, Service:{_configuration.KeyOfServiceInK8s}; Unable to use: it is invalid. Address must contain host only e.g. localhost and port must be greater than 0!");
        }

        return _services;
}

If the logs aren't entirely clear, I created a code sample showing that if the list is modified from different threads, we can get null where it shouldn't be. In the sample, I always add one element to the list, but from different threads, and I end up with a list size greater than 1 and a null second element. And we don't get an exception saying that the collection was modified.
Example is here https://github.com/antikorol/incorrect-list-usage-sample
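
The effect can be sketched in a few lines. This is not the code from the linked repository: SharedListDemo, UnsafeRefresh, SafeRefresh and CountAnomalies are hypothetical names chosen for illustration. It relies only on the documented fact that List<T> is not safe for concurrent writes. The outcome for the shared list depends on thread timing, so the anomaly count varies between runs and can even be zero.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class SharedListDemo
{
    // Mimics the pattern under discussion: one list shared by all callers,
    // cleared and refilled on every call without any synchronization.
    private static readonly List<string> Shared = new();

    public static List<string> UnsafeRefresh()
    {
        Shared.Clear();
        Shared.Add("service");
        return Shared;
    }

    // Safer variant: every call builds and returns its own list.
    public static List<string> SafeRefresh()
    {
        return new List<string> { "service" };
    }

    // Calls refresh from many threads and counts results that are not
    // exactly one non-null item (or that throw along the way).
    public static int CountAnomalies(Func<List<string>> refresh, int iterations)
    {
        var anomalies = 0;
        Parallel.For(0, iterations, _ =>
        {
            try
            {
                var snapshot = refresh().ToArray();
                if (snapshot.Length != 1 || snapshot[0] is null)
                {
                    Interlocked.Increment(ref anomalies);
                }
            }
            catch (Exception)
            {
                // A corrupted list can also throw; count that as an anomaly.
                Interlocked.Increment(ref anomalies);
            }
        });
        return anomalies;
    }
}
```

Calling SharedListDemo.CountAnomalies(SharedListDemo.UnsafeRefresh, 1_000_000) on a multi-core machine may report lists of the wrong size or with null items, while the same call with SafeRefresh reports none, because no list is ever shared between threads.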

@raman-m raman-m added bug Identified as a potential bug Oct'24 October 2024 release and removed waiting Waiting for answer to question or feedback from issue raiser labels Jun 25, 2024
@raman-m raman-m added this to the Summer'24 milestone Jun 25, 2024
@raman-m
Member

raman-m commented Jun 25, 2024

@antikorol Roman,
This code analysis and research are excellent, thank you!
I concur that the Kube design warrants a review, and synchronization issues need resolution. It appears that the forthcoming refactoring may alter 2 to 5 classes in the K8s provider. Following your report, it seems the Kube instance is acting as a singleton, likely due to the storage of provider delegates in a static field.

If you open a fix PR, I will prepare an additional hotfix release. However, I'm concerned that we need to refactor the Kube provider logic, which may require some time to develop smoothly.

FYI, I've included this bug issue in the Summer'24 milestone. This represents the least favorable delivery scenario. I am optimistic that we will manage to deliver the fix within a week or two, or alternatively, we can assign the issue to the Annual'23 release which is early bird aka current release.

@raman-m
Member

raman-m commented Jun 25, 2024

Tasks

@antikorol
Contributor Author

A quick fix is to move the creation of the List from the class fields to a method.
This will increase allocations, but it will help avoid the exception in high-load APIs.
A memory-optimized solution can be written or refactored later.
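
A minimal sketch of that quick fix, under stated assumptions: DiscoveredService and PerCallListProvider are hypothetical stand-ins, and the fetch delegate replaces the real Kubernetes client call. It shows only the idea of building the list inside the method; it is not the change that was eventually merged.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical stand-in for Ocelot's Service type.
public sealed record DiscoveredService(string Name, string Host, int Port);

public sealed class PerCallListProvider
{
    // Hypothetical delegate standing in for the Kubernetes endpoint lookup.
    private readonly Func<Task<IReadOnlyList<DiscoveredService>?>> _fetch;

    public PerCallListProvider(Func<Task<IReadOnlyList<DiscoveredService>?>> fetch)
    {
        _fetch = fetch;
    }

    public async Task<List<DiscoveredService>> GetAsync()
    {
        var built = await _fetch();

        // The list is a local: concurrent callers never share or clear it.
        return built is null
            ? new List<DiscoveredService>()
            : built.Where(s => s is not null).ToList();
    }
}
```

Each call allocates a fresh list, which is the extra allocation cost mentioned above, in exchange for never exposing a list that another request is clearing or refilling.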

@raman-m
Member

raman-m commented Jun 30, 2024

Thank you, Roman!
To understand the subsequent steps, please review our Dev Process.

@raman-m raman-m self-assigned this Jul 10, 2024
@raman-m raman-m added the high High priority label Jul 10, 2024
@raman-m raman-m modified the milestones: Summer'24, v23.3.x Hotfixes Jul 10, 2024
@raman-m
Member

raman-m commented Jul 11, 2024

Could you provide the response data or JSON that your client returns, please? Understanding the actual return values of these lines would be helpful:

     var endpoint = await _kubeApi
           .ResourceClient(client => new EndPointClientV1(client))
           .GetAsync(_configuration.KeyOfServiceInK8s, _configuration.KubeNamespace);

@antikorol
Contributor Author

Could you provide the response data or JSON that your client returns, please? Understanding the actual return values of these lines would be helpful:

     var endpoint = await _kubeApi
           .ResourceClient(client => new EndPointClientV1(client))
           .GetAsync(_configuration.KeyOfServiceInK8s, _configuration.KubeNamespace);

{
   "kind":"Endpoints",
   "apiVersion":"v1",
   "metadata":{
      "name":"fake-service",
      "namespace":"fake-namespace-dev",
      "uid":"a00000a0-a0aa-0000-0000-0a00aaa0a00a",
      "resourceVersion":"111111",
      "creationTimestamp":"2023-11-01T13:07:32Z",
      "labels":{
         "app.kubernetes.io/instance":"fake-service",
         "app.kubernetes.io/managed-by":"Helm",
         "app.kubernetes.io/name":"fake-service",
         "app.kubernetes.io/version":"1.0.1",
         "helm.sh/chart":"fake-service-0.1.0"
      },
      "managedFields":[
         {
            "manager":"kube-controller-manager",
            "operation":"Update",
            "apiVersion":"v1",
            "time":"2024-07-11T11:04:39Z",
            "fieldsType":"FieldsV1",
            "fieldsV1":{
               "f:metadata":{
                  "f:labels":{
                     ".":{},
                     "f:app.kubernetes.io/instance":{},
                     "f:app.kubernetes.io/managed-by":{},
                     "f:app.kubernetes.io/name":{},
                     "f:app.kubernetes.io/version":{},
                     "f:helm.sh/chart":{}
                  }
               },
               "f:subsets":{}
            }
         }
      ]
   },
   "subsets":[
      {
         "addresses":[
            {
               "ip":"111.22.33.10",
               "nodeName":"ip-111.22.33.10.eu-central-1.compute.internal",
               "targetRef":{
                  "kind":"Pod",
                  "namespace":"fake-namespace-dev",
                  "name":"fake-service-ddsadasd-1111",
                  "uid":"a00000a0-a0aa-0000-0000-0a00aaa0a00a"
               }
            },
            {
               "ip":"111.33.33.11",
               "nodeName":"ip-111.33.33.11.eu-central-1.compute.internal",
               "targetRef":{
                  "kind":"Pod",
                  "namespace":"fake-namespace-dev",
                  "name":"fake-service-ddsadasd-2222",
                  "uid":"a00000a0-a0aa-0000-0000-0a00aaa0a00a"
               }
            }
         ],
         "ports":[
            {
               "name":"http",
               "port":80,
               "protocol":"TCP"
            }
         ]
      }
   ]
}

@raman-m
Member

raman-m commented Jul 11, 2024

Thank you for the JSON samples! I will utilize them to write an acceptance test.

@raman-m
Member

raman-m commented Jul 12, 2024

Roman, I have another question regarding load balancing in your project...
Why does your team use Kubernetes for service registrations? Is the project big enough?
Deploying tools like Kubernetes or Consul in a cloud makes sense for large projects.

If the project is small and the number of downstream services is constant (with all service IPs known), you can use static service definitions in the DownstreamHostAndPorts route options and manually load balance, as described in this documentation section:

So, it's just a question of...

@antikorol
Contributor Author

antikorol commented Jul 12, 2024

In my opinion, Kubernetes is a good choice for a startup/initial stage of the project as a simple and quick way to implement service discovery (it can be a five-minute job). In the future, after analyzing the load, you can implement another solution.

Using a static IP for a pod, in my view, is an anti-pattern.
Using pod IPs in static configuration is a bad idea because, in Kubernetes, a pod's IP address can change in many ways: deployment, scaling, or even Kubernetes itself rescheduling the pod to a different node.

At the same time, to reduce the load on Kubernetes, I can simply use the PollKube provider. Unfortunately, this bug affects that provider too, but this fix resolves it. :)

Is the project big enough?

I hope it will be huge in the future :D, but even now there is scaling during peak load hours.

@raman-m
Member

raman-m commented Jul 17, 2024

@antikorol Hi Roman!
I was able to reproduce the problem in this branch: raman-m/2110-steps-to-reproduce → the KubernetesServiceDiscoveryTests class. You need to run ShouldReturnServicesFromK8s_HavingHighLoadOnTheProviderAndRoundRobinBalancer tests.

Be aware that there are stable and unstable theory cases; however, both exhibit the 500 status issue. To reveal the 500 error in the green test, you must use the "Run Until Failed" option. Additionally, both cases, when run in Debug mode, might display the exception under discussion, because Debug mode shows exceptions as they are thrown.
I'm going to reuse my research results in the fixing PR...

@antikorol
Contributor Author

antikorol commented Jul 17, 2024

@raman-m Hello Raman
Thank you for the acceptance tests.
I've tried to run this test, but I have the error (on macOS)

System.Net.Sockets.SocketException: Can't assign requested address

This is a known issue for macOS. I tried a couple of suggestions from the internet, but they didn't help, and many of the suggestions are such that I wouldn't want to implement on my work laptop.

I see that your CI failed on this test case. Can I simply cherry-pick the commit and check it on your CI?

@raman-m
Member

raman-m commented Jul 17, 2024

We use Windows and Linux only for now. Don't use MacOS please.
CircleCI runs builds using Linux env.
Now the test is green

@raman-m
Member

raman-m commented Jul 18, 2024

I've tried to run this test, but I have the error (on macOS)

System.Net.Sockets.SocketException: Can't assign requested address

This is a known issue for macOS. I tried a couple of suggestions from the internet, but they didn't help, and many of the suggestions are such that I wouldn't want to implement on my work laptop.

@antikorol, you've highlighted an important aspect concerning MacOS builds. We will soon establish automated Windows CI-CD builds as certain features of Ocelot necessitate it. Linux builds satisfy 99.9% of our requirements, and the .NET SDK on Linux is highly stable since Microsoft conducts extensive testing of the .NET SDK on Linux OS. I am in the process of ascertaining whether the issue lies with the .NET SDK on MacOS or is unique to Ocelot.

@antikorol
Contributor Author

antikorol commented Jul 18, 2024

@raman-m Loopback does not work on macOS. If I want to use addresses 127.0.0.2+, I need to manually configure the alias.

https://github.com/ThreeMammals/Ocelot/blob/raman-m/2110-steps-to-reproduce/test/Ocelot.AcceptanceTests/Steps.cs#L69

If I don't use loopback here, the tests pass (the method is called with different ports).

@raman-m
Member

raman-m commented Jul 18, 2024

Understood! As a MacOS user, you can suggest a preprocessor directive correction to ensure this helper runs successfully in MacOS environments. MacOS machines should indeed have localhost as an embedded local IP. Have you investigated this issue thoroughly? Could you run the Terminal command ping localhost or an equivalent and share the Terminal output, please? Unfortunately, I do not have access to any MacOS machines.

Loopback does not work on macOS. If I want to use addresses 127.0.0.2+, I need to manually configure the alias.

You can manually configure aliases such as localhost1 and localhost2 in macOS; however, this method is not suitable for automated scripts.
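
One possible way to avoid manual aliases is a runtime OS check in the test helper instead of a preprocessor directive. The sketch below is only an illustration: `LoopbackSketch` and `LocalHostFor` are hypothetical names, not the actual helper in Steps.cs, and it assumes that on macOS every test service can share 127.0.0.1 and differ only by port.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical helper, not part of Ocelot's test suite.
public static class LoopbackSketch
{
    // Returns a loopback address for the test service with the given index.
    // Assumption: on macOS only 127.0.0.1 is configured by default, so all
    // services share it and must be distinguished by their ports.
    public static string LocalHostFor(int index)
    {
        if (index < 1 || index > 254)
        {
            throw new ArgumentOutOfRangeException(nameof(index));
        }

        return RuntimeInformation.IsOSPlatform(OSPlatform.OSX)
            ? "127.0.0.1"
            : $"127.0.0.{index}";
    }
}
```

With this approach, `LocalHostFor(2)` would yield 127.0.0.2 on Windows and Linux, and 127.0.0.1 on macOS.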

@antikorol
Contributor Author

I encountered this problem many years ago, and it seems that nothing has changed since then.
I've attached a screenshot of the ping results.

[Image: screenshot of the ping results]

Last time, I solved this problem by running the following command for each IP address I needed:

sudo ifconfig lo0 alias 127.0.0.2 up

There might be a more elegant solution, but I didn't have free time to investigate further.

@raman-m
Member

raman-m commented Jul 31, 2024

@antikorol commented on July 11, 2024

Hello! I have a few questions regarding the response JSON from Kubernetes when a service instance goes offline:

  • Is an offline service included in the list of addresses?
  • Is an offline service removed or marked with a special tag, etc.?
  • What occurs if all services go offline? Is the response still returned with an empty list of addresses?

Links to related Kubernetes documentation would be appreciated.

@antikorol
Contributor Author

antikorol commented Jul 31, 2024

@antikorol commented on July 11, 2024

Hello.
Kubernetes docs: https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.30/#read-operations-endpoints-v1-core

  • If all pods are terminated or scaled to 0, subsets are not returned.
{
  "kind": "Endpoints",
  "apiVersion": "v1",
   "metadata":{
    "name":"fake-service",
    "namespace":"fake-namespace-dev",
    "uid":"a00000a0-a0aa-0000-0000-0a00aaa0a00a",
    "resourceVersion":"111111",
    "creationTimestamp": "2023-11-01T13:01:12Z",
    "labels": {
      "app.kubernetes.io/instance":"fake-service",
      "app.kubernetes.io/managed-by":"Helm",
      "app.kubernetes.io/name":"fake-service",
      "app.kubernetes.io/version":"1.0.0",
      "helm.sh/chart":"fake-service-0.1.0"
    },
    "managedFields": [
      {
        "manager": "kube-controller-manager",
        "operation": "Update",
        "apiVersion": "v1",
        "time": "2024-07-30T12:31:35Z",
        "fieldsType": "FieldsV1",
        "fieldsV1": {
          "f:metadata": {
            "f:labels": {
              ".": {},
              "f:app.kubernetes.io/instance": {},
              "f:app.kubernetes.io/managed-by": {},
              "f:app.kubernetes.io/name": {},
              "f:app.kubernetes.io/version": {},
              "f:helm.sh/chart": {}
            }
          }
        }
      }
    ]
  }
}
  • If the service has been deleted, a 404 status code is returned with the following response:
{
  "kind": "Status",
  "apiVersion": "v1",
  "metadata": {},
  "status": "Failure",
  "message": "endpoints \"fake-service\" not found",
  "reason": "NotFound",
  "details": {
    "name": "fake-service",
    "kind": "endpoints"
  },
  "code": 404
}
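
To illustrate why the missing subsets property matters on the consumer side, here is a minimal sketch of parsing such a response defensively. The record types and `EndpointsParser` are hypothetical and declare only the fields used; this is not how KubeClient or Ocelot models the response.

```csharp
#nullable enable
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;

// Hypothetical minimal models: only the fields used below are declared.
public sealed record EndpointAddress(string Ip);
public sealed record EndpointPort(string? Name, int Port);
public sealed record EndpointSubset(List<EndpointAddress>? Addresses, List<EndpointPort>? Ports);
public sealed record EndpointsResponse(string Kind, List<EndpointSubset>? Subsets);

public static class EndpointsParser
{
    private static readonly JsonSerializerOptions Options =
        new() { PropertyNameCaseInsensitive = true };

    // Returns "ip:port" pairs. An absent "subsets" property yields an empty
    // list instead of a null reference further down the pipeline.
    public static List<string> ToHostAndPorts(string json)
    {
        var response = JsonSerializer.Deserialize<EndpointsResponse>(json, Options);
        var subsets = response?.Subsets ?? new List<EndpointSubset>();

        return subsets
            .SelectMany(s =>
                from address in s.Addresses ?? new List<EndpointAddress>()
                from port in s.Ports ?? new List<EndpointPort>()
                select $"{address.Ip}:{port.Port}")
            .ToList();
    }
}
```

Both the scaled-to-zero response and the 404 Status body deserialize to an object whose Subsets is null, so the caller receives an empty list and can decide how to report that no instances are available.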

@raman-m
Member

raman-m commented Jul 31, 2024

If the service has been deleted, a 404 status code is returned with the following response:

If the service has not been deleted but has gone offline, Kubernetes clients should still return one or more service instances as addresses in subsets. If there were two instances and one went offline without being deleted, the question is: What happens to the list of services in the JSON response? Will it still be present or marked with a special tag, annotation, etc.?

From the details of your issue and the logs provided, it appears there was one instance online. For some reason, this instance went offline, resulting in an empty JSON addresses/subsets list, causing Ocelot's Kube provider to behave incorrectly at the load balancer level.
If the Kubernetes endpoint removes offline instances from the list, then we have one testing scenario.
If the Kubernetes endpoint does not remove offline instances but marks them with tags/metadata, then we have a different testing scenario.
Currently, I am testing the first scenario, assuming that offline services are removed from the list in the subset. I hope this reflects the real-world behavior of the Kubernetes endpoint. However, I understand that DevOps can configure orchestration differently.

@antikorol
Contributor Author

antikorol commented Jul 31, 2024

A service cannot be offline in Kubernetes; it can only have zero pods.
Only pods that are in the 'Running' state are included in the subsets list. Even when a pod starts being deleted (state 'Terminating'), it is no longer considered available (it will not be present in the subsets list).
Maybe I just misunderstood you.

From the details of your issue and the logs provided, it appears there was one instance online

I provided information on the service that has the fewest pods, to reduce logs.
The error is also reproduced on services with more than 10 pods.

⚠️ As I mentioned earlier, somewhere at the beginning of our discussion, the main issue here is modifying the list from different threads (clearing and adding). It doesn't depend on the number of elements, whether it's 1 or 5. If you look at the example I shared earlier, you'll see a lot of interesting behaviors when modifying from different threads. Specifically, for an example with 5 services:

  • You might just get an empty list (in this case, there wouldn't be an error in your class).
  • You might get a list two or more times larger than expected (instead of 5, you get 10, 20, etc.).
  • You might get a list consisting entirely of nulls.
  • You might get a list half full of valid values and half full of nulls (when the list has doubled in size).

Here is an example with the results (five valid elements are always added to the list!):
[Image: screenshot of the example's results]

The code committed to Git contains commented-out code for serialization and copying of the list, as this significantly increases the number of exceptions related to the collection being modified (accessing an element by index does not throw such an exception, which often leads to cases with null elements).
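
The difference between a shared list field and a list created per call can be sketched as follows. All class names here are hypothetical and the code is a simplified illustration of the race described above, not the Kube provider's actual implementation.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical provider that reuses one list stored in a class field.
// Concurrent callers race on Clear/AddRange, so a caller may observe an
// empty, oversized, or partially null list. Shown for contrast only.
public sealed class SharedListProvider
{
    private readonly List<string> _services = new();

    public List<string> GetServices(IReadOnlyCollection<string> discovered)
    {
        _services.Clear();
        _services.AddRange(discovered);
        return _services;
    }
}

// Hypothetical provider that builds a fresh list inside the method,
// so no other thread can observe or mutate it while it is being filled.
public sealed class LocalListProvider
{
    public List<string> GetServices(IReadOnlyCollection<string> discovered)
    {
        var services = new List<string>(discovered.Count);
        services.AddRange(discovered);
        return services;
    }
}

public static class ConcurrencyDemo
{
    // Calls the local-list provider from many threads and counts results
    // that are not exactly the five expected, non-null entries.
    public static int CountInvalidResults(int calls)
    {
        var discovered = new[] { "s1", "s2", "s3", "s4", "s5" };
        var provider = new LocalListProvider();
        var invalid = 0;

        Parallel.For(0, calls, _ =>
        {
            var services = provider.GetServices(discovered);
            if (services.Count != discovered.Length || services.Any(s => s is null))
            {
                Interlocked.Increment(ref invalid);
            }
        });

        return invalid;
    }
}
```

Swapping `SharedListProvider` into the demo is expected to produce invalid results or exceptions intermittently, since `List<T>` is not safe for concurrent writes; the exact symptoms vary from run to run.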

raman-m added a commit that referenced this issue Aug 7, 2024
…ices in `Kube` provider (#2111)

* Move the creation of the services list from the class field to the method, to prevent modification list from different threads

* Early return after data checking

* Add unit test for concurrent get list of services

* Add logging for invalid service configuration error in RoundRobin load balancer

* Code review by @raman-m

* Workaround for mistakes made during acceptance testing of load balancing versus service discovery, where tests designed for parallel requests were mistakenly executed sequentially. This resulted in load balancers being loaded by sequential `HttpClient` calls, which was a significant oversight.

* Let's DRY StickySessionsTests

* Add acceptance tests, but...
RoundRobin is not actually RoundRobin 😁 -> 😆

* Independent static indexing iterators per route via service names

* Stabilize `CookieStickySessions` load balancer.
Review tests after refactoring of `RoundRobin` load balancer

* Refactor Lease operation for load balancing.
Review LeastConnection load balancer

* Leasing mechanism in Round Robin load balancer

* Acceptance tests, final version

* Apply Retry pattern for K8s endpoint integration

* Fix IDE warnings and messages

* Follow suggestions and fix issues from code review by @ggnaegi

* Bump KubeClient from 2.4.10 to 2.5.8

* Fix warnings

* Final version of `Retry` pattern

---------

Co-authored-by: Raman Maksimchuk <dotnet044@gmail.com>
@raman-m raman-m added merged Issue has been merged to dev and is waiting for the next release and removed accepted Bug or feature would be accepted as a PR or is being worked on labels Aug 7, 2024
@raman-m
Member

raman-m commented Aug 7, 2024

Hello Roman,
The hotfix has been merged, and we are ready to begin the testing phase for this development version. As I understand, your project has not yet been moved to Production, but you do have a Staging/Testing environment.
Could you please compile the development version manually, update the project reference, and commence testing with logging enabled at the Debug level?
The official release of the hotfix will occur once the remaining tasks in the milestone are completed, which I anticipate will be by September'24.
So, please let me know the testing results in your environment.


⚠️ As I mentioned earlier, somewhere at the beginning of our discussion, the main issue here is modifying the list from different threads (clearing and adding). It doesn't depend on the number of elements, whether it's 1 or 5. If you look at the example I shared earlier, you'll see a lot of interesting behaviors when modifying from different threads. Specifically, for an example with 5 services:

While running acceptance tests in a local testing environment, I observed an increased likelihood of encountering 404/500 status errors as more services went offline. This issue may stem from the heavy load on the IKubeApiClient integration. I managed to stabilize the interaction between RoundRobin and Kube by implementing a Retry pattern within the Kube provider, which appears to have eliminated the 404 status errors when RoundRobin fails after receiving an empty list of services. However, Ocelot sometimes returns 500/502 status errors when the 'Run Until Failure' test command is executed in Visual Studio.

public void ShouldHighlyLoadOnUnstableKubeProvider_WithRoundRobinLoadBalancing(int totalServices, int totalRequests, int k8sGeneration)

The root cause remains unidentified, necessitating further testing in production environments, as my tests were only conducted locally. In conclusion, the Kube provider seems unstable under heavy loads with an error ratio of approximately 1/500, suggesting that the issue may still lie within the IKubeApiClient integration.
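
For readers unfamiliar with the idea, a retry on an empty result can be sketched roughly like this. It is a generic illustration with hypothetical names (`RetrySketch`, `GetWithRetryAsync`) and arbitrary default values, not the `Retry` implementation merged in #2111.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical helper, not Ocelot's Retry class.
public static class RetrySketch
{
    // Calls fetch up to maxAttempts times and returns the first non-empty
    // list. If every attempt comes back empty, the empty list is returned
    // and the caller decides how to handle it.
    public static async Task<List<T>> GetWithRetryAsync<T>(
        Func<Task<List<T>>> fetch,
        int maxAttempts = 3,
        TimeSpan? delay = null)
    {
        var wait = delay ?? TimeSpan.FromMilliseconds(50);
        var result = new List<T>();

        for (var attempt = 1; attempt <= maxAttempts; attempt++)
        {
            result = (await fetch()) ?? new List<T>();
            if (result.Count > 0)
            {
                return result;
            }

            // Pause before the next attempt, but not after the last one.
            if (attempt < maxAttempts)
            {
                await Task.Delay(wait);
            }
        }

        return result;
    }
}
```

A retry like this only hides transient empty responses; if the endpoint legitimately has zero pods, the caller still receives an empty list after the last attempt.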

P.S.

Consider switching to the PollKube provider, which reduces the load on the IKubeApiClient and the Kubernetes instance significantly.

@raman-m raman-m removed the Oct'24 October 2024 release label Aug 12, 2024
raman-m added a commit that referenced this issue Oct 3, 2024
…Blue Olympic Balumbes release

* #2084 Apply default config file paths in `GetMergedOcelotJson` when providing the `folder` argument of `AddOcelot` (#2120)

* Adding unit test first

* Fixing default global config file not being found in folder

* Adding PR trait to test

* Backing out whitespace changes

* Code review by @raman-m

* Create Configuration feature folder and move test classes

* Adjust namespace and review what we have

* Acceptance tests for #2084 user scenario

---------

Co-authored-by: Raman Maksimchuk <dotnet044@gmail.com>

* Bump Steeltoe.Discovery.Eureka from 3.2.5 to 3.2.8 in /src/Ocelot.Provider.Eureka (#2122)

* Bump Steeltoe.Discovery.Eureka in /src/Ocelot.Provider.Eureka

Bumps [Steeltoe.Discovery.Eureka](https://github.com/SteeltoeOSS/Steeltoe) from 3.2.5 to 3.2.8.
- [Release notes](https://github.com/SteeltoeOSS/Steeltoe/releases)
- [Changelog](https://github.com/SteeltoeOSS/Steeltoe/blob/main/Steeltoe.Release.ruleset)
- [Commits](SteeltoeOSS/Steeltoe@3.2.5...3.2.8)

---
updated-dependencies:
- dependency-name: Steeltoe.Discovery.Eureka
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

* Bump Steeltoe.Discovery.ClientCore from 3.2.5 to 3.2.8

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Raman Maksimchuk <dotnet044@gmail.com>

* #2110 Review load balancing and independent fetching the list of services in `Kube` provider (#2111)

* Downgrade the Warning to Information on missing `Content-Length` header in `MultiplexingMiddleware` (#2146)

* fix: downgrade the warning to information on missing content-length header

* chore: add route name to logs

* test: fixing multiplexing middleware tests

* Code review by @raman-m

---------

Co-authored-by: Paul Roy <paul.roy@astriis.com>
Co-authored-by: Raman Maksimchuk <dotnet044@gmail.com>

* Correct the broken link to the GraphQL sample's `README.md` (#2149)

Signed-off-by: Emmanuel Ferdman <emmanuelferdman@gmail.com>
Co-authored-by: Raman Maksimchuk <dotnet044@gmail.com>

* #2116 Escaping unsafe pattern values of `Regex` constructor ​​derived from URL query parameter values containing special `Regex` chars (#2150)

* regex escape handling for url templates

* refactored regex method to lamda version

* Quick code review by @raman-m

* added acceptance test for url regex bug

* moved acceptance test to routing tests

* Convert to theory: define 2 test cases

---------

Co-authored-by: Raman Maksimchuk <dotnet044@gmail.com>

* #2119 Review load balancing (2nd round) and redesign `DefaultConsulServiceBuilder` with `ConsulProviderFactory` refactoring to make it thread safe and friendly (#2151)

* Review tests

* History of Service Discovery testing: add traits

* LoadBalancer traits

* #2119 Steps to Reproduce

* Reuse service handlers of `ConcurrentSteps`

* Reuse service counters of `ConcurrentSteps`

* Add LoadBalancer namespace and move classes

* Move `Lease`

* Move `LeaseEventArgs`

* Analyze load balancers aka `ILoadBalancerAnalyzer` interface objects

* Prefer using named local methods as delegates over anonymous methods for awesome call stack, ensuring the delegate's typed result matches the typed balancer's creator. Additionally, employ an IServiceProvider workaround.

* Review load balancing. Assert service & leasing counters as concurrent step. Final version of acceptance test.

* Fixed naming violation for asynchronous methods: `Lease` -> `LeaseAsync`

* Fix ugly reflection issue of dymanic detection in favor of static type property

* Propagate the `ConsulRegistryConfiguration` object through `HttpContext` in the scoped version of the default service builder, utilizing the injected `IHttpContextAccessor` object.
Update `ConsulProviderFactory`.
Update docs.
Update tests.

* Add tests from clean experiment

* Final review of the tests

* Review `IHttpContextAccessor` logic.
Convert anonymous delegates to named ones in placeholders processing

* Tried to enhance more, but failed

---------

Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Emmanuel Ferdman <emmanuelferdman@gmail.com>
Co-authored-by: Ben Bartholomew <70723971+ben-bartholomew@users.noreply.github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Roman <61905975+antikorol@users.noreply.github.com>
Co-authored-by: Paul Roy <paul.achess.roy@gmail.com>
Co-authored-by: Paul Roy <paul.roy@astriis.com>
Co-authored-by: Emmanuel Ferdman <emmanuelferdman@gmail.com>
Co-authored-by: Finn <26823828+int0x81@users.noreply.github.com>
@raman-m
Member

raman-m commented Oct 8, 2024

Fixed in version 23.3.4
