
Local HTTP listener causing startup issues #1273

Closed
ConnorMcMahon opened this issue Mar 16, 2020 · 28 comments · Fixed by #1719

@ConnorMcMahon
Contributor

Tracking this information as an issue here from @olitomlinson. Quotes below from him:

I've got an open support case (120031225000535) because my DF app failed to start up for approx 30 minutes and then eventually corrected itself without intervention.

The error message during startup was:

Failed to bind to address http://127.0.0.1:17071: address already in use. Only one usage of each socket address (protocol/network address/port) is normally permitted

My non-scientific googling has brought me to this GitHub issue. I don't really know if this work item impacts my particular issue (it probably doesn't), but are you in a position to shed any light? No worries if not, I'll just keep following up with support to get to the root cause. Thanks in advance!

@ConnorMcMahon
Contributor Author

@olitomlinson

This is the port we are opening. We should be shutting it down properly, so it is concerning that you are seeing it stall your application on restart.

That being said, I find it interesting that you would even have this code starting up. We only try to turn it on for non-.NET apps, as it is only used in our out-of-proc language SDKs (i.e. JavaScript and soon Python).

To mitigate in the meantime, can you make sure your application has FUNCTIONS_WORKER_RUNTIME set to dotnet? That should prevent us from starting up any listener on this port. The other option that should help mitigate is to set localRpcEndpointEnabled to false in your host.json.
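For illustration, a minimal sketch of the first option (FUNCTIONS_WORKER_RUNTIME is the standard Functions app setting; the local.settings.json file only applies to local development, and in Azure you would add the same name/value as an application setting on the Function App):

{
  "IsEncrypted": false,
  "Values": {
    "FUNCTIONS_WORKER_RUNTIME": "dotnet"
  }
}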

@olitomlinson
Contributor

@ConnorMcMahon thanks for the swift reply!

Yes it appears that I don't have FUNCTIONS_WORKER_RUNTIME set in my configuration at all. I will apply it.

Btw, is this configuration mandatory? I guess I'm confused because it doesn't feel mandatory, but it kind of actually is?

@ConnorMcMahon
Contributor Author

I know it is something we flag as a warning in our internal tooling for investigating customer issues. That being said, I'm not sure if it is explicitly required, because we are often able to infer it based on the code we end up seeing.

In general, I think our tooling now sets this automatically when you select your language, but I may be incorrect about that. Assuming your function application has been around for a long time, it could date from before we started setting that by default during app creation in our tools.

@cgillum
Member

cgillum commented Mar 16, 2020

It just occurred to me that this listener behavior could be very problematic for apps that use slots because of how multiple instances may be running simultaneously on the same VM. We probably need to prioritize making the listener port selection dynamic like @anthonychu suggested to avoid these kinds of problems.
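To make that concrete, here is a rough sketch of one way to do dynamic selection (illustrative only, not the extension's actual code): bind to port 0 so the OS hands back a free loopback port, then give that port to the local RPC listener.

// Illustrative sketch only, not the Durable extension's implementation.
using System.Net;
using System.Net.Sockets;

public static class PortFinder
{
    public static int GetFreeLoopbackPort()
    {
        var probe = new TcpListener(IPAddress.Loopback, 0); // port 0 = let the OS pick a free port
        probe.Start();
        int port = ((IPEndPoint)probe.LocalEndpoint).Port;  // the port the OS assigned
        probe.Stop();                                       // caveat: another process could grab it before we re-bind
        return port;
    }
}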

@olitomlinson
Contributor

olitomlinson commented Mar 17, 2020

@cgillum @ConnorMcMahon We had this again last night, my fault for not getting the config update rolled out to prod in time.

However, this time we didn't get any traces from the host trying to start up, just exceptions. This Function App was therefore not consuming messages building up on Service Bus for the last 9 hours.

It seems weird that we've had 2 instances of this failure in the last few days. Has something changed in the underlying host environment that would make this issue more likely to happen?

@ConnorMcMahon
Contributor Author

This is being addressed in #1307.

ConnorMcMahon added this to the v2.2.1 milestone Apr 9, 2020
ConnorMcMahon self-assigned this Apr 9, 2020
@francescopersico

@cgillum @ConnorMcMahon
I have version 2.2.1 installed in my function app and I am using slots.

Just got the error:

Failed to bind to address http://127.0.0.1:17071: address already in use. Only one usage of each socket address (protocol/network address/port) is normally permitted. Only one usage of each socket address (protocol/network address/port) is normally permitted.

I am using NodeJS for my functions code.

@ConnorMcMahon
Contributor Author

@francescopersico,

Hmm, that is very curious.

Do you have an application name you can share with us publicly (or privately)?

Also, a rough timestamp in UTC of when this error occurred would be helpful.

@CastleArg

I am having this issue as well. A Node durable function app that has been running for months suddenly went dead in the water and cannot start due to a port-in-use error.

@cgillum
Member

cgillum commented May 29, 2020

@CastleArg does it stay in that state or does it recover after a minute or so?

@CastleArg

Nope, it can't start at all. I even deleted and redeployed the app with no result. Funnily enough, we have a prod environment set up in an identical way and that is still running.

2020-05-28T08:21:29.406 [Error] A host error has occurred during startup operation '629af924-5cfd-4642-9647-847b8124ac55'.
System.IO.IOException : Failed to bind to address http://127.0.0.1:17071: address already in use. ---> Microsoft.AspNetCore.Connections.AddressInUseException : Only one usage of each socket address (protocol/network address/port) is normally permitted. ---> System.Net.Sockets.SocketException : Only one usage of each socket address (protocol/network address/port) is normally permitted.
at System.Net.Sockets.Socket.UpdateStatusAfterSocketErrorAndThrowException(SocketError error,String callerName)
at System.Net.Sockets.Socket.DoBind(EndPoint endPointSnapshot,SocketAddress socketAddress)
at System.Net.Sockets.Socket.Bind(EndPoint localEP)
at Microsoft.AspNetCore.Server.Kestrel.Transport.Sockets.SocketConnectionListener.Bind()
End of inner exception
at Microsoft.AspNetCore.Server.Kestrel.Transport.Sockets.SocketConnectionListener.Bind()
at Microsoft.AspNetCore.Server.Kestrel.Transport.Sockets.SocketTransportFactory.BindAsync(EndPoint endpoint,CancellationToken cancellationToken)
at async Microsoft.AspNetCore.Server.Kestrel.Core.KestrelServer.<>c__DisplayClass21_0`1.<StartAsync>g__OnBind|0[TContext](??)
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at async Microsoft.AspNetCore.Server.Kestrel.Core.Internal.AddressBinder.BindEndpointAsync(ListenOptions endpoint,AddressBindContext context)
End of inner exception
at async Microsoft.AspNetCore.Server.Kestrel.Core.Internal.AddressBinder.BindEndpointAsync(ListenOptions endpoint,AddressBindContext context)
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at async Microsoft.AspNetCore.Server.Kestrel.Core.ListenOptions.BindAsync(AddressBindContext context)
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at async Microsoft.AspNetCore.Server.Kestrel.Core.Internal.AddressBinder.AddressesStrategy.BindAsync(AddressBindContext context)
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at async Microsoft.AspNetCore.Server.Kestrel.Core.Internal.AddressBinder.BindAsync(IServerAddressesFeature addresses,KestrelServerOptions serverOptions,ILogger logger,Func`2 createBinding)
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at async Microsoft.AspNetCore.Server.Kestrel.Core.KestrelServer.StartAsync[TContext](IHttpApplication`1 application,CancellationToken cancellationToken)
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at async Microsoft.AspNetCore.Hosting.WebHost.StartAsync(CancellationToken cancellationToken)
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at async Microsoft.Azure.WebJobs.Extensions.DurableTask.LocalHttpListener.StartAsync() at d:\a\r1\a\azure-functions-durable-extension\src\WebJobs.Extensions.DurableTask\LocalHttpListener.cs : 47
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at async Microsoft.Azure.WebJobs.Extensions.DurableTask.HttpApiHandler.StartLocalHttpServerAsync() at d:\a\r1\a\azure-functions-durable-extension\src\WebJobs.Extensions.DurableTask\HttpApiHandler.cs : 768
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at Microsoft.Azure.WebJobs.Extensions.DurableTask.DurableTaskExtension.StartLocalRcpServer() at d:\a\r1\a\azure-functions-durable-extension\src\WebJobs.Extensions.DurableTask\DurableTaskExtension.cs : 224
at Microsoft.Azure.WebJobs.Extensions.DurableTask.DurableTaskExtension.Microsoft.Azure.WebJobs.Host.Config.IExtensionConfigProvider.Initialize(ExtensionConfigContext context) at d:\a\r1\a\azure-functions-durable-extension\src\WebJobs.Extensions.DurableTask\DurableTaskExtension.cs : 200
at Microsoft.Azure.WebJobs.Host.DefaultExtensionRegistryFactory.Create() at C:\projects\azure-webjobs-sdk-rqm4t\src\Microsoft.Azure.WebJobs.Host\DefaultExtensionRegistryFactory.cs : 38
at Microsoft.Azure.WebJobs.WebJobsServiceCollectionExtensions.<>c.<AddWebJobs>b__1_0(IServiceProvider p) at C:\projects\azure-webjobs-sdk-rqm4t\src\Microsoft.Azure.WebJobs.Host\Hosting\WebJobsServiceCollectionExtensions.cs : 53
at DryIoc.Microsoft.DependencyInjection.DryIocAdapter.<>c__DisplayClass3_0.<RegisterDescriptor>b__0(IResolverContext r) at D:\a\1\s\src\WebJobs.Script.WebHost\DependencyInjection\DryIoc\DryIocAdapter.cs : 156
at DryIoc.Registrator.<>c__DisplayClass27_0.<RegisterDelegate>b__0(IResolverContext r) at D:\a\1\s\src\WebJobs.Script.WebHost\DependencyInjection\DryIoc\Container.cs : 4540
at lambda_method(Closure ,IResolverContext )
at DryIoc.Factory.<>c__DisplayClass26_0.<ApplyReuse>b__2() at D:\a\1\s\src\WebJobs.Script.WebHost\DependencyInjection\DryIoc\Container.cs : 6595
at DryIoc.Scope.TryGetOrAdd(ImMap`1 items,Int32 id,CreateScopedValue createValue,Int32 disposalOrder) at D:\a\1\s\src\WebJobs.Script.WebHost\DependencyInjection\DryIoc\Container.cs : 7839
at DryIoc.Scope.GetOrAdd(Int32 id,CreateScopedValue createValue,Int32 disposalOrder) at D:\a\1\s\src\WebJobs.Script.WebHost\DependencyInjection\DryIoc\Container.cs : 7824
at DryIoc.Factory.ApplyReuse(Expression serviceExpr,Request request) at D:\a\1\s\src\WebJobs.Script.WebHost\DependencyInjection\DryIoc\Container.cs : 6594
at DryIoc.Factory.GetExpressionOrDefault(Request request) at D:\a\1\s\src\WebJobs.Script.WebHost\DependencyInjection\DryIoc\Container.cs : 6554
at DryIoc.Factory.GetDelegateOrDefault(Request request) at D:\a\1\s\src\WebJobs.Script.WebHost\DependencyInjection\DryIoc\Container.cs : 6624
at DryIoc.DelegateFactory.GetDelegateOrDefault(Request request) at D:\a\1\s\src\WebJobs.Script.WebHost\DependencyInjection\DryIoc\Container.cs : 7730
at DryIoc.Container.DryIoc.IResolver.Resolve(Type serviceType,Object serviceKey,IfUnresolved ifUnresolved,Type requiredServiceType,Request preResolveParent,Object[] args) at D:\a\1\s\src\WebJobs.Script.WebHost\DependencyInjection\DryIoc\Container.cs : 289
at lambda_method(Closure ,IResolverContext )
at DryIoc.Factory.<>c__DisplayClass26_0.<ApplyReuse>b__2() at D:\a\1\s\src\WebJobs.Script.WebHost\DependencyInjection\DryIoc\Container.cs : 6595
at DryIoc.Scope.TryGetOrAdd(ImMap`1 items,Int32 id,CreateScopedValue createValue,Int32 disposalOrder) at D:\a\1\s\src\WebJobs.Script.WebHost\DependencyInjection\DryIoc\Container.cs : 7839
at DryIoc.Scope.GetOrAdd(Int32 id,CreateScopedValue createValue,Int32 disposalOrder) at D:\a\1\s\src\WebJobs.Script.WebHost\DependencyInjection\DryIoc\Container.cs : 7824
at DryIoc.Factory.ApplyReuse(Expression serviceExpr,Request request) at D:\a\1\s\src\WebJobs.Script.WebHost\DependencyInjection\DryIoc\Container.cs : 6594
at DryIoc.Factory.GetExpressionOrDefault(Request request) at D:\a\1\s\src\WebJobs.Script.WebHost\DependencyInjection\DryIoc\Container.cs : 6554
at DryIoc.Factory.GetDelegateOrDefault(Request request) at D:\a\1\s\src\WebJobs.Script.WebHost\DependencyInjection\DryIoc\Container.cs : 6624
at DryIoc.Container.ResolveAndCacheDefaultFactoryDelegate(Type serviceType,IfUnresolved ifUnresolved) at D:\a\1\s\src\WebJobs.Script.WebHost\DependencyInjection\DryIoc\Container.cs : 209
at DryIoc.Container.DryIoc.IResolver.Resolve(Type serviceType,IfUnresolved ifUnresolved) at D:\a\1\s\src\WebJobs.Script.WebHost\DependencyInjection\DryIoc\Container.cs : 194
at Microsoft.Azure.WebJobs.Script.WebHost.DependencyInjection.JobHostServiceProvider.GetService(Type serviceType,IfUnresolved ifUnresolved) at D:\a\1\s\src\WebJobs.Script.WebHost\DependencyInjection\JobHostServiceProvider.cs : 99
at Microsoft.Azure.WebJobs.Script.WebHost.DependencyInjection.JobHostServiceProvider.GetRequiredService(Type serviceType) at D:\a\1\s\src\WebJobs.Script.WebHost\DependencyInjection\JobHostServiceProvider.cs : 82
at Microsoft.Extensions.DependencyInjection.ServiceProviderServiceExtensions.GetRequiredService(IServiceProvider provider,Type serviceType)
at Microsoft.Extensions.DependencyInjection.ServiceProviderServiceExtensions.GetRequiredService[T](IServiceProvider provider)
at Microsoft.Azure.WebJobs.WebJobsServiceCollectionExtensions.<>c.<AddWebJobs>b__1_4(IServiceProvider p) at C:\projects\azure-webjobs-sdk-rqm4t\src\Microsoft.Azure.WebJobs.Host\Hosting\WebJobsServiceCollectionExtensions.cs : 87
at DryIoc.Microsoft.DependencyInjection.DryIocAdapter.<>c__DisplayClass3_0.<RegisterDescriptor>b__0(IResolverContext r) at D:\a\1\s\src\WebJobs.Script.WebHost\DependencyInjection\DryIoc\DryIocAdapter.cs : 156
at DryIoc.Registrator.<>c__DisplayClass27_0.<RegisterDelegate>b__0(IResolverContext r) at D:\a\1\s\src\WebJobs.Script.WebHost\DependencyInjection\DryIoc\Container.cs : 4540
at lambda_method(Closure ,IResolverContext )
at DryIoc.Factory.<>c__DisplayClass26_0.<ApplyReuse>b__2() at D:\a\1\s\src\WebJobs.Script.WebHost\DependencyInjection\DryIoc\Container.cs : 6595
at DryIoc.Scope.TryGetOrAdd(ImMap`1 items,Int32 id,CreateScopedValue createValue,Int32 disposalOrder) at D:\a\1\s\src\WebJobs.Script.WebHost\DependencyInjection\DryIoc\Container.cs : 7839
at DryIoc.Scope.GetOrAdd(Int32 id,CreateScopedValue createValue,Int32 disposalOrder) at D:\a\1\s\src\WebJobs.Script.WebHost\DependencyInjection\DryIoc\Container.cs : 7824
at DryIoc.Factory.ApplyReuse(Expression serviceExpr,Request request) at D:\a\1\s\src\WebJobs.Script.WebHost\DependencyInjection\DryIoc\Container.cs : 6594
at DryIoc.Factory.GetExpressionOrDefault(Request request) at D:\a\1\s\src\WebJobs.Script.WebHost\DependencyInjection\DryIoc\Container.cs : 6554
at DryIoc.Factory.GetDelegateOrDefault(Request request) at D:\a\1\s\src\WebJobs.Script.WebHost\DependencyInjection\DryIoc\Container.cs : 6624
at DryIoc.DelegateFactory.GetDelegateOrDefault(Request request) at D:\a\1\s\src\WebJobs.Script.WebHost\DependencyInjection\DryIoc\Container.cs : 7730
at DryIoc.Container.DryIoc.IResolver.Resolve(Type serviceType,Object serviceKey,IfUnresolved ifUnresolved,Type requiredServiceType,Request preResolveParent,Object[] args) at D:\a\1\s\src\WebJobs.Script.WebHost\DependencyInjection\DryIoc\Container.cs : 289
at lambda_method(Closure ,IResolverContext )
at DryIoc.Factory.<>c__DisplayClass26_0.<ApplyReuse>b__2() at D:\a\1\s\src\WebJobs.Script.WebHost\DependencyInjection\DryIoc\Container.cs : 6595
at DryIoc.Scope.TryGetOrAdd(ImMap`1 items,Int32 id,CreateScopedValue createValue,Int32 disposalOrder) at D:\a\1\s\src\WebJobs.Script.WebHost\DependencyInjection\DryIoc\Container.cs : 7839
at DryIoc.Scope.GetOrAdd(Int32 id,CreateScopedValue createValue,Int32 disposalOrder) at D:\a\1\s\src\WebJobs.Script.WebHost\DependencyInjection\DryIoc\Container.cs : 7824
at DryIoc.Factory.ApplyReuse(Expression serviceExpr,Request request) at D:\a\1\s\src\WebJobs.Script.WebHost\DependencyInjection\DryIoc\Container.cs : 6594
at DryIoc.Factory.GetExpressionOrDefault(Request request) at D:\a\1\s\src\WebJobs.Script.WebHost\DependencyInjection\DryIoc\Container.cs : 6554
at DryIoc.ReflectionFactory.CreateExpressionOrDefault(Request request) at D:\a\1\s\src\WebJobs.Script.WebHost\DependencyInjection\DryIoc\Container.cs : 7073
at DryIoc.Factory.GetExpressionOrDefault(Request request) at D:\a\1\s\src\WebJobs.Script.WebHost\DependencyInjection\DryIoc\Container.cs : 6544
at DryIoc.ReflectionFactory.CreateExpressionOrDefault(Request request) at D:\a\1\s\src\WebJobs.Script.WebHost\DependencyInjection\DryIoc\Container.cs : 7073
at DryIoc.Factory.GetExpressionOrDefault(Request request) at D:\a\1\s\src\WebJobs.Script.WebHost\DependencyInjection\DryIoc\Container.cs : 6544
at DryIoc.Factory.GetDelegateOrDefault(Request request) at D:\a\1\s\src\WebJobs.Script.WebHost\DependencyInjection\DryIoc\Container.cs : 6624
at DryIoc.Container.ResolveAndCacheDefaultFactoryDelegate(Type serviceType,IfUnresolved ifUnresolved) at D:\a\1\s\src\WebJobs.Script.WebHost\DependencyInjection\DryIoc\Container.cs : 209
at DryIoc.Container.DryIoc.IResolver.Resolve(Type serviceType,IfUnresolved ifUnresolved) at D:\a\1\s\src\WebJobs.Script.WebHost\DependencyInjection\DryIoc\Container.cs : 194
at Microsoft.Azure.WebJobs.Script.WebHost.DependencyInjection.JobHostServiceProvider.GetService(Type serviceType,IfUnresolved ifUnresolved) at D:\a\1\s\src\WebJobs.Script.WebHost\DependencyInjection\JobHostServiceProvider.cs : 99
at Microsoft.Azure.WebJobs.Script.WebHost.DependencyInjection.JobHostServiceProvider.GetService(Type serviceType) at D:\a\1\s\src\WebJobs.Script.WebHost\DependencyInjection\JobHostServiceProvider.cs : 77
at Microsoft.Extensions.DependencyInjection.ServiceProviderServiceExtensions.GetService[T](IServiceProvider provider)
at async Microsoft.Azure.WebJobs.Script.WebHost.WebJobsScriptHostService.UnsynchronizedStartHostAsync(ScriptHostStartupOperation activeOperation,Int32 attemptCount,JobHostStartupMode startupMo

@cgillum
Member

cgillum commented May 29, 2020

Thanks for the info.

Which version of the Durable extension are you using? The latest version is supposed to select a different port number based on availability.

In any case, we added a kill switch to this feature just in case it caused problems in unexpected scenarios. You can disable it by setting localRpcEndpointEnabled to false in host.json.

Here is an example:

{
  "version": "2.0",
  "extensions": {
    "durableTask": {
      "localRpcEndpointEnabled": false
    }
  }
}

Try that and let us know if it resolves the issue. The side effect of this is to revert the durableClient to the old behavior of invoking the external-facing management APIs instead of using the internal ones on the local machine. In most cases, you should only notice a slight performance degradation for durableClient API calls.

@CastleArg

Thanks, will give this a try.
These are function runtime V2 apps. Should I update to runtime v3?
I'm simultaneously burnt by not being able to run this locally, as I have Node 12 installed.

@CastleArg

I am using extension bundles like so:

{
  "version": "2.0",
  "extensionBundle": {
    "id": "Microsoft.Azure.Functions.ExtensionBundle",
    "version": "[1.*, 2.0.0)"
  }
}

@CastleArg

@cgillum setting "localRpcEndpointEnabled": false allowed it to start again. Thanks for the quick advice.

@kepikoi

kepikoi commented May 30, 2020

This bug still affects my production Durable Functions V3 instances! Earlier today all backend operations died because some other function app on my App Service plan randomly grabbed port 17071, causing my production Functions instance to die / not be able to start. I've been seeing this behavior on Azure since May 28.

@ConnorMcMahon
Contributor Author

@kepikoi, did you try disabling the feature in your application as recommended above?

The problem for apps using Node is that you are using extension bundles, which operate on v1.x of the extension. This feature was introduced in v1.8.5 of the extension, which likely rolled out in extension bundles recently.

The fix for this is currently only in v2 of the extension. You can manually install v2 of the extension by dropping extension bundles and installing via the CLI, or, when version 2 of the extension bundles rolls out in the near future, you can update to that.
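For reference, a rough sketch of the manual-install route (the version shown is just an example, check NuGet for the current 2.x release): remove the extensionBundle section from host.json and, from the project folder, run the Core Tools command

func extensions install --package Microsoft.Azure.WebJobs.Extensions.DurableTask --version 2.2.1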

I am reopening the issue until we port this fix into v1 of the extension and include it in the extension bundle release.

ConnorMcMahon reopened this May 30, 2020
@kepikoi

kepikoi commented Jun 2, 2020

@ConnorMcMahon I am up and running again using

"extensions": {
    "durableTask": {
      "localRpcEndpointEnabled": false
    }
  }

Good to know that manual extension install fixes it. Looks like I need to move away from extension bundles to regain control over my environments

@anthonychu
Member

@ConnorMcMahon @cgillum Since getting an updated bundles release out has a bit of lead time, wondering if we can prioritize backporting this to v1 so we can get it out ASAP.

@ConnorMcMahon
Contributor Author

The backport is merged and I am hoping to release 1.8.6 today, and update the extension bundles repo so it will go out on the next train.

@vesper2000

Is there any update on this? When can we expect it?

@jawa-the-hutt

Just wanted to comment that I experienced this issue tonight. Durable Function running on Node. Added the localRpcEndpointEnabled config and it fixed it.

Wondering if the backport to v1 was released at some point or not. A second, ancillary question: will Node.js Durable Functions be able to run on the 2.x extension at some point?

@ConnorMcMahon
Contributor Author

The backport should definitely be in v1 of the bundles at this point. @jawa-the-hutt, do you have an app name and timestamp?

As for running v2 of the extension, you should be able to do that easily now with extension bundles v2. V2 of the extension bundles uses v2.x of the extension.
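For example, a host.json extensionBundle section pinned to the v2 bundle range looks roughly like this (same shape as the v1 snippet above, only the version range changes):

{
  "version": "2.0",
  "extensionBundle": {
    "id": "Microsoft.Azure.Functions.ExtensionBundle",
    "version": "[2.*, 3.0.0)"
  }
}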

@jawa-the-hutt

@ConnorMcMahon

App Name: mea-browserless-fa-testsuite

The issue first came up when we auto-deployed code to the dev slot. As part of the troubleshooting, I swapped slots and the issue followed to the main slot. At that point I redeployed code and it resolved itself for a while, but reoccurred later so I stopped to do further troubleshooting and came across this issue and the fix for it.

Here's a list of the approximate GMT times based on the graph below. I think these are the times, but I'm not 100% sure. Regardless, the overall timeframe between 12:27 AM GMT and 4:30 AM GMT is when I was working last night and was experiencing issues.

  • 12:27 AM
  • 12:33 AM
  • 12:48 AM
  • 12:57 AM
  • 1:20 AM
  • 1:26 AM
  • 1:35 AM
  • 1:44 AM
  • 1:52 AM
  • 2:08 AM
  • 2:20 AM
  • 2:52 AM
  • 3:01 AM
  • 3:22 AM
  • 3:26 AM
  • 3:31 AM
  • 3:35 AM
  • 3:39 AM
  • 4:15 AM
  • 4:28 AM

[image: graph of the error occurrences over the timeframe above]

@ConnorMcMahon
Contributor Author

@jawa-the-hutt

Sincere apologies for the delay here. For some reason your application was still on 1.8.5, instead of 1.8.6, which has the fix for this issue.

There was some internal issue with extension bundles regressing this version of the extension during the fall, but this should have been fixed by January when you saw this. Unfortunately I don't have enough telemetry to identify why this happened.

I noticed you switched to extension bundles v2 at some point, and that is our recommendation for all customers at this point, as it gets fixes/features much quicker.

@jawa-the-hutt

jawa-the-hutt commented Feb 19, 2021

@ConnorMcMahon Coming back to this as we are still getting this error with the v2 bundle, as far as I can tell. Just ran some things through the Function App at around 2021-02-19T13:01:02 today and the logs were filled with it every time it went to scale. Want to make sure we're not missing a config or setting on our side that would help with this.

Would also mention that due to some other issues, we redeployed a couple of weeks ago on a different app name: mea-browserless-fa-startTests

also, not sure if it's related but as things start to scale down, we get something like this each time: starttests-applease: RD0003FFA96AE6 failed to release its lease due to a conflict

@ConnorMcMahon
Contributor Author

I'm taking a look, and it looks like there is still a somewhat sizeable gap between where we determine which ports are available and when we start listening on those ports. That means the fix in v2.2.1 only reduced the surface area of this issue; it did not completely eliminate the possibility. In v2.4.2 we will aim to close this gap entirely and fully fix the issue.

In the meantime, if you are experiencing this issue, the only solution is unfortunately to either stagger the startup of your slots or to disable localRpc.
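To make the idea of closing that gap concrete, here is a rough illustrative sketch (not the actual extension code): instead of probing for a free port and binding to it later, the bind itself is retried across the candidate port range, so the availability check and the listen are one operation.

// Illustrative sketch only: retry the bind across a port range so there is no
// window between "port looks free" and "we actually listen on it".
using System;
using System.Net;
using System.Net.Sockets;

public static class RpcPortBinder
{
    public static TcpListener BindFirstAvailable(int firstPort, int lastPort)
    {
        for (int port = firstPort; port <= lastPort; port++)
        {
            var listener = new TcpListener(IPAddress.Loopback, port);
            try
            {
                listener.Start();  // succeeds only if this port is genuinely free right now
                return listener;   // caller keeps the already-bound listener
            }
            catch (SocketException)
            {
                // taken by another slot/instance on the same VM; try the next port
            }
        }
        throw new InvalidOperationException("No free port in the configured range.");
    }
}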

@ConnorMcMahon
Contributor Author

This will be released in v2.4.2 of the extension this week. It will take some time to deploy to extension bundles.

The fix for v1 (and v1 extension bundles) is tracked at #1723.
