Scale controller and long running activity functions issue #353
Comments
Some comments:
Hi Simon, My understanding was that the 10-minute limit does not apply to Durable scenarios. In one of my apps I have ActivityTrigger functions that run for several hours, and it works fine. I'm OK with it being a strict limit, but as far as I can see it isn't one. So I think this case (activity function restarts) should be documented. Thanks,
Simon is correct. The scale controller was not designed to support long-running stateless function execution of any kind (whether Durable Functions, timer triggers, queue triggers, etc.). The fact that you're getting 2-3 hour executions is unfortunately a bug in the function host. I don't know the current status of that bug, but I believe it applies to precompiled functions (and fixing it now would likely cause too much grief for people who've started depending on it).
In the possibly near future, we're considering supporting long-running executions for Azure Functions. That would allow your functions to execute for long periods of time and not be killed so aggressively by the scale controller. However, there would still be a chance that the VM you are running on gets reclaimed by Azure for maintenance purposes, so it would be good to write your code defensively to handle those cases regardless of timeout constraints.
Until these changes happen to the Azure Functions consumption plan infrastructure, however, what you're observing is unfortunately the expected behavior. If you have more thoughts or concerns about long-running executions in general, I recommend raising them in the Azure Functions GitHub repo so you can get better feedback (you may even find some existing issues on this topic).
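To make "write your code defensively" a bit more concrete, here is a minimal, framework-agnostic sketch of one common approach: split the long job into small items and persist a checkpoint after each one, so a restarted execution resumes where the previous one stopped instead of starting over. Nothing here is Azure Functions API. The names `process_items`, `Interrupted`, `fail_after` and the JSON checkpoint file are all hypothetical, and the local file only stands in for durable storage that would survive the instance being stopped.

```python
import json
import os
import tempfile


class Interrupted(Exception):
    """Hypothetical stand-in for the host being stopped mid-execution."""


def process_items(items, checkpoint_path, handle, fail_after=None):
    """Process items in order, saving progress after each one.

    Returns the number of items handled in this run. `fail_after` exists
    only to simulate the instance being stopped after that many items.
    """
    start = 0
    if os.path.exists(checkpoint_path):
        # A previous run got this far; resume from the saved position.
        with open(checkpoint_path) as f:
            start = json.load(f)["next_index"]

    done_this_run = 0
    for i in range(start, len(items)):
        if fail_after is not None and done_this_run >= fail_after:
            raise Interrupted(f"stopped before item {i}")
        # `handle` should be idempotent: a stop between this call and the
        # checkpoint write below means the item is handled again on restart.
        handle(items[i])
        # Write to a temp file and rename so the checkpoint is never half-written.
        tmp_path = checkpoint_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump({"next_index": i + 1}, f)
        os.replace(tmp_path, checkpoint_path)
        done_this_run += 1
    return len(items) - start


if __name__ == "__main__":
    checkpoint = os.path.join(tempfile.mkdtemp(), "checkpoint.json")
    handled = []
    try:
        process_items(list(range(5)), checkpoint, handled.append, fail_after=2)
    except Interrupted:
        pass  # simulated instance shutdown
    # The "restarted" run picks up at item 2 rather than item 0.
    process_items(list(range(5)), checkpoint, handled.append)
    print(handled)
```

With this shape, a restart costs at most one repeated item rather than the whole run, which is why `handle` is assumed to be idempotent.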
Hi,
Please check the following scenario:
So far it looks good: the environment gradually scaled out to 40-50 instances and is processing all Func_act_A as expected. It takes about 1 hour to complete.
While the scenario above is running, another function, Func_B, is triggered via a queue:
At this point the scale controller started Func_act_B on one of the instances (let's say INST_25) created earlier for Func_act_A. That's OK.
But when Func_A completed, the scale controller started scaling down.
It turns off running instances:
At this point it does not matter whether the instance is under a stable 90% CPU load. It just stops the instance in the middle of the Func_act_B run and stops the host.
Then the environment finds that Func_act_B needs to be restarted and starts it on INST_15, then on INST_7, etc.
So Func_act_B keeps restarting until it is started on a stateful instance (it's probably always the same instance per function app).
As a result, it consumed a lot more resources and time to complete.
So I think there is an issue with the scale controller: it stops instances without taking into account the functions running on them.
Thanks,
Alex