Durable Function storage usage when idle #391
Comments
I assume you are running this function app in the Consumption plan? When using the Consumption plan, behind the scenes our scale controller component will poll each queue on a 10-second interval. The poll consists of both a metadata query and a peek, though I would need to double-check why it would do a peek if the metadata claims that the length is zero.

If you have 8 partitions but only four queues are being scanned, then it's possible that you initially created the function app with four partitions (the default) and the scale controller has not been notified of your change to 8 partitions. If you click the "refresh" button in the Azure portal, does that cause you to see that all queues are being monitored?

There are some blob operations as well for managing leases. That's likely what you're seeing. I would expect them to be for other blobs and not for taskhub.json, though. I'll need to double-check that as well.
Thanks for the response, and yes, Consumption plan. Ah, the scale controller; I wondered what was doing the checking, and that makes sense. It would be good if you could clarify the metadata + peek element. I restarted it and I can see it polling all 8 queues now.

More of a feature request, but given we will be running a lot of different functions, the idle storage costs will soon grow. Could there be a way for, say, the Scale Controller to stop polling if there's no change for X minutes (maybe a setting in host.json), and then have the Orchestrator issue some sort of WakeScaleController call? That would save the Scale Controller the time and resources of polling something that might sit idle for days, and also reduce the costs on the storage account.
To clarify further, there are two different sources of polling:

1. The Durable Task extension itself, which polls the control and work-item queues with a back-off that grows up to a maximum wait time when the queues are idle.
2. The Azure Functions scale controller, which polls each queue on roughly a 10-second interval to make scale decisions.
I'm working on a change that will increase the runtime's maximum wait time from 10 seconds to 30 seconds. That will primarily help people who are using App Service plans, and it can be included in the next release, which should hopefully be out in a week or two.

Fixing the scale controller logic is a bit more complicated architecturally, because it has very little information about what the application is actually doing. We could look into increasing its maximum delay to 30 seconds, but simply increasing the polling time could cause problems for people who depend on durable timers being triggered accurately, and I wouldn't feel comfortable going beyond 60 seconds.
Hi @cgillum, we are also experiencing a similar if not the same issue. We have 170 mostly idle consumption plan functions, each with their own. I confirmed that all 170 functions are using the 1.7.0 runtime, which fixes #508.

The polling and costs seem high, but if we take the polling interval of every 10 seconds for each queue as you mention, this looks to be in the ballpark. Is there an option to set the polling frequency, perhaps? As you mention above, there was a possibility it could be increased to 30 seconds? Perhaps using V1 storage might be an option to further reduce these costs for us.
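For a rough sense of scale (assuming the default of 4 partitions per app): that's 4 control queues plus a work-item queue per app, so 5 queues hit with 2 operations every 10 seconds, or roughly 86,000 queue transactions per app per day, i.e. on the order of 15 million per day across 170 apps, before counting any blob lease traffic.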
@gorillapower Indeed, I could see how that might be a problem. V2 storage accounts are nice because of the faster performance, but the costs can be quite high from what I've observed. We don't have an option to set the scale controller polling frequency, but it's something we could look into adding as a way to help reduce storage costs.
Hi, coming back to this after a while and glad that things have improved. I've not been running the function in question for a while, as I had to shelve that piece of work, but I'm coming back to it shortly.

For things we want to scale quickly/responsively, 10s is good, but I appreciate that if you're idle 80% of the day it's not so good. What would be ideal is if it were something we could set programmatically within a reasonable range, say 10s to 60s, so when the orchestration function starts it could change the polling interval to 10s and back when it's done; or something to that effect?

@gorillapower Incidentally, when I moved to V1 storage the costs dropped significantly and I didn't notice any tangible change in performance, but that is somewhat anecdotal!

Si
UPDATE: as of the v1.8.0 release, the max polling delay in the runtime is now configurable. Not yet in the scale controller, though.
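As a sketch, the runtime's maximum polling delay is set in host.json via the Azure Storage provider's maxQueuePollingInterval value (an hh:mm:ss string). With the Functions 1.x host.json schema it sits directly under durableTask; on later runtimes it nests under extensions.durableTask (and under storageProvider in Durable Functions 2.x):

```json
{
  "durableTask": {
    "maxQueuePollingInterval": "00:00:30"
  }
}
```

Raising this only slows the app's own queue polling back-off while idle; the scale controller's 10-second polling is unaffected, as noted above.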
It does not help; I still have BlobLease calls every 10 seconds.
The fix was specific to queue polling. It doesn’t cover blob leases. For that you’ll want to consider deploying to a new app or task hub with a reduced partition count. That will reduce the number of blob leases (and queues), further decreasing background storage transactions. |
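For illustration, a minimal host.json for a new task hub with fewer partitions might look like the sketch below (the hub name here is hypothetical; partitionCount accepts 1-16 and only takes effect for a task hub that hasn't been created yet):

```json
{
  "durableTask": {
    "hubName": "MyTaskHubV2",
    "partitionCount": 2
  }
}
```

Fewer partitions means fewer control queues and fewer partition blob leases, which is where the remaining idle transactions come from.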
Hi
I have a queue-based orchestrator; the queue I'm polling is on another storage account, and I have a storage account solely for the function.
Looking at a 24-hour period when I submitted nothing to the function, it was using approx. £0.16 in storage costs.
I turned on the logging and can see that every 10 seconds there are multiple (10) hits on the queues. For each control and workitem Q I see 1 PeekMessage and 1 GetQueueMetadata. I'd have thought the metadata would say there are no messages so the Peek wouldn't be necessary?
Also interesting is that I have 8 partitions configured so there are 8 control Qs, 0-7, but it's only looking at 0-3 every 10s?
The blob storage is getting hit with multiple (3) blob requests for taskhub.json (2 x GetBlobProperties and 1x GetBlob).
I can supply more info if it helps but this means even an idle function will result in approx. £4.87 in storage costs each month (using the cheapest v1 storage).
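As a rough check on that figure: 10 queue hits plus 3 blob requests every 10 seconds is about 13 operations per 10 seconds, i.e. roughly 112,000 storage transactions a day (around 3.4 million a month), rising towards 5.4 million a month once all 8 control queues are being polled, which is the right order of magnitude for the costs above.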
I understand it needs to poll the Qs for new work but just wanted to check this level of polling and blob reading is to be expected, correct and not excessive.
Cheers
Si