Scalability limits in the Resource Processor #4293
Comments
Hi @TonyWildish-BH, there are some scenarios where parallel operations hit transient errors - #3177 (comment). When it's an Azure platform issue, all we can do is try to work around it by blocking parallel operations or throttling. If you can reproduce the scenario and provide the specific errors, then we can look at next steps. Thanks.
Here's an example. This was with the resource-processor pool upped to 5 instances, and attempting to create 25 workspaces in parallel. The first succeeded; all the others failed with this same error:
OK, looks like you are hitting the "Storage account management operations (list) | 100 per 5 minutes" limit, which applies per subscription/region. It might be happening since we switched to using AD auth for Terraform, as access keys are disabled in many environments because they are not as secure (they can be shared). Looks like it's being worked on here - Azure/terraform-provider-azapi#691. If deploying this many workspaces in succession is a requirement for you, I suggest contributing to the issue above, as when it gets resolved we can pull in the fix.
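To make the "100 per 5 minutes" figure concrete, here is a minimal sketch of a client-side sliding-window limiter that would keep a caller under such a budget. It is an illustration only, not something the Resource Processor or Terraform does today. The class name `SlidingWindowLimiter` and its parameters are hypothetical, and it assumes all calls pass through a single process; with several Resource Processor instances the budget would have to be shared between them.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_calls` calls within any `window_s`-second window.

    Hypothetical helper: defaults mirror the "100 per 5 minutes" limit
    quoted above, but nothing here talks to Azure.
    """

    def __init__(self, max_calls=100, window_s=300.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.window_s = window_s
        self._clock = clock      # injectable so the limiter can be tested
        self._sleep = sleep
        self._stamps = deque()   # timestamps of calls still inside the window

    def acquire(self):
        """Block until one more call fits in the window, then record it."""
        while True:
            now = self._clock()
            # Drop timestamps that have aged out of the window.
            while self._stamps and now - self._stamps[0] >= self.window_s:
                self._stamps.popleft()
            if len(self._stamps) < self.max_calls:
                self._stamps.append(now)
                return
            # Wait until the oldest recorded call leaves the window.
            self._sleep(self.window_s - (now - self._stamps[0]))
```

A caller would invoke `limiter.acquire()` immediately before each storage account list operation, so bursts are delayed rather than rejected.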
Description
In my scalability tests last year, I ran a script that attempted to create dozens of resources - workspaces in this case. With the Resource Processor at the default setting of a max pool size of one, my workspaces were all created, but of course I had to wait a long time, as only 5 processes were running at a time.
I tried enlarging the Resource Processor pool to see if I could create more resources in parallel. I just went into the Azure portal and manually increased the pool size from max 1 to max 4, then re-ran my tests. I saw that it did indeed try to create 20 workspaces in one go, but it failed with Terraform errors: the APIs were being throttled by Azure, and resources were left in a bad state. Unfortunately, I no longer have the logs, so I can't give the precise message. However, I do recall that Terraform was not handling the throttling well.
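For reference, "handling the throttling well" usually means retrying the throttled call with exponential backoff rather than failing outright. The sketch below shows that general pattern; it is not how Terraform or the Resource Processor is implemented. `ThrottledError` and `retry_with_backoff` are hypothetical names, and the delays are illustrative assumptions rather than values taken from Azure.

```python
import random
import time


class ThrottledError(Exception):
    """Hypothetical stand-in for a throttling response (for example HTTP 429)."""


def retry_with_backoff(operation, max_attempts=5, base_delay_s=2.0,
                       max_delay_s=60.0, sleep=time.sleep, rng=random.random):
    """Call `operation` until it succeeds or `max_attempts` is reached."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except ThrottledError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the throttling error
            # Exponential backoff capped at max_delay_s, scaled by a random
            # factor in [0, 1) so parallel workers do not retry in lockstep.
            delay = min(max_delay_s, base_delay_s * 2 ** (attempt - 1))
            sleep(delay * rng())
```

The random factor matters in this scenario: 20 workspaces that are throttled together would otherwise all retry at the same moment and hit the limit again.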
Anyway, my question is: How can I increase the parallelism of the Resource Processor without Terraform falling over?
This likely requires two things:
Steps
The steps I have tried are:
Use the `tre` CLI to create 20-30 workspaces in a tight loop, using `--no-wait` so they queue.
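A tight submission loop of this kind could be sketched as follows. Only the `tre` CLI and the `--no-wait` flag come from the steps above; the rest of the command line is an assumption and should be checked against `tre --help` for your version. `build_commands` and `submit_all` are hypothetical helper names.

```python
import subprocess


def build_commands(count, base_cmd, no_wait=True):
    """Return `count` copies of `base_cmd`, each with --no-wait appended."""
    cmds = []
    for _ in range(count):
        cmd = list(base_cmd)
        if no_wait:
            cmd.append("--no-wait")  # return immediately so requests queue up
        cmds.append(cmd)
    return cmds


def submit_all(cmds, run=subprocess.run):
    """Run each command in turn and return how many exited non-zero."""
    failures = 0
    for cmd in cmds:
        result = run(cmd, check=False)
        if result.returncode != 0:
            failures += 1
    return failures


# Example usage (assumed subcommand and arguments, not confirmed above):
# base = ["tre", "workspaces", "new", "--definition-file", "workspace.json"]
# print(submit_all(build_commands(25, base)), "submissions failed")
```

Because each call returns as soon as the request is accepted, the loop finishes quickly and the Resource Processor pool determines how many deployments actually run in parallel.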