Intermittent Error: Unable to locate Storage Account when RAGRS/RAZGRS account kind is used #15048
Comments
@andrey-moor I've applied the configuration as you provided above (a typo in the |
Forgot to mention that the issue was intermittent and reproducible only via GH Actions. It may be down to Azure portal replication etc., as GH hosts runners in multiple locations and the first available runner is selected for the pipeline. East US (eastus). Similar behaviour is affecting the KeyVault, but with much less frequency. Let me see if I can reproduce the issue and capture the debug data via a standalone pipeline; we have switched to ZGRS accounts to unblock the delivery, so the original config is not available. |
I'm seeing this on creation:
with 2.93.1 with ZGRS type. Related: #5299 (comment) |
@magodo sorry took me a while, I have captured the trace debug of the problem. |
suspect it is related to the call which comes back empty - https://docs.microsoft.com/en-us/rest/api/storagerp/storage-accounts/list
|
It seems the workaround is to create a tag on the storage account (via the portal); after that the issue goes away and the above API returns the value. What's more, it started to return values for all the storage accounts in that subscription which were having the issue... |
The workaround came from #11059, as we've seen a similar but less persistent issue with the Key Vault. The root cause is highly likely to be the same for both of those issues. |
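A minimal sketch of the tag workaround expressed in configuration rather than through the portal. This is an assumption, not something the thread confirms: the tag key and value are hypothetical, and the reports above only describe tagging via the portal. Declaring the tag in Terraform at least keeps a later apply from removing a tag that was added by hand. Other comments in this thread report that tagging did not help reliably.

```hcl
# Sketch only: the tag name/value below are hypothetical placeholders.
# The thread reports that adding any tag made the list API return the account.
resource "azurerm_storage_account" "sa" {
  name                     = var.storage_account_name
  resource_group_name      = azurerm_resource_group.rg.name
  location                 = var.location
  account_tier             = "Standard"
  account_replication_type = "RAGRS"

  tags = {
    # Hypothetical tag; exists only to force an update on the ARM side.
    cache-refresh = "1"
  }
}
```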
@andrey-dubnik Just to be sure, the |
This is correct: Terraform was able to obtain the keys and all the data for the account, but the list API returned blank, hence the "account can't be found" error. There was another account in the subscription which was having the issue originally, so in total there are 2 accounts in there. After tagging at least one, the second account also appeared in the API call. |
I can add that tagging a storage account worked on my side as well. My storage account was created like this:
resource "azurerm_storage_account" "sa" {
  name                     = var.storage_account_name
  location                 = var.location
  account_tier             = "Standard"
  account_replication_type = "GRS"
  resource_group_name      = azurerm_resource_group.rg.name
}
It worked 1 out of 8 times, or something, before adding a manual tag. Now it seems stable.
Edit: Nope, it is not stable at all with the tag thing either. |
This issue is also causing sleepless nights on our side here. What I can say so far: most likely this issue is caused by some race condition hitting some sort of ARM API limit. We're experiencing this on a Terraform project with around 100 resources (1 storage account, 1 key vault, a lot of role assignments to them) and I'm not able to reproduce the issue in a smaller project. However, calling Furthermore:
|
@roehrijn, I have tried with parallelism 1 now, and it does not resolve the issue, at least when it comes to storage accounts. Maybe that workaround only works for key vaults? |
Hi @mariussm, it also works for storage accounts in my environment. However, as I wrote, this is unfortunately likely to be some sort of race condition in rate limiting. That's why I think parallelism=1 is not a 100% fix/workaround. Hope MS is going to address this soon. |
@roehrijn / @andrey-dubnik - we've experienced this issue over the past month or so: We found that a call to the We have an open case with the ARM API team, and so far they've confirmed that it's an issue with the ARM API cross-region cache not being updated quickly enough. In terms of fixes / workarounds - they're currently a bit limited:
cc: @stuartleeks |
Using tags as a workaround has worked so far, and the portal cache was replicated. There have been no recurrences of the issue since tagging. Since this is within the provider/API scope, there is no way to influence it externally. If this is a replication lag and not a permanent issue, then adding retry logic would probably help mitigate it: worst case it would be a 1 min SLO as opposed to an error, which is already good enough. If tagging permanently fixes the issue, maybe the API team can use this in the replication fix... |
Hello. I have also experienced the same intermittent issue, i.e. "Unable to locate Storage Account", for a GRS storage account. It is working for me after a resync of the ARM cache. Thanks |
I got this issue also. |
We're experiencing this problem in GitHub actions too.
|
I am experiencing the same problem on my local machine. Here is the repo. I run these commands, and I encounter this specifically when I run the destroy commands:
terraform plan -destroy -out main.destroy.tfplan
terraform apply main.destroy.tfplan
The following shows the plan command. I get that error when I run the command for the first time. When I re-run the same command a second time, things run fine. The same is the case with the apply command as well. |
When using GH Actions in combination with AzureRM and account_replication_type LRS, you have the same problem. When I run the same terraform apply locally, I don't have any issues. Seems related to GH Actions. azurerm version 3.12.0
|
Using azurerm 3.16.0, storage account type LRS, this happens fairly frequently. |
@ekristen and @paalders could you please help confirm that the storage account list API (i.e. |
Same problem in my case: Terraform v1.2.6. It does not matter whether the SA is LRS or ZRS; 7 times out of 10 I get the error: "Unable to locate Storage Account". |
Same issue for me on GitHub Actions + Terraform Cloud, also with the LRS replication type. I can see the account on the Azure dashboard, and on my local machine the command line tool returns the correct list. Terraform v1.2.6 |
For me this kept happening when I deployed even a completely new set of resources to new rg. After deploying the storage accounts, the |
Still facing this issue intermittently on Storage Account Standard_LRS StorageV2. Terraform version: 1.7.4 Error on terraform apply Error on terraform destroy |
I got this issue today, while deploying new resources into a clean subscription. |
Same issue for me on two storage accounts with an account replication type of "GRS" Terraform version: 1.7.5 |
Same issue with type of "LRS" Terraform version: 1.7.5 |
Same here.
It worked in one run, but it fails the majority of the time.
The PowerShell equivalent is working fine, though.
Seems the unreleased version 3.99.0 has some fixes around storage account? https://github.com/hashicorp/terraform-provider-azurerm/blob/main/CHANGELOG.md#3990-unreleased Anyone know when this could be released? |
Womp womp. Looks like 3.99.0 was released last night, but it doesn't fix the issue. Still seeing
|
We got this error with 3.99 today. But our issue was, that we had setup AzureAD Authentication for the Storage Account |
I got this using
|
A lot of people have been running into this issue, and for a long time. Anyone know how to summon this repo's gods? I tried to take a look at the code base, but it's in Go and I don't know that language yet. @manicminer sorry to ping you directly, I saw your name as the approver on the latest closed PRs. Can you help summon this repo's gods please |
So ultimately this is an Azure problem. Their API sucks. They decided it's better to be fast than accurate and have taken eventual consistency to heart: it will be eventually consistent, but that could be 30 secs or 10 hours. HOWEVER, I believe the terraform provider could be better!! It could stub state, it could not error on 404 Not Found, or it could try harder to wait, knowing that the Azure API is terrible. I'm a firm believer that unless it's a direct 409 conflict error, state should be stubbed with Azure because of its predictable conflicting naming conventions. I also think that, given its eventual consistency, the provider must try harder, wait longer, or stub state for a subsequent run, and not error on things like this. |
I encountered this issue a few days ago when upgrading AzureRM from 2.90 to 3.107. I have tried the tag workaround and it does nothing for my problem. I am testing the change in a sandbox environment; in the dev environment everything is working perfectly. Sandbox environment
Dev environment
storage block
|
I am not saying that this is the solution, but, I noticed that when I ran So, I explicitly added the tenant and subscription ids to the setup of the azurerm and it seems to be working for me now. It would be good if someone else could make the test on their side to see if this consistently fixes the issue.
Update: |
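A minimal sketch of what pinning the tenant and subscription in the provider setup could look like, as described in the comment above. The variable names are assumptions for illustration; `subscription_id` and `tenant_id` are existing arguments of the azurerm provider block. Whether this consistently avoids the error is unconfirmed in this thread.

```hcl
# Hypothetical input variables; supply the values via tfvars or TF_VAR_* env vars.
variable "subscription_id" {
  type = string
}

variable "tenant_id" {
  type = string
}

provider "azurerm" {
  features {}

  # Set explicitly instead of relying on whatever the CLI/runner context resolves.
  subscription_id = var.subscription_id
  tenant_id       = var.tenant_id
}
```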
Just to add a note: we are seeing this with azurerm at 3.98.0. Can we get an update, @tombuildsstuff, as it's been a couple of months since the last one :). |
In the next provider release, we'll be updating the SDK used for storage accounts to use hashicorp/go-azure-sdk. Once that's been released, and we've addressed any potential regressions, we can take another look at this issue. |
Any updates? Still facing this today |
👋 - Just hit this (again) today. Any update 🙏 ? |
Community Note
Terraform (and AzureRM Provider) Version
2.29.0
Affected Resource(s)
azurerm_storage_account intermittently produces an error on plan for the already created resource
Error: Unable to locate Storage Account
Terraform Configuration Files
Expected Behaviour
Should be no error for the RA and non RA accounts
Actual Behaviour
intermittently produces an error on plan for the already created resource
Error: Unable to locate Storage Account
Steps to Reproduce
terraform apply
terraform plan
Important Factoids
Only affecting RA storage accounts