parallel init failure when using plugin cache #25849
Comments
This is an interesting case! The purpose of the plugin cache is to avoid having to re-download providers with every run; it isn't designed to deal with concurrency like this, and I'm not surprised it's failing. The first workaround that comes to mind is to run a separate CI stage ahead of time that just does an init, to prompt all the provider downloads into the cache, and then run all the others in parallel - maybe without an init. I'll leave this open for now for an engineer to comment more authoritatively on whether this behavior is expected, but I think this is a case of the plugin cache not being designed for concurrent use. If you can't find a good workaround, we can relabel this as an enhancement request, but I don't think this constitutes a bug based on my initial assessment.
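(A minimal sketch of that warm-up approach, assuming /tmp/tests-to-run lists one state directory per line and TF_PLUGIN_CACHE_DIR is already exported to a shared path:)

# Serial pass: populate the shared plugin cache one state at a time, so
# nothing writes to the cache concurrently.
while read -r dir; do
  (cd "$dir" && terraform init -backend=false -input=false)
done < /tmp/tests-to-run

# Parallel pass: with the cache warm, each init can copy or link providers
# from the cache instead of downloading them.
xargs -i -n 1 -P 8 sh -c 'cd "{}" && terraform init -backend=false -input=false' < /tmp/tests-to-run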
I actually thought about that, but I guess the potential issue would be if a couple of states had additional providers that weren't set up / cached in that first init. In our case, the typical providers we use are fairly consistent between the states, so doing something like that might be within the realm of possibility. Another thing I thought about was caching the provider cache directory itself, except that I think saving / restoring the cache would possibly take longer than simply downloading fresh every time. Interestingly, if I log in via ssh, blow away the cache and the

Would absolutely appreciate any other suggestions anyone's got in terms of a way to quickly and safely initialize lots of states concurrently.
ps - initializing the states in serial with the cache is almost, but not quite, as fast as doing them in parallel without it.
Can you have a per-state provider cache? You said you're using Circle - I wonder if you can use their dependency caching on a per-state basis to avoid re-downloading providers, and not deal with locking at all.
Yes, theoretically, but I wouldn't want to have to configure a separate cache for each state, and restoring that many different caches could be pretty slow. The other thing is just that saving / restoring a cache can sometimes take longer than simply re-downloading. A single cache would probably be feasible, but then the problem would be deciding which of the configs to bust that cache on (normally I'd bust the cache based on the file where the provider version is declared, but in this case that could be many different files). Anyway, I think I've got a few ideas that may help, but it would be great if you can leave this open a little longer in case someone else has got an idea.
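(For illustration, the per-state idea might look like the following sketch; the $HOME/.tf-caches layout is hypothetical, and it sidesteps locking by never sharing a cache directory between concurrent inits:)

# Derive a distinct plugin cache directory from each state's path so no two
# concurrent inits write to the same cache.
xargs -i -n 1 -P 8 sh -c '
  cache="$HOME/.tf-caches/$(echo "{}" | tr "/" "_")"
  mkdir -p "$cache"
  cd "{}" && TF_PLUGIN_CACHE_DIR="$cache" terraform init -backend=false -input=false
' < /tmp/tests-to-run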
One other weird thing is that I think all the inits are exiting 0, but the failure I'm getting actually seems to come from the
Pinging @apparentlymart - he will probably have something interesting to say about this when he's back from time off.
Yeah... caching won't work because there's no equivalent to a lockfile that changes [in git] when one or more of the provider versions change. So you can save / restore the cache, but there's no great way to bust it that I can see so far. That said, I was able to get it to go relatively quickly by increasing the CircleCI-level parallelism from 3 to 4 and having the
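(One way to approximate such a lockfile is to hash every file that pins provider versions and use the digest as the CI cache key; a sketch, where the versions.tf / providers.tf names are assumptions about the repo layout:)

# Hand-rolled cache key: a single digest over all provider version declarations.
find . \( -name versions.tf -o -name providers.tf \) -print0 \
  | sort -z \
  | xargs -0 cat \
  | sha256sum \
  | awk '{print $1}' > /tmp/provider-cache-key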
Unfortunately I'm not surprised this doesn't work; it's not designed to run concurrently. We can label this as an enhancement request. We do have a terraform providers mirror command; it's also not concurrency-safe, but it might be a faster method of pre-populating a plugin cache.
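(A sketch of that approach, assuming Terraform 0.13+: terraform providers mirror writes providers into a local directory, and the CLI configuration can then point installation at it. The /tmp/tf-mirror path and the serial loop are assumptions.)

# Pre-populate a local filesystem mirror from every state, one at a time.
mkdir -p /tmp/tf-mirror
while read -r dir; do
  (cd "$dir" && terraform providers mirror /tmp/tf-mirror)
done < /tmp/tests-to-run

# Tell Terraform to install providers from the mirror.
cat > "$HOME/.terraformrc" <<'EOF'
provider_installation {
  filesystem_mirror {
    path = "/tmp/tf-mirror"
  }
}
EOF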
Thanks @mildwonkey!
I'm going to lock this issue because it has been closed for 30 days. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.
I'm seeing an issue when running validation in CI. We have a step that initializes all the states in parallel (we also parallelize in Circle using their feature for this, which is what generates the contents of /tmp/tests-to-run).

Terraform Version
Terraform Configuration Files

Most of the providers are configured something like this:
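(The snippet itself wasn't captured in this copy; a hypothetical stand-in for that kind of provider pinning, with placeholder names and versions:)

# Placeholder example only - the provider name and version constraint are
# invented, not the reporter's actual configuration.
cat > versions.tf <<'EOF'
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 3.0"
    }
  }
}
EOF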
Debug Output

Here's what I think is the relevant debug-level output; if you end up needing a trace, I can work out a way to get that to you.
Expected Behavior

Terraform should have initialized all the states.
Actual Behavior

Terraform gives the error below for one or two of the states.
Steps to Reproduce

xargs -i -n 1 -P 8 sh -c 'cd "{}" && terraform init -backend=false -input=false' < /tmp/tests-to-run

This works without the plugin cache, and works for most steps with it; however, there's typically at least one failure on every run with the plugin cache enabled. I'm thinking this could be some kind of race condition (or locking issue, see below). Removing the -P 4 or -P 8 argument to limit the parallelism to 1 in the xargs call makes it work consistently.

Is there a way to make this work or do it safely? Or is this a bug? Or is it just not supported to run init in parallel this way when using the cache?
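(One commonly suggested workaround, sketched here under the assumption that util-linux flock is available: keep the xargs parallelism but serialize the init step behind a file lock, so only one process writes the shared plugin cache at a time. Providers still download only once, so later inits are mostly cache hits.)

# Serialize terraform init behind a lock file; everything else stays parallel.
xargs -i -n 1 -P 8 sh -c '
  cd "{}" && flock /tmp/tf-init.lock terraform init -backend=false -input=false
' < /tmp/tests-to-run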
Additional Context

This is in CircleCI, and the following env vars are set:
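(The variable list itself wasn't captured in this copy; given the report, the plugin cache must at least be enabled, presumably via something like the following, where the path is an assumption:)

# Hypothetical: enable the shared plugin cache for every init in the job.
export TF_PLUGIN_CACHE_DIR="$HOME/.terraform.d/plugin-cache"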
I haven't gone back to check whether this affects 0.12 as well.
References