Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

lock timeouts #13262

Merged
merged 12 commits into from
Apr 4, 2017
Merged

lock timeouts #13262

merged 12 commits into from
Apr 4, 2017

Conversation

jbardin
Copy link
Member

@jbardin jbardin commented Mar 31, 2017

Adds a -lock-timeout parameter, which will retry a state lock for at least the duration of the timeout.

The actual blocking retry loop is provided by the state.LockWithContext function. State implementations themselves still don't need to be able to handle blocking on locks or timeouts.

Having a way to retry a lock, up until a timeout, would allow terraform automation processes to run concurrently without immediately failing.

Some additional cleanup:

  • Add the Locker interface to the standard State interface. Since all implementations are now Lockers, we can avoid a bunch of type assertions.
  • rename "message" package to "clistate", since that is how it's imported everywhere.
  • move the force-unlock command to "plumbing"
  • Add missing -lock option to import and init
  • Catch a couple locations where locks could be attempted even with -lock=false.

Since moving to the new backends, all states (except InmemState) are
Lockers. Add the methods to the State interface to remove a heap of
assertion checks.
All states are lockers, so get rid of extra asertions.
LockWithContext will retry a lock until the context expires or is
cancelled. This will let us implement a `-lock-timeout` flag, and make
use of existing contexts when applicable.
- Have the ui Lock helper use state.LockWithContext.
- Rename the message package to clistate, since that's how it's imported
  everywhere.
- Use a more idiomatic placement of the Context in the LockWithContext
  args.
Add fields required to create an appropriate context for all calls to
clistate.Lock.

Add missing checks for Meta.stateLock, where we would attempt to lock,
even if locking should be skipped.
shouldn't be listed as a common command
Add the -lock-timeout flag to the appropriate commands.
Add the -lock flag to `init` and `import` which were missing it.
Set both stateLock and stateLockTimeout in Meta.flagsSet, and remove the
extra references for clarity.
Backoff the Lock calls exponentially, to a reasonable limit.
@jbardin jbardin changed the title [WIP] lock timeouts lock timeouts Apr 3, 2017
Copy link
Contributor

@apparentlymart apparentlymart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me!

It's unfortunate to have to use real time.Sleep in tests, but I don't have any better idea... doesn't seem worth all the complexity of abstracting over time in order to avoid that. 😖

@jbardin
Copy link
Member Author

jbardin commented Apr 3, 2017

yeah, I usually prefer to create hooks to cause time-related tests to step through the possible states without Sleep (and I've been known to habitually refactor Sleeps out of exiting tests 😉 ).

I could pretty easily drop the sleep from the in-package test (which I think I'll do now), but agree adding the hooks for external testing doesn't seem worth it, especially since that is using an external process to lock the file as well.

@jbardin
Copy link
Member Author

jbardin commented Apr 4, 2017

Just tacked on the doc changes here too

@apparentlymart
Copy link
Contributor

The docs look good to me too! 😀

@jbardin jbardin merged commit d059939 into master Apr 4, 2017
@jbardin jbardin deleted the jbardin/lock-timeouts branch April 4, 2017 18:30
@brikis98
Copy link
Contributor

@jbardin If you run the init command with -lock-timeout=60m, does that timeout apply only to the init command itself, or all future commands you call after as well (e.g. apply)? Is there any way to define the timeout in your Terraform configurations (.tf) themselves or only via the CLI? It seems very error prone to have to remember to specify the timeout for every single command...

@jbardin
Copy link
Member Author

jbardin commented Apr 17, 2017

@brikis98, the -lock-timeout flag only applies to the command being executed.

The locks by design are intended to fail fast, protecting the state files from concurrent mutations, and the lock implementations themselves don't provide any blocking behavior. This flag was primarily added for the case where automated systems may want to retry the operation automatically if a lock was already taken, and we often rely on flags over configuration for automated systems. This is not meant to solve the problem for highly concurrent systems, since the locks themselves can't block, and therefor can't provide a deterministic serialization of waiting clients.

Not providing the flag can't cause any additional errors, it just reports the lock status to the user on the first attempt.

@brikis98
Copy link
Contributor

Got it, thanks.

@ghost
Copy link

ghost commented Apr 13, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@ghost ghost locked and limited conversation to collaborators Apr 13, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants