-
Notifications
You must be signed in to change notification settings - Fork 9.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
provider/google: use a mutex to prevent concurrent sql instance operations #14424
Conversation
Thanks a lot for attacking this issue! While I don't feel 100% competent to comment on this I believe that in my case a mutex on the instance name would not be the ideal solution. While the default for I think prefixing would already improve the situation a lot even though a slightly uneasy feeling remains for me since there is still the hypothetical case where names could collide… I admit that it's probably not a problem in practice though. |
That makes sense! I added a prefix, and included the project name in the key as well just in case the same database name is being used in multiple projects with the same tf config. |
Great, that should be reasonably unique! Now another resource in the same project would have to choose that exact prefix to produce a clash. I still wonder if we should leave a small comment on the Anyways, my concrete issue should be fixed by your patch, so thanks a lot for this! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great. Mostly just looking for a comment update, and had a couple suggestions about the mutex strategy. If I'm misunderstanding, please feel free to correct me!
builtin/providers/google/provider.go
Outdated
@@ -266,3 +267,6 @@ func handleNotFoundError(err error, d *schema.ResourceData, resource string) err | |||
|
|||
return fmt.Errorf("Error reading %s: %s", resource, err) | |||
} | |||
|
|||
// This is a global MutexKV for use within this plugin. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be cool to explain what it's locking on, and what limitation makes the lock necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, rather than having this be plugin-wide, what about making it just for this particular concept--e.g., locking insertion/removal of SQL instances? Then we don't need to worry about the prefix, we can just use something that uniquely identifies an SQL instance amongst all SQL instances.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The AWS plugin offers a similar Provider wide mutex, so the approach here seems consistent
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair, and good to know!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair enough, and good to know!
@@ -67,6 +68,7 @@ func resourceSqlDatabaseCreate(d *schema.ResourceData, meta interface{}) error { | |||
} | |||
|
|||
err = sqladminOperationWait(config, op, "Insert Database") | |||
googleMutexKV.Unlock(instanceMutexKey(project, instance_name)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd rather see this after line 60 in a defer
, to be honest, just to cover our bases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does defer
here actually result in the Unlock
not being called until resourceSqlDatabaseRead
is called? If so, then I think the unlock here is probably correct, assuming the resourceSqlDatabaseRead
is idempotent
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed, defer
here will result in the lock
persisting until resourceSqlDatabaseRead
completes, so probably not what we want to do.
That said, we probably do want to add an additional unlock in the error check block, yeah?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I knew defer would block until resourceSqlDatabaseRead
was finished; I checked to make sure a lock wasn't needed inside it. IMO, having the lock stuff work without the danger of missing a lock (like in the error check block that you noticed) outweighs the expanded lock time, but I'm willing to be outvoted on that by you and @danawillow.
Regardless, I think we do want to make sure the lock is released after that error check block, even though (IIUC) Terraform will exit anyways, meaning the lock would never be used again regardless.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(I stand corrected on the unlock in the error check block--this Unlock will fire before the error checking, so another Unlock in the error check block below is unnecessary.)
(EDIT: Ooops, we mean the error check block above. I stand incorrected.)
@@ -121,6 +124,7 @@ func resourceSqlDatabaseDelete(d *schema.ResourceData, meta interface{}) error { | |||
} | |||
|
|||
err = sqladminOperationWait(config, op, "Delete Database") | |||
googleMutexKV.Unlock(instanceMutexKey(project, instance_name)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same, would really like to see the defer
approach used here, for completeness.
@@ -78,6 +79,7 @@ func resourceSqlUserCreate(d *schema.ResourceData, meta interface{}) error { | |||
} | |||
|
|||
err = sqladminOperationWait(config, op, "Insert User") | |||
googleMutexKV.Unlock(instanceMutexKey(project, instance)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
defer
this?
@@ -154,6 +157,7 @@ func resourceSqlUserUpdate(d *schema.ResourceData, meta interface{}) error { | |||
} | |||
|
|||
err = sqladminOperationWait(config, op, "Insert User") | |||
googleMutexKV.Lock(instanceMutexKey(project, instance)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
defer
this?
@@ -187,6 +192,7 @@ func resourceSqlUserDelete(d *schema.ResourceData, meta interface{}) error { | |||
} | |||
|
|||
err = sqladminOperationWait(config, op, "Delete User") | |||
googleMutexKV.Unlock(instanceMutexKey(project, instance)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
defer
this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 from me, recommend waiting for @paddycarver to review my comment on defer
builtin/providers/google/provider.go
Outdated
@@ -266,3 +267,6 @@ func handleNotFoundError(err error, d *schema.ResourceData, resource string) err | |||
|
|||
return fmt.Errorf("Error reading %s: %s", resource, err) | |||
} | |||
|
|||
// This is a global MutexKV for use within this plugin. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The AWS plugin offers a similar Provider wide mutex, so the approach here seems consistent
I'm fine with it, but if @danawillow is willing to update to release the lock in the error block, that'd be even better. :) Leaving to @danawillow. |
I'm fine with the defer- releasing the lock after the read is done seems reasonable. We can always change it later. Updated to add it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Last thing--is it worth using the mutexKV
we already define in provider.go
instead of adding googleMutexKV
?
Yes of course, just needed to merge with master. Done. |
I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further. |
Fixes #9018.
A few implementation questions:
resource_sql_database_instance_test
the best test file for these changes? It's not about instance creation, it's about other stuff creation, but it's also not user or database specific so instance felt more general.