Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

vttablet: action lock needed to update mysql port #6432

Merged
merged 1 commit into from
Jul 10, 2020

Conversation

sougou
Copy link
Contributor

@sougou sougou commented Jul 10, 2020

The restore function races with the mysql port finder, and sometimes
ends up overwriting the port with its own empty cached value.

This "big hammer" fix makes the port finder wait till it gets the
action lock before updating the mysql port. In a later cleanup,
we can use a more subtle method.

Signed-off-by: Sugu Sougoumarane ssougou@gmail.com

@sougou sougou requested a review from enisoc July 10, 2020 03:56
@@ -582,6 +582,12 @@ func (tm *TabletManager) findMysqlPort(retryInterval time.Duration) {
if err != nil {
continue
}
// We need to get the action lock to make sure no one
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is the pubMu not enough?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess pubMu is only for preventing data races when updating the local tablet, but this is a higher-level semantic race?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because Restore caches the tablet at the beginning and reuses it at the end. In my initial fix, I changed Restore to re-read the tablet before publishing it. But I remembered that this pattern exists everywhere else, which essentially requires the action lock to be the mutex for the tablet.

I think the more correct fix will be to change all call sites to not cache the tablet, but instead directly change it. This is the more subtle fix I had in mind.

Copy link
Member

@enisoc enisoc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

The restore function races with the mysql port finder, and sometimes
ends up overwriting the port with its own empty cached value.

This "big hammer" fix makes the port finder wait till it gets the
action lock before updating the mysql port. In a later cleanup,
we can use a more subtle method.

Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
@sougou sougou force-pushed the ss-fix-mysql-port branch from 37b9730 to c121f6b Compare July 10, 2020 05:18
@sougou sougou merged commit 9c86ca2 into vitessio:master Jul 10, 2020
@sougou sougou deleted the ss-fix-mysql-port branch July 10, 2020 15:42
@deepthi deepthi added this to the v7.0 milestone Jul 17, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants