Mysqld.GetSchema: tolerate tables being dropped while inspecting schema #12641
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
hmmmm got this relevant error in
|
I'm not sure what the situation is in |
Ah, I get it. A table without |
…ped. It can also mean the table does not have PRIMARY KEY Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
OK, the assumption about the 2nd race condition was incorrect. A table that does not appear in Removed the 2nd race condition handling altogether. If a table does end up being dropped before calculating the primary key, it will just have an implicit |
Thank you for investigating this! Nice debugging work. I only had some minor comments.
go/vt/mysqlctl/schema.go
Outdated
// This is fine. We identify the situation and continue to remove the table from our records
sqlErr, isSQLErr := mysql.NewSQLErrorFromError(err).(*mysql.SQLError)
if isSQLErr && sqlErr != nil && sqlErr.Number() == mysql.ERNoSuchTable {
	td.Fields = nil
Should we instead remove td from tds here, or set td = nil? I feel like the former would be the right thing to do so that our memory representation tracks the actual live mysqld state (which is the point of that component AFAIUI).
OK, I rewrote this in a different way. The problem is that we're iterating tds concurrently. Any change to tds itself during iteration is less than ideal. So instead I'm building a validTds slice, which in turn is the final list of table definitions.
Approving so that you can merge whenever all the tests are passing.
IMO, we should re-run the affected test(s) 5-10 times before merging to be sure that they now pass each time — or at least don't have the same sub-test failure which led to the investigation — and we've fully addressed the issue.
Thanks again for great investigative work!
go/vt/mysqlctl/schema.go
Outdated
@@ -109,6 +112,15 @@ func (mysqld *Mysqld) GetSchema(ctx context.Context, dbName string, request *tab

fields, columns, schema, err := mysqld.collectSchema(ctx, dbName, td.Name, td.Type, request.TableSchemaOnly)
if err != nil {
	// there's a possible race condition: it could happen that a table was dropped in between reading
If you have to do another push for some reason, we can capitalize "There's" here. It's annoyingly nitty, I know, so no need unless we have to do another commit/push for another reason.
I've run this on my local env some 20 times and it passed on every run. I'm signing out for the week and won't be watching CI to kick restarts. If this can wait until Sunday, I can verify and merge then. If not, anyone should feel free to test/retry and merge.
Sounds good. You deserve a clean break for the weekend! 🙂 I can periodically re-run the test, as I'm doing that on another PR too.
Multiple tests are consistently failing on this PR. Example:
I'm going to try a different angle on this PR. Instead of removing the dropped table from the result set, I'm going to keep it there with an empty table definition, i.e. empty keys, empty columns, etc.
…we keep the table, but with empty column/key/fields info Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
@mattlord @rohit-nayak-ps could you give this another look? I've simplified the change to just returning an incomplete table description. Previously, when I excluded the table description from the result table list, many tests failed, consistently, see above. Now, everything seems to pass.
PS I'm re-running |
So with the current implementation Tests are super happy. I think this should be good. |
Looks good @shlomi-noach, thanks!
waiting to see if the bot is able to cherry-pick this
…ma (vitessio#12641)
* Mysqld.GetSchema: tolerate tables being dropped while inspecting schema
* lack of primary key columns in STATISTICS does not mean table is dropped. It can also mean the table does not have PRIMARY KEY
* populate validTds rather than rely on nil hints
* re-introdce earlier check
* use validTds, sync
* due to many tests consistently failing, trying a different approach: we keep the table, but with empty column/key/fields info
* grammar
---------
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
…ma (#12641) (#12666) * Mysqld.GetSchema: tolerate tables being dropped while inspecting schema * lack of primary key columns in STATISTICS does not mean table is dropped. It can also mean the table does not have PRIMARY KEY * populate validTds rather than rely on nil hints * re-introdce earlier check * use validTds, sync * due to many tests consistently failing, trying a different approach: we keep the table, but with empty column/key/fields info * grammar --------- Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
…ma (#12641) (#12665) * Mysqld.GetSchema: tolerate tables being dropped while inspecting schema * lack of primary key columns in STATISTICS does not mean table is dropped. It can also mean the table does not have PRIMARY KEY * populate validTds rather than rely on nil hints * re-introdce earlier check * use validTds, sync * due to many tests consistently failing, trying a different approach: we keep the table, but with empty column/key/fields info * grammar --------- Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
…ma (#12641) (#12664) * Mysqld.GetSchema: tolerate tables being dropped while inspecting schema * lack of primary key columns in STATISTICS does not mean table is dropped. It can also mean the table does not have PRIMARY KEY * populate validTds rather than rely on nil hints * re-introdce earlier check * use validTds, sync * due to many tests consistently failing, trying a different approach: we keep the table, but with empty column/key/fields info * grammar --------- Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Description
Identify the two scenarios where a table may be dropped while we're inspecting the schema. Either we get an error with errno 1146, or the table no longer appears in information_schema even though it did before. The fix is to silently ignore tables hitting this scenario.
Related Issue(s)
Checklist