Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DropSources: change table rename logic #7230

Merged

Conversation

rohit-nayak-ps
Copy link
Contributor

@rohit-nayak-ps rohit-nayak-ps commented Dec 30, 2020

Description

DropSources (achieved via the Complete subcommand in the v2 MoveTables command) supports a rename flag for MoveTables workflows. Tables are renamed as _<table_name> (see PR #6383) . This change uses a similar logic to that used by pt-osc when it drops tables using the rename template _<table_name>_old.

Reasons:

* we now use a more standard convention
* VReplication already ignores such tables, so they won't show up in vreplication workflows and vstream api events

Checklist

Impacted Areas in Vitess

  • Query Serving
  • VReplication
  • Cluster Management
  • Build

@rohit-nayak-ps
Copy link
Contributor Author

@shlomi-noach, in go/vt/schema/online_ddl.go:35, the regexp used for tables renamed by ptosc is ^_.*_old$. However it seems ptosc seems, by default, to use just _old without the _ prefix (eg: https://github.com/CxOrg/mysql-sync/blob/6656da214d6922cdee56fea7238b5b97de03697a/bin/percona/pt-online-schema-change#L9270). Do we override this functionality in online DDL?

@shlomi-noach
Copy link
Contributor

shlomi-noach commented Jan 3, 2021

@rohit-nayak-ps there's always a _ prefix. Take a look at:

https://github.com/percona/percona-toolkit/blob/e1b7a6a1fd5d6b05194e953fdfa7febc15ab5f17/bin/pt-online-schema-change#L10802

and then at:

https://github.com/percona/percona-toolkit/blob/e1b7a6a1fd5d6b05194e953fdfa7febc15ab5f17/bin/pt-online-schema-change#L10838-L10849

say your table is t. Then pt-online-schema-change tries _t_old, and then if that exists, triers __t_old, ___t_old etc. But there's always at least one _ prefix.

@shlomi-noach
Copy link
Contributor

@rohit-nayak-ps
Copy link
Contributor Author

say your table is t. Then pt-online-schema-change tries _t_old, and then if that exists, triers __t_old, ___t_old etc. But there's always at least one _ prefix.

ok, so the prefix is applied separately as well. Thanks for the clarification. References in blogs like https://www.percona.com/doc/percona-toolkit/3.0/pt-online-schema-change.html (grep for staff_old) confused me, so thought I would confirm.

@shlomi-noach
Copy link
Contributor

shlomi-noach commented Jan 3, 2021

References in blogs like

That documentation baffles me. I think it's a doc bug. In all my experience and experiments (just checked again, there's a _ prefix).

@rohit-nayak-ps rohit-nayak-ps marked this pull request as ready for review January 3, 2021 12:01
@rohit-nayak-ps
Copy link
Contributor Author

@tomkrouper, wanted to check that this change doesn't break anything for you, since you had implemented this originally.

@tomkrouper
Copy link
Collaborator

@tomkrouper, wanted to check that this change doesn't break anything for you, since you had implemented this originally.

I mean, I totally think it should use the gh-ost rename style opposed to pt-online-schema-change, but you know, I may be biased. 😏

In all seriousness, this will not break anything on my side. All that matters in our environment is the _ as the first char. Thanks for the heads up.

@shlomi-noach
Copy link
Contributor

shlomi-noach commented Jan 4, 2021

All that matters in our environment is the _ as the first char.

A gh-ost note tangential to this issue. Skip at will.

Related to that _ point: a revelation that RENAME TABLES locks tables in alphabetical order; users were able to break the atomicy of the cut-over due to this behavior. The unfortunate part of it is that _ preceeds the lower case abc in ascii order, which means tabled prefixed with _ are locked before tables with "normal" lower case alphabet (as is common). It turns out that things break due to this ordering. I'm not sure how to tackle that issue, but just thinking there may come a day where gh-ost would need to support a different prefix than _. See github/gh-ost#887

@shlomi-noach
Copy link
Contributor

@rohit-nayak-ps if you want the old tables to be automatically "garbage collected", then you should rename them into a different name.

Right now, these old tables will just stay in the schema forever, until some DBA will notice their volume and actively drop them. What the garbage collection (aka table lifecycle) mechanism offers you is:

  • you will rename the table to something specific (see following)
  • renamed table will be held for safe keeping for a couple days
  • then will be purged of data
  • eventually dropped.

See this code from Online DDL, which renamed the old migration tables:

renameStatement, _, err := schema.GenerateRenameStatement(artifactTable, schema.PurgeTableGCState, time.Now().UTC())
if err != nil {
return err
}
conn, err := e.pool.Get(ctx)
if err != nil {
return err
}
defer conn.Recycle()
_, err = conn.Exec(ctx, renameStatement, 1, true)

The above actually renames the table directly into the PURGE state, skipping the safekeeping HOLD state. You can choose to use the HOLD state or PURGE state at your discretion.

@rohit-nayak-ps
Copy link
Contributor Author

@rohit-nayak-ps if you want the old tables to be automatically "garbage collected", then you should rename them into a different name.

The current renaming logic is intended to leave the table around with a recognizable name and be manually removed. I simply updated the naming convention to make use of vreplication feature implemented in #7159.

But your suggestion is a good one. Maybe, by default, we can just rename the tables to use the automatic garbage collection feature instead of deleting them immediately. This will be an orthogonal feature to the existing one since GC will purge automatically after the fixed time whereas the DBA might want to keep it around longer. Will plan to implement this in a future PR.

@shlomi-noach
Copy link
Contributor

shlomi-noach commented Jan 5, 2021

@rohit-nayak-ps sounds good

…flow

Signed-off-by: Rohit Nayak <rohit@planetscale.com>
Signed-off-by: Rohit Nayak <rohit@planetscale.com>
@rohit-nayak-ps rohit-nayak-ps merged commit 2e9763e into vitessio:master Jan 5, 2021
@rohit-nayak-ps rohit-nayak-ps deleted the rn-rename-table-ptosc-logic branch January 5, 2021 13:26
@askdba askdba added this to the v9.0 milestone Jan 6, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants