Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Online DDL: ddl_strategy session variable and vtctl command line argument #7045

Merged

Conversation

shlomi-noach
Copy link
Contributor

Context: #6782 (comment)
Additional contexts: #6926, #6689
Extends #7042

In this PR we utilize the new ddl_strategy session variable. Consider the following:

mysql> show create table t \G

Create Table: CREATE TABLE `t` (
  `id` smallint(6) NOT NULL,
  `i` int(11) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

mysql> select @@ddl_strategy;
+----------------+
| @@ddl_strategy |
+----------------+
|                |
+----------------+

mysql> alter table t modify id bigint;
Query OK, 0 rows affected (0.03 sec)

mysql> show create table t \G

Create Table: CREATE TABLE `t` (
  `id` bigint(20) NOT NULL,
  `i` int(11) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

mysql> set @@ddl_strategy='gh-ost';
Query OK, 0 rows affected (0.00 sec)

mysql> alter table t modify id mediumint;
+--------------------------------------+
| uuid                                 |
+--------------------------------------+
| b87f9824_2814_11eb_bd3d_f875a4d24e90 |
+--------------------------------------+

mysql> show create table t \G

Create Table: CREATE TABLE `t` (
  `id` bigint(20) NOT NULL,
  `i` int(11) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

mysql> -- time passes

mysql> show create table t \G

Create Table: CREATE TABLE `t` (
  `id` mediumint(9) NOT NULL,
  `i` int(11) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

Thus, the choice whether to user "normal" DDL or Online DDL, can depend on the valud of @@ddl_strategy session variable, which is either:

  • empty (normal DD)
  • gh-ost
  • pt-osc

I'm not sure yet how to represent migration options. Would that be set @@ddl_strategy='gh-ost; max-load=Threads_running=100'? Seems quite the syntax.

- no pre-generates primitives for normal DDL and Online DDL, wrapped in a new DDL primitive
- DDL.Execute() chooses which of the underlying primitives to invoke

Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
…ee if it's online or not

Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
@harshit-gangal
Copy link
Member

you can take the input as map in a separate session variable like ddl_strategy_settings = 'a=1,b=2'

@shlomi-noach
Copy link
Contributor Author

you can take the input as map in a separate session variable like ddl_strategy_settings = 'a=1,b=2'

Thank you. I'm thinnking two variables are difficult to maintain. The user will set one, and forget that the other one is already set from a previous cycle.

I'm thinking to simplify as set @@ddl_strategy='gh-ost --max-load=Threads_running=100' which actually reads like a normal command line execution.

@shlomi-noach
Copy link
Contributor Author

I am still unsure whether this approach of using session variables is any better for our users. Users want to be able to run exact same command on MySQL and on Vitess, such that the Vitess execution will run an online DDL.

So, the ALTER TABLE is fine. But how will they be able to run set @@ddl_strategy='gh-ost'; on their MySQL server? I feel like we;re back to square one.


//StreamExecute implements the Primitive interface
func (v *DDL) StreamExecute(vcursor VCursor, bindVars map[string]*query.BindVariable, wantields bool, callback func(*sqltypes.Result) error) error {
return vterrors.Errorf(vtrpcpb.Code_INTERNAL, "not reachable") // TODO: shlomi - have no idea if this should work
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can call Execute and pass the result in callback function. The plan is to merge Execute and StreamExecute with streaming only so we need to support then.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implemented

Comment on lines 112 to 115
//GetFields implements the Primitive interface
func (v *DDL) GetFields(vcursor VCursor, bindVars map[string]*query.BindVariable) (*sqltypes.Result, error) {
return nil, vterrors.Errorf(vtrpcpb.Code_INTERNAL, "not reachable") // TODO: shlomi - have no idea if this should work
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this will not be called. todo can be removed.

Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
…n-variable-use

Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
@harshit-gangal
Copy link
Member

I am still unsure whether this approach of using session variables is any better for our users. Users want to be able to run exact same command on MySQL and on Vitess, such that the Vitess execution will run an online DDL.

So, the ALTER TABLE is fine. But how will they be able to run set @@ddl_strategy='gh-ost'; on their MySQL server? I feel like we;re back to square one.

Currently there are 2 ways to execute DDL in Vitess.

  1. ApplySchema command in vtctl: we can add online-ddl option there. That can internally handle how to call things
  2. Through VTGate
  • via some MySQL clients (cli or UI) - we can drive the experience there to execute the set statements first to enable online_ddl and then execute the ddl query.
  • via programatically - this we can give flag on vtgate that can be set at startup time so that all ddl are treated with that strategy and user will have option to change it with set statement.

@harshit-gangal
Copy link
Member

There are few places that would need additional test

  1. engine/ddl
  2. executor
  3. end to end test if that are not present.

@shlomi-noach
Copy link
Contributor Author

ApplySchema command in vtctl: we can add online-ddl option there. That can internally handle how to call things

How would it look like in ApplySchema in your opinion?

we can drive the experience there

How do you mean?

we can give flag on vtgate

Yes, I was also thinking about that. One problem I have now is that, say we have a command line flag that sets to gh-ost. Now the user overrides with set session ddl_strategy=''. How do we know that the user set the strategy to (therefore, we should run a normal alter table) as opposed to not setting the session variable (thus, we should run gh-ost)?

set @@ddl_strategy='gh-ost --max-load=Threads_running=100';

Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
…ts the default value for @@ddl_strategy

Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
… to @@ddl_strategy session variable. Added unit and endtoend tests to support the new flag

Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
@shlomi-noach
Copy link
Contributor Author

shlomi-noach commented Nov 22, 2020

OK, this PR now supports:

  • ddl_strategy session variable. Use as follows:
set session ddl_strategy='gh-ost';
set @@ddl_strategy='gh-ost';
set @@ddl_strategy='gh-ost --max-load=Threads_running=100';
set @@ddl_strategy='';
select @@ddl_strategy;
  • vtgate command line argument --default-ddl-strategy
    the session variable @@ddl_strategy defaults to that value
  • vtctl ApplySchema now supports --ddl_strategy command line argument
    in exact same format as @@ddl_strategy
  • onlineDDL endtoend tests cover all variations (vtgate, vtctl, with/out session variables, with/out params)

A Followup PR will remove the WITH 'gh-ost' syntax from parser, ast etc. It's too much for this single PR.

Documentation to follow.

Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
@shlomi-noach shlomi-noach changed the title Online ddl session variable use Online DDL: ddl_strategy session variable and vtctl command line argument Nov 22, 2020
@shlomi-noach shlomi-noach marked this pull request as ready for review November 22, 2020 15:38
@shlomi-noach
Copy link
Contributor Author

@harshit-gangal I've made changes since your last review; see #7045 (comment)

Comment on lines +32 to +38
func TestParseDDLStrategy(t *testing.T) {
tt := []struct {
strategyVariable string
strategy sqlparser.DDLStrategy
options string
err error
}{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add an invalid parsing test.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is one, at the end of the function


var _ Primitive = (*DDL)(nil)

//DDL represents the a DDL statement, either normal or online DDL
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: convention in the code base is to have space between // and comment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

@@ -60,6 +61,7 @@ var (
_ = flag.Bool("disable_local_gateway", false, "deprecated: if specified, this process will not route any queries to local tablets in the local cell")
maxMemoryRows = flag.Int("max_memory_rows", 300000, "Maximum number of rows that will be held in memory for intermediate results as well as the final result.")
warnMemoryRows = flag.Int("warn_memory_rows", 30000, "Warning threshold for in-memory results. A row count higher than this amount will cause the VtGateWarnings.ResultsExceeded counter to be incremented.")
defaultDDLStrategy = flag.String("default-ddl-strategy", "", "Set default strategy for DDL statements. Override with @@ddl_strategy session variable")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can drop the word default from here. This is how some of the other startup parameters behave.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

Copy link
Member

@harshit-gangal harshit-gangal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Few nitpick comments. Everything else looks good to me.

Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
@shlomi-noach
Copy link
Contributor Author

Documentation: vitessio/website#598

@askdba askdba added this to the v9.0 milestone Nov 27, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants