Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Discussion/opinions: query comments or specialized SQL syntax? #6782

Closed
shlomi-noach opened this issue Sep 24, 2020 · 16 comments
Closed

Discussion/opinions: query comments or specialized SQL syntax? #6782

shlomi-noach opened this issue Sep 24, 2020 · 16 comments

Comments

@shlomi-noach
Copy link
Contributor

shlomi-noach commented Sep 24, 2020

On the topics of online schema migrations and managed DROP TABLE, there is the question of how to tell vitess we want an online migration vs normal migration, managed table drop vs. normal table drop. There's two major options, and both have pros and cons, The more I think of it, the less sure I am which approach is more correct. Let's begin with a few examples and then discuss advantages and risks.

Specialized syntax examples:

  • ALTER WITH 'gh-ost' TABLE mytable ADD COLUMN c INT
  • ALTER WITH 'gh-ost' '--critical-load Threads_running=500 --critical-load-hibernate-seconds=60' TABLE mytable ADD COLUMN c INT
  • DROP IN '24h' TABLE mytable
  • DROP IN NOWAIT TABLE mytable

Query comments examples:

  • ALTER /*vt+ WITH=gh-ost */ TABLE mytable ADD COLUMN c INT
  • ALTER /*vt+ WITH=gh-ost ARGS='--critical-load Threads_running=500 --critical-load-hibernate-seconds=60' */ TABLE mytable ADD COLUMN c INT
  • DROP /*vt+ IN=24h */ TABLE mytable
  • DROP /*vt+ IN=NOWAIT */ TABLE mytable

Now let's consider advantages and risks.

  1. Specialized syntax is format. You get a parser error if you make a mistake.
  2. In specialized syntax we must only reuse existing MySQL keywords, or else we introduce new keywords, meaning we can break some table that has a column name by same name as our new keyword.
  3. In query comments we can use whatever keywords and syntax we want.
  4. The query comments approach works for both vitess and non-vitess servers. If you run ALTER /*vt+ WITH=gh-ost */ TABLE mytable ADD COLUMN c INT on a standard MySQL server, that's just a normal ALTER TABLE. If you run this on Vitess, that's an online alter table.
    a. This is a great advantage because you can deploy the same migration command on all environments and each will behave as you expect.
    b. This is also a great risk, because if the user makes a tiny typo, e.g. ALTER /*vt WITH=gh-ost */ TABLE mytable ADD COLUMN c INT, then there's no syntax error, nothin gto tell the user "did you actually intend to..." and what happens is that Vitess does not see a Vitess hint, skips the comment, and actually runs normal ALTER TABLE in production (see 1.)
    That's actually what frightens me the most. I want users to have a great experience, and this lets users shoot themselves in the foot.

I'd like to have a reasonable discussion before making a decision, and getting as many viewpoints as possible. (4b.) is to me the main risk and reason to not using query comments. But perhaps I'm not seeing a simple way around the risk.

@shlomi-noach
Copy link
Contributor Author

  1. Specialized syntax complicates the parser, and can potentially introduce a conflict in a futuristic change of syntax in MySQL.

@cjjb
Copy link

cjjb commented Sep 25, 2020

As a developer I would request you go with the query comments because:

  1. It would make it more future proof and vendor agnostic.
  2. Future migration libraries could decide between running migrations through vitess or directly on the SQL instances.
  3. It would make it easier for local testing when running local instances of databases.
  4. To disable the in query code would be as easy as removing a single character so a simple regex could be used to enable/disable it safely for lenghty migration.

As a way of preventing certain errors in the in query comment I would argue VTExplain could be used, including the part of a warning for a missing "+". If someone is planning on sending special messages to vitess then he/she should have access to the other VT tools and would use them to test and verify.

@harshit-gangal
Copy link
Member

  1. With query comments there is no vendor lock-in.
  2. In Mysql OK Packet we can add info providing details about the alter statement executed. This information can provide details about the alter statement that ran. INFO part in OK Packet is human readable.

@shlomi-noach
Copy link
Contributor Author

@harshit-gangal question: regarding OK packet, this seems orthogonal to the question of comments/syntax, correct?

@dweitzman
Copy link
Member

Another potential risk with comments is that the official mysql CLI strips comments by default. That seems like it could lead to accidents or confusion if removing comments could cause different or incorrect behavior.

@shlomi-noach
Copy link
Contributor Author

@dweitzman thank you! In interesting timing I just found out the same, thanks @systay for pairing with me to find this.

This is a serious problem. To reiterate, the standard mysql client strips out query comments before sending the query to the server (vtgate in our case). All comments are stripped out, except:

  • /*! mysql special comments */
  • if you run mysql -c

This behavior is only (to the best of my knowledge right now) found in the mysql client, and not in the various database connectors (golang, ruby, etc.).

For both DROP TABLE and ALTER TABLE statements, however, it is reasonable to expect users to use the mysql client. DROP and ALTER are common DBA tasks, and many run those manually over SSH/mysql connection.

Thus, if we're to use query comments, we will have to use the MySQL special comment syntax, /*! ... */, and not the Vitess syntax /*vt+ ... */ which is implicitly stripped out.

Related side note on MySQL comments

The MySQL syntax, which includes an exclamation mark, is also a bit problematic with mysql command line when executed in sh/bash and with double quotes. The following invocation: mysql -e "drop /*! some-hint */ table abc" causes the exclamation mark to extrapolate previous command into the string; the correct form must be mysql -e 'drop /*! some-hint */ table abc.

This is simple enough with DROP TABLE, but not so much with ALTER TABLE, where single quotes are commonly found in the ALTER statement; as the reader may know, escaping single quotes within single quotes in bash is complicated in itself.

This is all to say: the end user's experience is going to become more unpleasant.

summary of this comment

We must either:

  • Use MySQL special query comment syntax:/*! ... */, or
  • Use special SQL syntax

@shlomi-noach
Copy link
Contributor Author

If we're to use MySQL special query comments, things get even more complicated. That's because we'll need to strip them out -- MySQL accepts those comments as query text. But now we must remove those, and only if we match them to be vitess hints.

@shlomi-noach
Copy link
Contributor Author

I have to say I can't converge on the correct approach. The fact that the standard mysql client strips comments by default is a big deal.

@shlomi-noach
Copy link
Contributor Author

I've had this idea of embedding comments within comments by way of bypassing the MySQL comment issue, e.g. drop /*! /* our hints here */ */ table or drop /*! /* our hints here */ table. None of this works as desired.

@shlomi-noach
Copy link
Contributor Author

Controversial idea: what if, in ALTER TABLE, we actually default to online DDL? e.g. controlled by vtablet --default-ddl=gh-ost or similar, and then also support ALTER WITH 'pt-osc' TABLE or ALTER WITH 'standard' TABLE... syntax?

@shlomi-noach
Copy link
Contributor Author

shlomi-noach commented Nov 8, 2020

Another idea: how about this gets sorted out by a session variable? That is, a special session variable intercepted by Vitess (and does not get forwarded to MySQL). Consider:

SET SESSION vt_safe_drop_table=1;
DROP TABLE my_table;
SET SESSION vt_online_ddl='gh-ost';
ALTER TABLE my_table ADD COLUMN i INT;

@harshit-gangal
Copy link
Member

I like this approach.

SET SESSION vt_online_ddl='gh-ost';
ALTER TABLE my_table ADD COLUMN i INT;

@shlomi-noach
Copy link
Contributor Author

#7042 added:

  • new @@ddl_strategy session variable
  • new -ddl-strategy command line flag for vtgate (implies the default value for @@ddl_strategy)
  • new -ddl_strategy command line flag for vtctl

@shlomi-noach
Copy link
Contributor Author

#7069 removed WITH ... syntax from queries.

@shlomi-noach
Copy link
Contributor Author

Documentation PR: vitessio/website#598

@shlomi-noach
Copy link
Contributor Author

We've made a decision with online DDL, and will use similar approach to safe DROP TABLE. In terms of the discussion I think we're done and I'm closing this issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants