Make mysqldump work with gitbase #361
Comments
We need to support the following things:
I remember the discussion around supporting MySQL Workbench, and we ended up closing it because it was not a priority. Should we keep working on this, then, given that it has the same constraints we had in that issue? |
@erizocosmico gitbase interoperability is a mid-priority (P1) objective for Q3, but the tools that should be initially included in this interoperability are still not defined.
|
Supported so far:
However, these are way harder to "hack" (right now I just have a special case for them to get past them, but that should not go into the code):
SELECT LOGFILE_GROUP_NAME, FILE_NAME, TOTAL_EXTENTS, INITIAL_SIZE, ENGINE, EXTRA FROM INFORMATION_SCHEMA.FILES WHERE FILE_TYPE = 'UNDO LOG' AND FILE_NAME IS NOT NULL AND LOGFILE_GROUP_NAME IS NOT NULL GROUP BY LOGFILE_GROUP_NAME, FILE_NAME, ENGINE, TOTAL_EXTENTS, INITIAL_SIZE ORDER BY LOGFILE_GROUP_NAME
SELECT DISTINCT TABLESPACE_NAME, FILE_NAME, LOGFILE_GROUP_NAME, EXTENT_SIZE, INITIAL_SIZE, ENGINE FROM INFORMATION_SCHEMA.FILES WHERE FILE_TYPE = 'DATAFILE' ORDER BY TABLESPACE_NAME, LOGFILE_GROUP_NAME
These are all the queries being run:
/*!40100 SET @@SQL_MODE='' */
/*!40103 SET TIME_ZONE='+00:00' */
SHOW VARIABLES LIKE 'gtid\_mode'
SELECT LOGFILE_GROUP_NAME, FILE_NAME, TOTAL_EXTENTS, INITIAL_SIZE, ENGINE, EXTRA FROM INFORMATION_SCHEMA.FILES WHERE FILE_TYPE = 'UNDO LOG' AND FILE_NAME IS NOT NULL AND LOGFILE_GROUP_NAME IS NOT NULL GROUP BY LOGFILE_GROUP_NAME, FILE_NAME, ENGINE, TOTAL_EXTENTS, INITIAL_SIZE ORDER BY LOGFILE_GROUP_NAME
SELECT DISTINCT TABLESPACE_NAME, FILE_NAME, LOGFILE_GROUP_NAME, EXTENT_SIZE, INITIAL_SIZE, ENGINE FROM INFORMATION_SCHEMA.FILES WHERE FILE_TYPE = 'DATAFILE' ORDER BY TABLESPACE_NAME, LOGFILE_GROUP_NAME
SHOW DATABASES
SHOW VARIABLES LIKE 'ndbinfo\_version'
SHOW CREATE DATABASE IF NOT EXISTS ``
show tables
LOCK TABLES `blobs` READ /*!32311 LOCAL */,`commit_blobs` READ /*!32311 LOCAL */,`commit_files` READ /*!32311 LOCAL */,`commit_trees` READ /*!32311 LOCAL */,`commits` READ /*!32311 LOCAL */,`files` READ /*!32311 LOCAL */,`ref_commits` READ /*!32311 LOCAL */,`refs` READ /*!32311 LOCAL */,`remotes` READ /*!32311 LOCAL */,`repositories` READ /*!32311 LOCAL */,`tree_entries` READ /*!32311 LOCAL */
And then
Although that query has never been executed in the server (at least, it never got to the engine). According to the manual, this is the reason this error could happen, which is very, very helpful: https://dev.mysql.com/doc/refman/8.0/en/commands-out-of-sync.html So, I'm kind of at a dead end here. Any thoughts? @smola @ajnavarro |
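For illustration, the kind of special-casing described above could look roughly like the sketch below. It is written in Python only to show the idea (gitbase itself is a Go project), and `stub_result` and the `_STUBS` table are hypothetical names. The column lists are taken from the two INFORMATION_SCHEMA.FILES queries quoted above. The intent is to recognise a query that the engine cannot handle and answer it with an empty result set before it reaches the engine.

```python
import re

# Hypothetical stub table: each entry pairs a pattern with the columns of
# the empty result set to return. Columns are copied from the two
# INFORMATION_SCHEMA.FILES queries that mysqldump sends.
_STUBS = [
    (
        re.compile(
            r"FROM\s+INFORMATION_SCHEMA\.FILES\s+WHERE\s+FILE_TYPE\s*=\s*'UNDO LOG'",
            re.IGNORECASE,
        ),
        ["LOGFILE_GROUP_NAME", "FILE_NAME", "TOTAL_EXTENTS",
         "INITIAL_SIZE", "ENGINE", "EXTRA"],
    ),
    (
        re.compile(
            r"FROM\s+INFORMATION_SCHEMA\.FILES\s+WHERE\s+FILE_TYPE\s*=\s*'DATAFILE'",
            re.IGNORECASE,
        ),
        ["TABLESPACE_NAME", "FILE_NAME", "LOGFILE_GROUP_NAME",
         "EXTENT_SIZE", "INITIAL_SIZE", "ENGINE"],
    ),
]


def stub_result(query):
    """Return (columns, rows) for a stubbed query, or None to run it normally."""
    for pattern, columns in _STUBS:
        if pattern.search(query):
            return columns, []  # right columns, zero rows
    return None
```

A table like this keeps the hack in one place, but it matches on query text, so it would break as soon as a client changes the wording of the query.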
I suggest using a more dumb thing such as |
I will try with that. I tried with |
UPDATE: same errors. Dump is completed, but the result is riddled with stuff like this:
|
@ajnavarro just a friendly ping that this is an OKR for this Q. |
Actually, the description is really broad:
We discovered that each tool runs totally different queries. We focused on being compatible with the MariaDB JDBC driver, so that gitbase works with JVM applications like Spark. Anyway, we'll have a look to see if we can make it work without implementing a lot of new statements. |
Ran a mysql server with the log on just to get all the queries we will need to support for mysqldump to work. This is the full list of queries executed by mysqldump:
/*!40100 SET @@SQL_MODE='' */
/*!40103 SET TIME_ZONE='+00:00' */
/*!80000 SET SESSION information_schema_stats_expiry=0 */
SET SESSION NET_READ_TIMEOUT= 700, SESSION NET_WRITE_TIMEOUT= 700
SHOW VARIABLES LIKE 'gtid\_mode'
SELECT LOGFILE_GROUP_NAME, FILE_NAME, TOTAL_EXTENTS, INITIAL_SIZE, ENGINE, EXTRA FROM INFORMATION_SCHEMA.FILES WHERE FILE_TYPE = 'UNDO LOG' AND FILE_NAME IS NOT NULL AND LOGFILE_GROUP_NAME IS NOT NULL GROUP BY LOGFILE_GROUP_NAME, FILE_NAME, ENGINE, TOTAL_EXTENTS, INITIAL_SIZE ORDER BY LOGFILE_GROUP_NAME
SELECT DISTINCT TABLESPACE_NAME, FILE_NAME, LOGFILE_GROUP_NAME, EXTENT_SIZE, INITIAL_SIZE, ENGINE FROM INFORMATION_SCHEMA.FILES WHERE FILE_TYPE = 'DATAFILE' ORDER BY TABLESPACE_NAME, LOGFILE_GROUP_NAME
SHOW DATABASES
SHOW VARIABLES LIKE 'ndbinfo\_version'
SHOW CREATE DATABASE IF NOT EXISTS `foo`
show tables
LOCK TABLES `bar` READ /*!32311 LOCAL */
show table status like 'bar'
SET SQL_QUOTE_SHOW_CREATE=1
SET SESSION character_set_results = 'binary'
show create table `bar`
SET SESSION character_set_results = 'utf8mb4'
show fields from `bar`
show fields from `bar`
SELECT /*!40001 SQL_NO_CACHE */ * FROM `bar`
SET SESSION character_set_results = 'binary'
use `foo`
select @@collation_database
SHOW TRIGGERS LIKE 'bar'
SET SESSION character_set_results = 'utf8mb4'
SET SESSION character_set_results = 'binary'
SELECT COLUMN_NAME, JSON_EXTRACT(HISTOGRAM, '$."number-of-buckets-specified"') FROM information_schema.COLUMN_STATISTICS WHERE SCHEMA_NAME = 'foo' AND TABLE_NAME = 'bar'
SET SESSION character_set_results = 'utf8mb4'
UNLOCK TABLES
Maybe we can reduce those queries with some flags. |
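A list like the one above can be pulled out of a server log with a small script. The sketch below assumes the file-based general query log layout of timestamp, thread id, command and argument on each line. The sample lines are made up for the example, and `extract_queries` is a hypothetical helper, not part of any tool mentioned in this thread.

```python
import re

# Assumed layout of one general query log line:
#   <timestamp> <thread id> <command> <argument>
# Only lines whose command is "Query" are kept.
_LOG_LINE = re.compile(r"^\S+\s+(\d+)\s+Query\s+(.*)$")


def extract_queries(log_lines, thread_id=None):
    """Return the query text of every Query line, optionally for one thread."""
    queries = []
    for line in log_lines:
        match = _LOG_LINE.match(line.rstrip("\n"))
        if match is None:
            continue  # Connect, Quit, headers, continuation lines, ...
        if thread_id is not None and int(match.group(1)) != thread_id:
            continue
        queries.append(match.group(2))
    return queries


# Made-up sample in the assumed format.
sample = [
    "2018-07-05T10:03:19.000000Z\t    8 Connect\troot@localhost on foo using Socket\n",
    "2018-07-05T10:03:19.000100Z\t    8 Query\tSHOW DATABASES\n",
    "2018-07-05T10:03:19.000200Z\t    8 Query\tshow tables\n",
    "2018-07-05T10:03:19.000300Z\t    8 Quit\t\n",
]
print(extract_queries(sample))
```

Filtering by thread id helps when other clients are connected to the same server while mysqldump runs. Queries that span several log lines are not reassembled by this sketch.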
It would be great to find some flags that reduce the number of executed queries. |
The most I've been able to reduce it is by using |
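The exact flags used in the comment above are not preserved here. As a hedged starting point, the sketch below assembles a mysqldump invocation from options that mysqldump documents and that are expected to skip some of the queries in the list. Which query each option actually removes is an assumption to verify against the server log, `--column-statistics` only exists in 8.0 clients, and the host, port and user values are placeholders.

```python
import shlex

# Documented mysqldump options. The comment after each one names the query
# it is expected to avoid (an assumption, not confirmed in this thread).
REDUCING_FLAGS = [
    "--no-tablespaces",       # the two INFORMATION_SCHEMA.FILES queries
    "--skip-lock-tables",     # LOCK TABLES ... READ LOCAL
    "--skip-triggers",        # SHOW TRIGGERS LIKE ...
    "--set-gtid-purged=OFF",  # SHOW VARIABLES LIKE for gtid_mode
    "--column-statistics=0",  # information_schema.COLUMN_STATISTICS (8.0 client only)
]


def mysqldump_command(database, host="127.0.0.1", port=3306, user="root"):
    """Build the argument list for mysqldump; connection values are placeholders."""
    return [
        "mysqldump",
        f"--host={host}",
        f"--port={port}",
        f"--user={user}",
        *REDUCING_FLAGS,
        database,
    ]


print(shlex.join(mysqldump_command("foo")))
```

Building the command as a list makes it easy to drop one flag at a time and compare the server log between runs.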
Queries performed by mysqldump with outputs

These are the queries mysqldump performs and the outputs a real mysql server would output.
/*!40100 SET @@SQL_MODE='' */
Output: no rows
/*!40103 SET TIME_ZONE='+00:00' */
Output: no rows
/*!80000 SET SESSION information_schema_stats_expiry=0 */
Output: no rows
SET SESSION NET_READ_TIMEOUT= 700, SESSION NET_WRITE_TIMEOUT= 700
Output: no rows
SHOW VARIABLES LIKE 'gtid\_mode'
Output:
SELECT LOGFILE_GROUP_NAME, FILE_NAME, TOTAL_EXTENTS, INITIAL_SIZE, ENGINE, EXTRA FROM INFORMATION_SCHEMA.FILES WHERE FILE_TYPE = 'UNDO LOG' AND FILE_NAME IS NOT NULL AND LOGFILE_GROUP_NAME IS NOT NULL GROUP BY LOGFILE_GROUP_NAME, FILE_NAME, ENGINE, TOTAL_EXTENTS, INITIAL_SIZE ORDER BY LOGFILE_GROUP_NAME
Output: no rows
SELECT DISTINCT TABLESPACE_NAME, FILE_NAME, LOGFILE_GROUP_NAME, EXTENT_SIZE, INITIAL_SIZE, ENGINE FROM INFORMATION_SCHEMA.FILES WHERE FILE_TYPE = 'DATAFILE' ORDER BY TABLESPACE_NAME, LOGFILE_GROUP_NAME
Output: no rows
SHOW DATABASES
Output:
SHOW VARIABLES LIKE 'ndbinfo\_version'
Output: no rows
SHOW CREATE DATABASE IF NOT EXISTS `foo`
Output:
show tables
Output: tables
LOCK TABLES `bar` READ /*!32311 LOCAL */
Output: 0 rows
show table status like 'bar'
Output:
SET SQL_QUOTE_SHOW_CREATE=1
Output: no rows
SET SESSION character_set_results = 'binary'
Output: no rows
show create table `bar`
Output:
SET SESSION character_set_results = 'utf8mb4'
Output: no rows
show fields from `bar`
Output:
SELECT /*!40001 SQL_NO_CACHE */ * FROM `bar`
Output: everything in the table
SET SESSION character_set_results = 'binary'
Output: 0 rows
use `foo`
Output: 0 rows
select @@collation_database
Output:
SET SESSION character_set_results = 'utf8mb4'
Output: no rows
SET SESSION character_set_results = 'binary'
Output: no rows
SELECT COLUMN_NAME, JSON_EXTRACT(HISTOGRAM, '$."number-of-buckets-specified"') FROM information_schema.COLUMN_STATISTICS WHERE SCHEMA_NAME = 'foo' AND TABLE_NAME = 'bar'
Output: no rows
SET SESSION character_set_results = 'utf8mb4'
Output: no rows
UNLOCK TABLES
Output: no rows

Things that need to be implemented
@ajnavarro this is the full list of queries with their outputs and from them all the things we would need to implement for this to, in theory, be able to work correctly. Should we move forward with this, then? |
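Several of the statements in the list are wrapped in MySQL version-conditional comments such as `/*!40100 ... */`, which a MySQL server executes only when its version is at least the number given. The sketch below shows one way to resolve them before parsing. `expand_conditional_comments` is a hypothetical helper, and the version number passed in (for example 50700 for 5.7.0) is an assumption, since the version gitbase should report is not stated in this thread.

```python
import re

# /*!NNNNN body */ : the body runs only if the server version is >= NNNNN.
# /*! body */      : the body always runs on MySQL.
_CONDITIONAL = re.compile(r"/\*!(\d{5,6})?\s*(.*?)\s*\*/", re.DOTALL)


def expand_conditional_comments(sql, server_version):
    """Inline or drop version-conditional comments for the given version."""

    def replace(match):
        required = match.group(1)
        if required is None or int(required) <= server_version:
            return match.group(2)  # keep the body as plain SQL
        return ""                  # server too old: treat as a comment

    return _CONDITIONAL.sub(replace, sql).strip()


print(expand_conditional_comments("/*!40100 SET @@SQL_MODE='' */", 50700))
print(expand_conditional_comments("SELECT /*!40001 SQL_NO_CACHE */ * FROM `bar`", 50700))
```

With a reported version below 80000, the `information_schema_stats_expiry` statement expands to an empty string and can be answered without touching the engine.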
@erizocosmico totally. Could you open several issues so that we can parallelize the work? (some of those issues can be marked as |
Sure |
Closing, this was already merged |
Right now, if you try to run mysqldump, you get the following error: