-
Notifications
You must be signed in to change notification settings - Fork 482
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implementation of https://blueprints.launchpad.net/percona-server/+spec/backup-safe-binlog-info #122
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
https://blueprints.launchpad.net/percona-server/+spec/backup-safe-binlog-info One inefficiency of the backup locks feature is that even though LOCK TABLES FOR BACKUP as a light-weight FTWRL alternative does not affect DML statements updating InnoDB tables, LOCK BINLOG FOR BACKUP does affect them by blocking commits. XtraBackup uses LOCK BINLOG FOR BACKUP to: 1. retrieve consistent binary log coordinates with SHOW MASTER STATUS. More precisely, binary log coordinates must be consistent with the REDO log copy and non-transactional tables. Therefore, no updates can be done to non-transactional tables (this is achieved by an active LOCK TABLES FOR BACKUP lock), and no commits can be performed between SHOW MASTER STATUS and finalizing the redo log copy, which is achieved by LOCK BINLOG FOR BACKUP. 2. retrieve consistent master connection information for a replication slave. More precisely, the binary log coordinates on the master as reported by SHOW SLAVE STATUS must be consistent with the REDO log copy, so LOCK BINLOG FOR BACKUP also block the I/O replication thread. 3. For a GTID-enabled PXC node, the last binary log file must be included into an SST snapshot. Which is a rather artificial limitation on the WSREP side, but still XtraBackup obeys it by blocking commits with LOCK BINLOG FOR BACKUP to ensure the integrity of the binary log file copy. Depending on the write rate on the server, finalizing the REDO log copy may take a long time, so blocking commits for that duration may still affect server availability considerably. This task is to make the necessary server-side change to make it possible for XtraBackup to avoid LOCK BINLOG FOR BINLOG in case percona#1, when cases percona#2 and percona#3 do not apply, i.e. when no --slave-info is requested by the XtraBackup options and the server is not a GTID-enabled PXC node. Lifting limitations for cases percona#2 and percona#3 is also possible, but that is outside the scope of this task. The idea of the optimization is that even though InnoDB provides a transactional storage for the binary log information (i.e. current file name and offset), it cannot be fully trusted by XtraBackup, because that information is only updated on an InnoDB commit operation. Which means if the last operation before LOCK TABLES FOR BACKUP was an update to a non-transactional storage engine, and no InnoDB commits occur before the backup is finalized by XtraBackup, the InnoDB system header will contain stale binary log coordinates. One way to fix that would be to force binlog coordinates update in the InnoDB system header on each update, regardless of the involved storage engine(s). This is what a Galera node does to ensure XID consistency which is stored in the same way as binary log coordinates: it forces XID update in the InnoDB system header on each TOI operation, in particular on each non-transactional update. Another approach is less invasive: XtraBackup blocks all non-transactional updates with LOCK TABLES FOR BACKUP anyway, so instead of having all non-transactional updates flush binlog coordinates to InnoDB unconditionally, LTFB can be modified to flush (and redo-log) the current binlog coordinates to InnoDB. In which case binlog coordinates provided by InnoDB will be consistent with REDO log under any circumstances. This patch implements the latter approach.
george-lorch
pushed a commit
to george-lorch/percona-server
that referenced
this pull request
May 7, 2016
Summary: RocksDB has some static object dependencies, and without following dependencies, it hits SIGABRT. This diff has two fixes. 1. Updating RocksDB revision. D51789 in RocksDB changes the problematic mutex from global variable to local (function-level) static variable. 2. Changing MyRocks no to call ddl_manager.persist_stats() by kill_server_thread (which calls rocksdb_done_func()). By calling ddl_manager.persist_stats(), rocksdb::ThreadLocalPtr::StaticMeta::OnThreadExit() (which calls pthread_mutex_lock on "static port::Mutex mutex") is called at the end of the thread. On the other hand, there is a race between main thread and kill_server_thread. main thread calls pthread_mutex_destroy for the mutex. To guarantee not to call pthread_mutex_lock() after pthread_mutex_destroy(), it is necessary not to call ddl_manager.persist_stats() from kill_server_thread. This diff fixes the issue. @update-submodule: rocksdb Test Plan: mtr cardinality --repeat=100 Reviewers: hermanlee4, spetrunia, jkedgar Reviewed By: jkedgar Subscribers: yhchiang, webscalesql-eng Differential Revision: https://reviews.facebook.net/D51933
george-lorch
pushed a commit
to george-lorch/percona-server
that referenced
this pull request
May 9, 2016
Summary: RocksDB has some static object dependencies, and without following dependencies, it hits SIGABRT. This diff has two fixes. 1. Updating RocksDB revision. D51789 in RocksDB changes the problematic mutex from global variable to local (function-level) static variable. 2. Changing MyRocks no to call ddl_manager.persist_stats() by kill_server_thread (which calls rocksdb_done_func()). By calling ddl_manager.persist_stats(), rocksdb::ThreadLocalPtr::StaticMeta::OnThreadExit() (which calls pthread_mutex_lock on "static port::Mutex mutex") is called at the end of the thread. On the other hand, there is a race between main thread and kill_server_thread. main thread calls pthread_mutex_destroy for the mutex. To guarantee not to call pthread_mutex_lock() after pthread_mutex_destroy(), it is necessary not to call ddl_manager.persist_stats() from kill_server_thread. This diff fixes the issue. @update-submodule: rocksdb Test Plan: mtr cardinality --repeat=100 Reviewers: hermanlee4, spetrunia, jkedgar Reviewed By: jkedgar Subscribers: yhchiang, webscalesql-eng Differential Revision: https://reviews.facebook.net/D51933
inikep
pushed a commit
to inikep/percona-server
that referenced
this pull request
Apr 23, 2020
Summary: RocksDB has some static object dependencies, and without following dependencies, it hits SIGABRT. This diff has two fixes. 1. Updating RocksDB revision. D51789 in RocksDB changes the problematic mutex from global variable to local (function-level) static variable. 2. Changing MyRocks no to call ddl_manager.persist_stats() by kill_server_thread (which calls rocksdb_done_func()). By calling ddl_manager.persist_stats(), rocksdb::ThreadLocalPtr::StaticMeta::OnThreadExit() (which calls pthread_mutex_lock on "static port::Mutex mutex") is called at the end of the thread. On the other hand, there is a race between main thread and kill_server_thread. main thread calls pthread_mutex_destroy for the mutex. To guarantee not to call pthread_mutex_lock() after pthread_mutex_destroy(), it is necessary not to call ddl_manager.persist_stats() from kill_server_thread. This diff fixes the issue. update-submodule: rocksdb Reviewed By: jkedgar Differential Revision: https://reviews.facebook.net/D51933 fbshipit-source-id: ca6e0722d88
inikep
pushed a commit
to inikep/percona-server
that referenced
this pull request
Feb 24, 2021
Summary: RocksDB has some static object dependencies, and without following dependencies, it hits SIGABRT. This diff has two fixes. 1. Updating RocksDB revision. D51789 in RocksDB changes the problematic mutex from global variable to local (function-level) static variable. 2. Changing MyRocks no to call ddl_manager.persist_stats() by kill_server_thread (which calls rocksdb_done_func()). By calling ddl_manager.persist_stats(), rocksdb::ThreadLocalPtr::StaticMeta::OnThreadExit() (which calls pthread_mutex_lock on "static port::Mutex mutex") is called at the end of the thread. On the other hand, there is a race between main thread and kill_server_thread. main thread calls pthread_mutex_destroy for the mutex. To guarantee not to call pthread_mutex_lock() after pthread_mutex_destroy(), it is necessary not to call ddl_manager.persist_stats() from kill_server_thread. This diff fixes the issue. update-submodule: rocksdb Reviewed By: jkedgar Differential Revision: https://reviews.facebook.net/D51933 fbshipit-source-id: ca6e0722d88
inikep
pushed a commit
to inikep/percona-server
that referenced
this pull request
Nov 15, 2021
Summary: RocksDB has some static object dependencies, and without following dependencies, it hits SIGABRT. This diff has two fixes. 1. Updating RocksDB revision. D51789 in RocksDB changes the problematic mutex from global variable to local (function-level) static variable. 2. Changing MyRocks no to call ddl_manager.persist_stats() by kill_server_thread (which calls rocksdb_done_func()). By calling ddl_manager.persist_stats(), rocksdb::ThreadLocalPtr::StaticMeta::OnThreadExit() (which calls pthread_mutex_lock on "static port::Mutex mutex") is called at the end of the thread. On the other hand, there is a race between main thread and kill_server_thread. main thread calls pthread_mutex_destroy for the mutex. To guarantee not to call pthread_mutex_lock() after pthread_mutex_destroy(), it is necessary not to call ddl_manager.persist_stats() from kill_server_thread. This diff fixes the issue. update-submodule: rocksdb Reviewed By: jkedgar Differential Revision: https://reviews.facebook.net/D51933 fbshipit-source-id: e1317bab787
ldonoso
pushed a commit
to ldonoso/percona-server
that referenced
this pull request
Mar 15, 2022
Summary: RocksDB has some static object dependencies, and without following dependencies, it hits SIGABRT. This diff has two fixes. 1. Updating RocksDB revision. D51789 in RocksDB changes the problematic mutex from global variable to local (function-level) static variable. 2. Changing MyRocks no to call ddl_manager.persist_stats() by kill_server_thread (which calls rocksdb_done_func()). By calling ddl_manager.persist_stats(), rocksdb::ThreadLocalPtr::StaticMeta::OnThreadExit() (which calls pthread_mutex_lock on "static port::Mutex mutex") is called at the end of the thread. On the other hand, there is a race between main thread and kill_server_thread. main thread calls pthread_mutex_destroy for the mutex. To guarantee not to call pthread_mutex_lock() after pthread_mutex_destroy(), it is necessary not to call ddl_manager.persist_stats() from kill_server_thread. This diff fixes the issue. @update-submodule: rocksdb Reviewed By: jkedgar Differential Revision: https://reviews.facebook.net/D51933
inikep
pushed a commit
to inikep/percona-server
that referenced
this pull request
Apr 9, 2024
Summary: RocksDB has some static object dependencies, and without following dependencies, it hits SIGABRT. This diff has two fixes. 1. Updating RocksDB revision. D51789 in RocksDB changes the problematic mutex from global variable to local (function-level) static variable. 2. Changing MyRocks no to call ddl_manager.persist_stats() by kill_server_thread (which calls rocksdb_done_func()). By calling ddl_manager.persist_stats(), rocksdb::ThreadLocalPtr::StaticMeta::OnThreadExit() (which calls pthread_mutex_lock on "static port::Mutex mutex") is called at the end of the thread. On the other hand, there is a race between main thread and kill_server_thread. main thread calls pthread_mutex_destroy for the mutex. To guarantee not to call pthread_mutex_lock() after pthread_mutex_destroy(), it is necessary not to call ddl_manager.persist_stats() from kill_server_thread. This diff fixes the issue. @update-submodule: rocksdb Reviewed By: jkedgar Differential Revision: https://reviews.facebook.net/D51933
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
One inefficiency of the backup locks feature is that even though LOCK
TABLES FOR BACKUP as a light-weight FTWRL alternative does not affect
DML statements updating InnoDB tables, LOCK BINLOG FOR BACKUP does
affect them by blocking commits.
XtraBackup uses LOCK BINLOG FOR BACKUP to:
STATUS. More precisely, binary log coordinates must be consistent
with the REDO log copy and non-transactional tables. Therefore, no
updates can be done to non-transactional tables (this is achieved by
an active LOCK TABLES FOR BACKUP lock), and no commits can be
performed between SHOW MASTER STATUS and finalizing the redo log
copy, which is achieved by LOCK BINLOG FOR BACKUP.
slave. More precisely, the binary log coordinates on the master as
reported by SHOW SLAVE STATUS must be consistent with the REDO log
copy, so LOCK BINLOG FOR BACKUP also block the I/O replication thread.
included into an SST snapshot. Which is a rather artificial
limitation on the WSREP side, but still XtraBackup obeys it by
blocking commits with LOCK BINLOG FOR BACKUP to ensure the integrity
of the binary log file copy.
Depending on the write rate on the server, finalizing the REDO log copy
may take a long time, so blocking commits for that duration may still
affect server availability considerably.
This task is to make the necessary server-side change to make it
possible for XtraBackup to avoid LOCK BINLOG FOR BINLOG in case #1, when
cases #2 and #3 do not apply, i.e. when no --slave-info is requested by
the XtraBackup options and the server is not a GTID-enabled PXC node.
Lifting limitations for cases #2 and #3 is also possible, but that is
outside the scope of this task.
The idea of the optimization is that even though InnoDB provides a
transactional storage for the binary log information (i.e. current file
name and offset), it cannot be fully trusted by XtraBackup, because that
information is only updated on an InnoDB commit operation. Which means
if the last operation before LOCK TABLES FOR BACKUP was an update to a
non-transactional storage engine, and no InnoDB commits occur before the
backup is finalized by XtraBackup, the InnoDB system header will contain
stale binary log coordinates.
One way to fix that would be to force binlog coordinates update in the
InnoDB system header on each update, regardless of the involved storage
engine(s). This is what a Galera node does to ensure XID consistency
which is stored in the same way as binary log coordinates: it forces XID
update in the InnoDB system header on each TOI operation, in particular
on each non-transactional update.
Another approach is less invasive: XtraBackup blocks all
non-transactional updates with LOCK TABLES FOR BACKUP anyway, so instead
of having all non-transactional updates flush binlog coordinates to
InnoDB unconditionally, LTFB can be modified to flush (and redo-log) the
current binlog coordinates to InnoDB. In which case binlog coordinates
provided by InnoDB will be consistent with REDO log under any
circumstances.
This patch implements the latter approach.