FB8-144: Send events to async slaves after ACKed by at least one semi-sync slave (facebook#1006)

Summary:
Jira issue: https://jira.percona.com/browse/FB8-144

Reference Patch: facebook@cc4e803
Reference Patch: facebook@aad3ce2a54e
Reference Patch: facebook@b42b911fa15

---------- facebook@cc4e803 ----------

Send events to async slaves after ACKed by at least one semi-sync slave
Added a new mysql variable `wait_for_semi_sync_ack` to control
sending binlog events to async slaves. When this variable is set,
an event is sent to async slaves only after it is ACKed by at least one
semi-sync slave (if any).

Originally Reviewed By: Tema

---------- facebook@aad3ce2a54e ----------

Fixed rpl_wait_for_semi_sync_ack
The following changes were made to fix the feature:
 - Condition variable is registered in dump thread's THD before waiting so that
   the thread can respond to kill command
 - rpl_wait_for_semi_sync_ack is respected only when semi-sync master is enabled
   (i.e. rpl_semi_sync_master_enabled = 1)
 - Added a status variable Rpl_semi_sync_master_ack_waits which counts the
   number of times we waited for an ACK (useful for benchmarking)
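
The kill-aware wait and the wait counter described in the list above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the server code: `Session` stands in for the dump thread's THD, and `enter_cond`, `exit_cond`, `AckState`, `wait_for_ack` and `ack_waits` are hypothetical names chosen for the example. Binlog positions are reduced to a single integer.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstdint>
#include <mutex>

// Hypothetical, simplified stand-in for the dump thread's THD.
struct Session {
  std::mutex registry_mutex;  // guards waiting_mutex / waiting_cond
  std::mutex* waiting_mutex = nullptr;
  std::condition_variable* waiting_cond = nullptr;
  std::atomic<bool> killed{false};

  // Record which condition variable this thread is about to wait on.
  void enter_cond(std::mutex* m, std::condition_variable* c) {
    std::lock_guard<std::mutex> guard(registry_mutex);
    assert(waiting_cond == nullptr);  // one wait at a time per session
    waiting_mutex = m;
    waiting_cond = c;
  }

  void exit_cond() {
    std::lock_guard<std::mutex> guard(registry_mutex);
    waiting_mutex = nullptr;
    waiting_cond = nullptr;
  }

  // What a kill would do: set the flag, then wake the registered waiter.
  void kill() {
    killed.store(true);
    std::lock_guard<std::mutex> guard(registry_mutex);
    if (waiting_cond != nullptr) {
      // Briefly take the waiter's mutex so the wake-up cannot fall between
      // its predicate check and its call to wait().
      { std::lock_guard<std::mutex> sync(*waiting_mutex); }
      waiting_cond->notify_all();
    }
  }
};

// Hypothetical shared ACK state.
struct AckState {
  std::mutex mutex;
  std::condition_variable cond;
  std::uint64_t last_acked = 0;  // guarded by mutex
  std::uint64_t ack_waits = 0;   // guarded by mutex; times a thread had to wait
};

enum class WaitResult { kAcked, kKilled };

WaitResult wait_for_ack(Session& session, AckState& state, std::uint64_t pos) {
  session.enter_cond(&state.mutex, &state.cond);  // register before waiting
  WaitResult result;
  {
    std::unique_lock<std::mutex> lock(state.mutex);
    auto done = [&] { return session.killed.load() || pos <= state.last_acked; };
    if (!done()) ++state.ack_waits;  // count only waits that actually block
    state.cond.wait(lock, done);
    result = pos <= state.last_acked ? WaitResult::kAcked : WaitResult::kKilled;
  }
  session.exit_cond();
  return result;
}
```

Registering before taking the ACK mutex keeps the lock order the same in the waiter and in `kill()`, which is one way to avoid a deadlock between the two; the real server may order things differently.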

Originally Reviewed By: hermanlee

---------- facebook@b42b911fa15 ----------

Fix rpl_wait_for_semi_sync_ack feature
Fixed the following:
1. Initializing last acked position to what is retrieved from engine during
server startup. This makes sure that lagging async slaves are able to catch up
until the last acked position after master restarts.
2. Resetting last acked position when `RESET MASTER` is issued. This makes sure
that after the binlogs are reset we wait for acks.
3. Signalling/updating last acked positions only on events that were actually
acked by the semi-sync slave (like the Xid event of the last trx in a group
commit). This is done by signalling inside of the plugin
(ReplSemiSyncMaster::reportReplyBinlog).
4. Signalling/updating on trxs skipped on semi-sync slave connection while
searching for first gtid connection
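
The lifecycle of the last acked position in the list above (initialise at startup, advance on ACK, clear on reset) can be sketched as below. This is only an illustration: `BinlogCoord`, `LastAckedPos` and all method names are hypothetical, and comparing binlog file names as strings assumes fixed-width numeric suffixes. Locking is omitted to keep the sketch short.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <tuple>

// Hypothetical binlog coordinate; an empty file name means "nothing acked".
struct BinlogCoord {
  std::string file;
  std::uint64_t offset;
  bool empty() const { return file.empty(); }
};

// Assumes file names sort lexicographically in creation order.
inline bool operator<(const BinlogCoord& a, const BinlogCoord& b) {
  return std::tie(a.file, a.offset) < std::tie(b.file, b.offset);
}

class LastAckedPos {
 public:
  // Item 1: seed from the position recovered at server startup.
  void init_from_engine(const BinlogCoord& recovered) {
    assert(pos_.empty());  // expected to run once, before any ACK arrives
    pos_ = recovered;
  }

  // Item 2: forget everything when the binlogs are reset.
  void reset() { pos_ = BinlogCoord{}; }

  // Item 3: advance only on positions that were actually acked.
  void on_ack(const BinlogCoord& acked) {
    if (pos_ < acked) pos_ = acked;
  }

  // An async dump thread may send an event ending at `event_end` only if
  // something has been acked and the ACK covers that event.
  bool can_send(const BinlogCoord& event_end) const {
    return !pos_.empty() && !(pos_ < event_end);
  }

  const BinlogCoord& get() const { return pos_; }

 private:
  BinlogCoord pos_{};
};
```

With this shape, a lagging async replica that reconnects after a restart can be served everything up to the recovered position without waiting, while after a reset every event waits for a fresh ACK.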

Originally Reviewed By: hermanlee

--------------------------------------------------------------------

Wrapping last semi-sync acked pos in std::atomic to avoid locking in some scenarios

We'll now lock the mutex only if we need to wait for last acked
pos to update. Otherwise, we just check the current dump thread pos
against the last acked pos atomically and, if current is less than last
acked, we send the event without locking the mutex.
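
A rough sketch of this fast path, assuming the position fits in one 64-bit value: `AckedPosTracker`, `make_pos`, `report_ack` and `wait_until_acked` are hypothetical names, and treating an equal position as already acked (`<=`) is an assumption of the sketch rather than something the commit states.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstdint>
#include <mutex>

// Hypothetical encoding: file number in the high 32 bits, offset in the low
// 32 bits, so positions compare as plain integers.
using BinlogPos = std::uint64_t;

constexpr BinlogPos make_pos(std::uint32_t file_num, std::uint32_t offset) {
  return (static_cast<BinlogPos>(file_num) << 32) | offset;
}

class AckedPosTracker {
 public:
  // Called when a semi-sync replica ACKs everything up to `pos`.
  void report_ack(BinlogPos pos) {
    {
      std::lock_guard<std::mutex> guard(mutex_);
      if (pos <= last_acked_.load(std::memory_order_relaxed)) return;  // stale
      last_acked_.store(pos, std::memory_order_release);
    }
    cond_.notify_all();
  }

  // Called by an async dump thread before sending the event at `pos`.
  // Returns true if the caller had to block.
  bool wait_until_acked(BinlogPos pos) {
    // Fast path: one atomic load, no mutex.
    if (pos <= last_acked_.load(std::memory_order_acquire)) return false;

    // Slow path: take the mutex only because we may need to sleep.
    std::unique_lock<std::mutex> lock(mutex_);
    bool waited = false;
    while (pos > last_acked_.load(std::memory_order_acquire)) {
      waited = true;
      cond_.wait(lock);
    }
    assert(pos <= last_acked_.load(std::memory_order_relaxed));
    return waited;
  }

 private:
  std::atomic<BinlogPos> last_acked_{0};
  std::mutex mutex_;  // writers and sleepers only; the fast path skips it
  std::condition_variable cond_;
};
```

Writers still update under the mutex so that a sleeper cannot miss a notification; only the common "already acked" check avoids the lock.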
--------------------------------------------------------------------

Pull Request resolved: facebook#1006

Differential Revision: D16267709

Pulled By: abhinav04sharma

---------------------------------------------------------------------

Fix LLVM codegen for struct st_filenum_pos atomic operations (facebook#1183)

Summary:
LLVM has an issue where atomic operations on a struct with 32-bit fields are
compiled using libatomic library calls instead of direct assembly, as if the
whole struct were 32-bit aligned, i.e. its objects could cross a machine word
boundary: https://bugs.llvm.org/show_bug.cgi?id=45055.

Work around this issue by aligning the first 32-bit field at 64 bits.

This allows not linking mysys with libatomic.
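
The alignment workaround might look roughly like the sketch below. The struct and field names (`FileNumPos`, `file_num`, `pos`) are hypothetical and not taken from the patch; the only detail drawn from the description is that the first 32-bit field is aligned at 64 bits. Whether a given toolchain then emits inline atomics is not something this sketch can guarantee.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical layout: two 32-bit fields, with the first one forced to
// 64-bit alignment so the whole struct is 8 bytes, 8-byte aligned.
struct FileNumPos {
  alignas(8) std::uint32_t file_num;
  std::uint32_t pos;
};

static_assert(sizeof(FileNumPos) == 8, "two 32-bit fields, no padding");
static_assert(alignof(FileNumPos) == 8, "whole struct is 64-bit aligned");
static_assert(std::is_trivially_copyable<FileNumPos>::value,
              "required by std::atomic");

// Replaces the stored position and returns the previous one.
inline FileNumPos swap_pos(std::atomic<FileNumPos>& slot, FileNumPos next) {
  return slot.exchange(next);
}

// Small usage example: store, swap, and reload a position atomically.
inline void example_usage() {
  std::atomic<FileNumPos> slot{FileNumPos{1, 4}};
  FileNumPos previous = swap_pos(slot, FileNumPos{2, 120});
  assert(previous.file_num == 1 && previous.pos == 4);
  FileNumPos current = slot.load();
  assert(current.file_num == 2 && current.pos == 120);
}
```

The `static_assert` lines document the intent: if a later change to the struct breaks the size or alignment, the build fails instead of silently falling back to library calls.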

Pull Request resolved: facebook#1183

Reviewed By: abhinav04sharma

Differential Revision: D34379183

Pulled By: hermanlee
inikep committed Aug 6, 2024
1 parent 9e689a8 commit 0df8cf8
Showing 31 changed files with 1,009 additions and 32 deletions.
4 changes: 4 additions & 0 deletions mysql-test/r/mysqld--help-notwin.result
@@ -1958,6 +1958,9 @@ The following options may be given as the first argument:
--rpl-stop-slave-timeout=#
This option is deprecated. Use rpl_stop_replica_timeout
instead.
--rpl-wait-for-semi-sync-ack
Wait for events to be acked by a semi-sync slave before
sending them to the async slaves
--safe-user-create Don't allow new user creation by the user who has no
write privileges to the mysql.user table.
--schema-definition-cache=#
@@ -2872,6 +2875,7 @@ rpl-receive-buffer-size 2097152
rpl-send-buffer-size 2097152
rpl-stop-replica-timeout 31536000
rpl-stop-slave-timeout 31536000
rpl-wait-for-semi-sync-ack FALSE
safe-user-create FALSE
schema-definition-cache 256
secondary-engine-cost-threshold 100000
@@ -188,6 +188,7 @@ SET PERSIST_ONLY rpl_stop_slave_timeout = @@GLOBAL.rpl_stop_slave_timeout;
Warnings:
Warning 1287 '@@rpl_stop_slave_timeout' is deprecated and will be removed in a future release. Please use rpl_stop_replica_timeout instead.
Warning 1287 '@@rpl_stop_slave_timeout' is deprecated and will be removed in a future release. Please use rpl_stop_replica_timeout instead.
SET PERSIST_ONLY rpl_wait_for_semi_sync_ack = @@GLOBAL.rpl_wait_for_semi_sync_ack;
SET PERSIST_ONLY session_track_gtids = @@GLOBAL.session_track_gtids;
SET PERSIST_ONLY skip_replica_start = @@GLOBAL.skip_replica_start;
SET PERSIST_ONLY skip_slave_start = @@GLOBAL.skip_slave_start;
@@ -385,6 +386,7 @@ RESET PERSIST rpl_semi_sync_source_wait_point;
RESET PERSIST rpl_semi_sync_source_whitelist;
RESET PERSIST rpl_send_buffer_size;
RESET PERSIST rpl_stop_replica_timeout;
RESET PERSIST rpl_wait_for_semi_sync_ack;
RESET PERSIST session_track_gtids;
RESET PERSIST skip_replica_start;
RESET PERSIST slave_compression_lib;
@@ -168,6 +168,7 @@ SET PERSIST rpl_stop_slave_timeout = @@GLOBAL.rpl_stop_slave_timeout;
Warnings:
Warning 1287 '@@rpl_stop_slave_timeout' is deprecated and will be removed in a future release. Please use rpl_stop_replica_timeout instead.
Warning 1287 '@@rpl_stop_slave_timeout' is deprecated and will be removed in a future release. Please use rpl_stop_replica_timeout instead.
SET PERSIST rpl_wait_for_semi_sync_ack = @@GLOBAL.rpl_wait_for_semi_sync_ack;
SET PERSIST session_track_gtids = @@GLOBAL.session_track_gtids;
SET PERSIST skip_replica_start = @@GLOBAL.skip_replica_start;
ERROR HY000: Variable 'skip_replica_start' is a read only variable
@@ -399,6 +400,7 @@ RESET PERSIST IF EXISTS rpl_stop_replica_timeout;
RESET PERSIST IF EXISTS rpl_stop_slave_timeout;
Warnings:
Warning 3615 Variable rpl_stop_slave_timeout does not exist in persisted config file
RESET PERSIST IF EXISTS rpl_wait_for_semi_sync_ack;
RESET PERSIST IF EXISTS session_track_gtids;
RESET PERSIST IF EXISTS skip_replica_start;
Warnings:
33 changes: 33 additions & 0 deletions mysql-test/suite/rpl/r/rpl_semi_sync_master_error_handling.result
@@ -0,0 +1,33 @@
include/rpl_init.inc [topology=1->2,1->3]
Warnings:
Note #### Sending passwords in plain text without SSL/TLS is extremely insecure.
Note #### Storing MySQL user name or password information in the connection metadata repository is not secure and is therefore not recommended. Please consider using the USER and PASSWORD connection options for START REPLICA; see the 'START REPLICA Syntax' in the MySQL Manual for more information.
Warnings:
Note #### Sending passwords in plain text without SSL/TLS is extremely insecure.
Note #### Storing MySQL user name or password information in the connection metadata repository is not secure and is therefore not recommended. Please consider using the USER and PASSWORD connection options for START REPLICA; see the 'START REPLICA Syntax' in the MySQL Manual for more information.
include/rpl_connect.inc [creating master]
include/rpl_connect.inc [creating master1]
include/rpl_connect.inc [creating semi_sync_slave]
include/rpl_connect.inc [creating async_slave]
call mtr.add_suppression("Read semi-sync reply magic number error.");
call mtr.add_suppression("A message intended for a client cannot be sent there as no client-session is attached");
"Creating schema"
CREATE TABLE t1(a INT) engine = InnoDB;
include/sync_slave_sql_with_master.inc
include/sync_slave_sql_with_master.inc
"The semi sync slave will error out before sending ACK"
SET @@GLOBAL.DEBUG= '+d,error_before_semi_sync_reply';
"Inserting a row on the master"
INSERT INTO t1 VALUES(1);
"Waiting for the semi-sync slave to stop"
include/wait_for_slave_io_to_stop.inc
"Waiting for the async dump thread to wait for ACK"
include/assert.inc [Table in semi-sync slave should be empty.]
include/assert.inc [Table in async slave should be empty.]
"Starting semi-sync slave and cleaning up"
SET @@GLOBAL.DEBUG= '-d,error_before_semi_sync_reply';
START REPLICA;
DROP TABLE t1;
include/sync_slave_sql_with_master.inc
include/sync_slave_sql_with_master.inc
include/rpl_end.inc
@@ -0,0 +1 @@
$SEMISYNC_MASTER_PLUGIN_OPT --plugin-load=rpl_semi_sync_master=$SEMISYNC_MASTER_PLUGIN;rpl_semi_sync_slave=$SEMISYNC_SLAVE_PLUGIN
@@ -0,0 +1 @@
$SEMISYNC_SLAVE_PLUGIN_OPT $SEMISYNC_SLAVE_PLUGIN_LOAD
24 changes: 24 additions & 0 deletions mysql-test/suite/rpl/t/rpl_semi_sync_master_error_handling.cnf
@@ -0,0 +1,24 @@
!include ../my.cnf

[mysqld.1]
rpl_semi_sync_master_enabled=1
rpl_semi_sync_master_timeout=86400000 # 1 day
rpl_wait_for_semi_sync_ack=1
log_slave_updates
gtid_mode=ON
enforce_gtid_consistency=ON

[mysqld.2]
rpl_semi_sync_slave_enabled=1
log_slave_updates
gtid_mode=ON
enforce_gtid_consistency=ON

[mysqld.3]
rpl_semi_sync_slave_enabled=0
log_slave_updates
gtid_mode=ON
enforce_gtid_consistency=ON

[ENV]
SERVER_MYPORT_3= @mysqld.3.port
78 changes: 78 additions & 0 deletions mysql-test/suite/rpl/t/rpl_semi_sync_master_error_handling.test
@@ -0,0 +1,78 @@
--source include/have_debug.inc

--let $rpl_topology= 1->2,1->3
--source include/rpl_init.inc

--let $rpl_connection_name= master
--let $rpl_server_number= 1
--source include/rpl_connect.inc

--let $rpl_connection_name= master1
--let $rpl_server_number= 1
--source include/rpl_connect.inc

--let $rpl_connection_name= semi_sync_slave
--let $rpl_server_number= 2
--source include/rpl_connect.inc

--let $rpl_connection_name= async_slave
--let $rpl_server_number= 3
--source include/rpl_connect.inc

--connection master
call mtr.add_suppression("Read semi-sync reply magic number error.");
call mtr.add_suppression("A message intended for a client cannot be sent there as no client-session is attached");

--echo "Creating schema"
CREATE TABLE t1(a INT) engine = InnoDB;

--let $sync_slave_connection= semi_sync_slave
--source include/sync_slave_sql_with_master.inc
--connection master
--let $sync_slave_connection= async_slave
--source include/sync_slave_sql_with_master.inc

--echo "The semi sync slave will error out before sending ACK"
--connection semi_sync_slave
SET @@GLOBAL.DEBUG= '+d,error_before_semi_sync_reply';

--echo "Inserting a row on the master"
--connection master
send INSERT INTO t1 VALUES(1);

--echo "Waiting for the semi-sync slave to stop"
--connection semi_sync_slave
--source include/wait_for_slave_io_to_stop.inc

--echo "Waiting for the async dump thread to wait for ACK"
--connection master1
--let $wait_condition= select count(*)= 1 from information_schema.processlist where state like '%Waiting for semi-sync ACK from slave%'
--source include/wait_condition.inc

--connection semi_sync_slave
--let $assert_text= Table in semi-sync slave should be empty.
--let $assert_cond= "[SELECT COUNT(*) FROM t1]" = "0"
--source include/assert.inc

--connection async_slave
--let $assert_text= Table in async slave should be empty.
--let $assert_cond= "[SELECT COUNT(*) FROM t1]" = "0"
--source include/assert.inc

--echo "Starting semi-sync slave and cleaning up"
--connection semi_sync_slave
SET @@GLOBAL.DEBUG= '-d,error_before_semi_sync_reply';
START REPLICA;

# Cleanup
--connection master
--reap
DROP TABLE t1;

--let $sync_slave_connection= semi_sync_slave
--source include/sync_slave_sql_with_master.inc
--connection master
--let $sync_slave_connection= async_slave
--source include/sync_slave_sql_with_master.inc

--source include/rpl_end.inc
166 changes: 166 additions & 0 deletions mysql-test/suite/rpl_gtid/r/rpl_wait_for_semi_sync_ack.result
@@ -0,0 +1,166 @@
include/rpl_init.inc [topology=1->2,1->3]
Warnings:
Note #### Sending passwords in plain text without SSL/TLS is extremely insecure.
Note #### Storing MySQL user name or password information in the connection metadata repository is not secure and is therefore not recommended. Please consider using the USER and PASSWORD connection options for START REPLICA; see the 'START REPLICA Syntax' in the MySQL Manual for more information.
Warnings:
Note #### Sending passwords in plain text without SSL/TLS is extremely insecure.
Note #### Storing MySQL user name or password information in the connection metadata repository is not secure and is therefore not recommended. Please consider using the USER and PASSWORD connection options for START REPLICA; see the 'START REPLICA Syntax' in the MySQL Manual for more information.
include/rpl_default_connections.inc
include/rpl_connect.inc [creating async_slave]
include/rpl_connect.inc [creating semi_sync_slave]
[connection master]
call mtr.add_suppression("Run function 'wait_for_semi_sync_ack' in plugin 'rpl_semi_sync_master' failed");
call mtr.add_suppression("Error while waiting for semi-sync ACK on dump thread");
call mtr.add_suppression("A message intended for a client cannot be sent there as no client-session is attached");
call mtr.add_suppression("Timeout waiting for reply of binlog");
call mtr.add_suppression("Slave SQL.*Request to stop slave SQL Thread received while applying a group that has non-transactional changes");
[connection semi_sync_slave]
set @@global.debug= '+d,before_semi_sync_reply';
[connection master]
"Store the last acked pos"
create table t1 (a int);
[connection semi_sync_slave]
set debug_sync='now WAIT_FOR semi_sync_reply_reached';
[connection async_slave]
include/assert.inc [Async Slave: Should not contain any tables]
[connection master1]
"Last acked pos should not move"
include/assert.inc [Last acked pos should not move]
[connection semi_sync_slave]
set debug_sync='now SIGNAL semi_sync_reply_continue';
[connection master]
include/assert.inc [Master: Should contain t1]
include/sync_slave_sql_with_master.inc
include/assert.inc [Async Slave: Should contain t1]
[connection master]
create table t2(a int);
[connection semi_sync_slave]
set debug_sync='now WAIT_FOR semi_sync_reply_reached';
[connection master1]
"Switching off rpl_semi_sync_master_enabled while async thread is waiting for ack"
set @@global.rpl_semi_sync_master_enabled = 0;
"Waiting till async slave is caught up"
include/sync_slave_sql_with_master.inc
include/assert.inc [Async Slave: should have t1 and t2]
[connection semi_sync_slave]
set debug_sync='now SIGNAL semi_sync_reply_continue';
"Switching rpl_semi_sync_master_enabled back on"
[connection master]
set @@global.rpl_semi_sync_master_enabled = 1;
[connection master]
"Waiting till semi-sync slave is caught up"
include/sync_slave_sql_with_master.inc
include/assert.inc [Semi-sync Slave: should have t1 and t2]
[connection master]
create table t3(a int);
[connection semi_sync_slave]
set debug_sync='now WAIT_FOR semi_sync_reply_reached';
[connection master1]
"Switching off rpl_wait_for_semi_sync_ack while async thread is waiting for ack"
set @@global.rpl_wait_for_semi_sync_ack = 0;
"Waiting till async slave is caught up"
include/sync_slave_sql_with_master.inc
[connection semi_sync_slave]
set debug_sync='now SIGNAL semi_sync_reply_continue';
"Switching rpl_wait_for_semi_sync_ack back on"
[connection master]
set @@global.rpl_wait_for_semi_sync_ack = 1;
[connection master]
"Waiting till semi-sync slave is caught up"
include/sync_slave_sql_with_master.inc
include/assert.inc [Semi-sync Slave: should have t1, t2 and t3]
[connection semi_sync_slave]
set @@global.debug= '-d,before_semi_sync_reply';
"Stopping async slave to simulate lag"
[connection async_slave]
include/stop_slave.inc
"Generating traffic on the master"
[connection master]
create table t4(a int);
insert into t4 values(1);
insert into t4 values(2);
flush logs;
insert into t4 values(3);
insert into t4 values(4);
flush logs;
include/sync_slave_sql_with_master.inc
[connection semi_sync_slave]
include/stop_slave_io.inc
[connection master]
"Restarting master"
include/rpl_restart_server.inc [server_number=1]
[connection semi_sync_slave]
include/start_slave_io.inc
"Starting async slave"
[connection async_slave]
include/start_slave.inc
"Waiting till async slave is caught up"
[connection master]
include/sync_slave_sql_with_master.inc
include/assert.inc [Async Slave: t4 should have 4 entries]
[connection semi_sync_slave]
include/stop_slave.inc
[connection async_slave]
include/stop_slave.inc
[connection master]
set @gtid_exec= @@global.gtid_executed;
RESET BINARY LOGS AND GTIDS;
include/assert.inc [Last acked pos should be empty]
set @@global.gtid_purged= @gtid_exec;
purge binary logs to 'binlog';
[connection semi_sync_slave]
include/start_slave.inc
[connection async_slave]
include/start_slave.inc
[connection semi_sync_slave]
set @@global.debug= '+d,before_semi_sync_reply';
[connection master]
create table t5 (a int);
[connection semi_sync_slave]
set debug_sync='now WAIT_FOR semi_sync_reply_reached';
[connection async_slave]
include/assert.inc [Async Slave: Should not contain t5]
[connection semi_sync_slave]
set debug_sync='now SIGNAL semi_sync_reply_continue';
set @@global.debug= '-d,before_semi_sync_reply';
[connection master]
include/assert.inc [Master: Should contain t5]
include/sync_slave_sql_with_master.inc
include/assert.inc [Async Slave: Should contain t5]
"Stopping async slave to simulate lag"
[connection async_slave]
include/stop_slave.inc
"Generating traffic on the master"
[connection master]
create table t6 (a int);
insert into t6 values(1);
insert into t6 values(2);
insert into t6 values(3);
insert into t6 values(4);
include/sync_slave_sql_with_master.inc
"Blocking semi-sync slave just before sending an ack"
[connection semi_sync_slave]
set @@global.debug= '+d,before_semi_sync_event';
[connection master]
insert into t6 values(5) # this transaction will be blocked;
[connection semi_sync_slave]
set debug_sync='now wait_for semi_sync_event_reached';
"Restarting the semi-sync slave"
[connection master]
include/rpl_stop_server.inc [server_number=3]
include/rpl_start_server.inc [server_number=3]
"Starting async slave"
[connection async_slave]
include/start_slave.inc
[connection master]
[connection semi_sync_slave]
"Waiting till async slave is caught up"
[connection master]
include/sync_slave_sql_with_master.inc
[connection master]
drop table t1, t2, t3, t4, t5, t6;
[connection master]
include/sync_slave_sql_with_master.inc
[connection master]
include/sync_slave_sql_with_master.inc
include/rpl_end.inc
@@ -0,0 +1 @@
$SEMISYNC_MASTER_PLUGIN_OPT --plugin-load=rpl_semi_sync_master=$SEMISYNC_MASTER_PLUGIN;rpl_semi_sync_slave=$SEMISYNC_SLAVE_PLUGIN
@@ -0,0 +1 @@
$SEMISYNC_SLAVE_PLUGIN_OPT $SEMISYNC_SLAVE_PLUGIN_LOAD
14 changes: 14 additions & 0 deletions mysql-test/suite/rpl_gtid/t/rpl_wait_for_semi_sync_ack.cnf
@@ -0,0 +1,14 @@
!include ../my.cnf

[mysqld.1]
rpl_semi_sync_master_enabled= 1
rpl_wait_for_semi_sync_ack= 1
rpl_semi_sync_master_timeout= 10000000
rpl_semi_sync_master_wait_no_slave= 1

[mysqld.3]
rpl_semi_sync_slave_enabled= 1
rpl_semi_sync_master_wait_no_slave= 1

[ENV]
SERVER_MYPORT_3= @mysqld.3.port