FB8-144: Send events to async slaves after ACKed by at least one semi-sync slave #1006

inikep · 2019-03-28T12:55:56Z

Jira issue: https://jira.percona.com/browse/FB8-144

Reference Patch: cc4e803
Reference Patch: aad3ce2a54e
Reference Patch: b42b911fa15

---------- cc4e803 ----------

Send events to async slaves after ACKed by at least one semi-sync slave

Summary:
Added a new mysql variable wait_for_semi_sync_ack to control
sending binlog events to async slaves. When this variable is set,
events are sent to async slaves only after they are ACKed by at least one
semi-sync slave (if any).

Originally Reviewed By: Tema

fbshipit-source-id: 9cad7e5

---------- aad3ce2a54e ----------

Fixed rpl_wait_for_semi_sync_ack

Summary:
Following changes are made to fix the feature:

- Condition variable is registered in dump thread's THD before waiting so that
  the thread can respond to kill command
- rpl_wait_for_semi_sync_ack is respected only when semi-sync master is enabled
  (i.e. rpl_semi_sync_master_enabled = 1)
- Added a status variable Rpl_semi_sync_master_ack_waits which counts the
  number of times we waited for an ACK (useful for benchmarking)
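
The three changes listed above can be pictured with a small standalone sketch. It is not the server code: `AckGate`, `gating_active`, `signal_ack`, `kill` and `wait_for_ack` are hypothetical names, positions are reduced to a single integer, and a plain `std::condition_variable` stands in for the condition variable that the real patch registers in the dump thread's THD.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstdint>
#include <mutex>

// Hypothetical stand-in for the dump thread's ACK wait; none of these names
// are taken from the patch.
class AckGate {
 public:
  // The variable only takes effect while semi-sync master is enabled.
  static bool gating_active(bool wait_for_semi_sync_ack,
                            bool semi_sync_master_enabled) {
    return wait_for_semi_sync_ack && semi_sync_master_enabled;
  }

  // A semi-sync slave acknowledged everything up to and including `pos`.
  void signal_ack(std::uint64_t pos) {
    {
      std::lock_guard<std::mutex> guard(mutex_);
      if (pos > last_acked_) last_acked_ = pos;
    }
    cond_.notify_all();
  }

  // Models KILL on the dump thread: wake any waiter so it can exit.
  void kill() {
    {
      std::lock_guard<std::mutex> guard(mutex_);
      killed_ = true;
    }
    cond_.notify_all();
  }

  // Returns true once the event ending at `pos` may be sent to an async
  // slave, or false if the thread was killed first.
  bool wait_for_ack(std::uint64_t pos) {
    std::unique_lock<std::mutex> lock(mutex_);
    if (!killed_ && pos > last_acked_) {
      ++ack_waits_;  // counts only the calls that actually had to wait
      cond_.wait(lock, [&] { return killed_ || pos <= last_acked_; });
      assert(killed_ || pos <= last_acked_);
    }
    return !killed_;
  }

  // Analogue of the wait counter described above.
  std::uint64_t ack_waits() const {
    std::lock_guard<std::mutex> guard(mutex_);
    return ack_waits_;
  }

 private:
  mutable std::mutex mutex_;
  std::condition_variable cond_;
  std::uint64_t last_acked_ = 0;
  std::uint64_t ack_waits_ = 0;
  bool killed_ = false;
};
```

Including the kill flag in the wait predicate is what lets a blocked thread leave the wait without an ACK ever arriving.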

Squash with D6022785

Originally Reviewed By: hermanlee

fbshipit-source-id: 64710f9

---------- b42b911fa15 ----------

Fix rpl_wait_for_semi_sync_ack feature

Summary:
Fixed the following:

1. Initializing last acked position to what is retrieved from engine during
   server startup. This makes sure that lagging async slaves are able to catch
   up to the last acked position after master restarts.
2. Resetting last acked position when RESET MASTER is issued. This makes sure
   that after the binlogs are reset we wait for acks.
3. Signalling/updating last acked positions only on events that were actually
   acked by the semi-sync slave (like the Xid event of the last trx in a group
   commit). This is done by signalling inside of the plugin
   (ReplSemiSyncMaster::reportReplyBinlog).
4. Signalling/updating on trxs skipped on semi-sync slave connection while
   searching for first gtid connection
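
As a rough illustration of the bookkeeping these fixes describe, the sketch below keeps a last-acked position that is seeded at startup, cleared on RESET MASTER, and only ever moves forward when an ACK is signalled. `BinlogPos`, `LastAcked` and the method names are hypothetical; the real patch works on the server's own binlog coordinates and does the signalling inside the semi-sync plugin.

```cpp
#include <cassert>
#include <cstdint>
#include <tuple>

// Hypothetical binlog coordinate: file number first, then offset in the file.
struct BinlogPos {
  std::uint32_t file_num;
  std::uint32_t pos;
};

inline bool operator<(const BinlogPos &a, const BinlogPos &b) {
  return std::tie(a.file_num, a.pos) < std::tie(b.file_num, b.pos);
}

class LastAcked {
 public:
  // Seed from the position recovered at startup, before any ACK is signalled,
  // so lagging async slaves can catch up to it after a restart.
  void init_from_recovery(BinlogPos recovered) {
    assert(last_.file_num == 0 && last_.pos == 0);
    last_ = recovered;
  }

  // RESET MASTER: binlog coordinates start over, so the old value would point
  // into the "future"; clear it and wait for fresh ACKs.
  void reset() { last_ = BinlogPos{0, 0}; }

  // Record a position that was actually acked. Returns true if it advanced
  // the last-acked position; stale or repeated ACKs are ignored.
  bool signal(BinlogPos acked) {
    if (!(last_ < acked)) return false;
    last_ = acked;
    return true;
  }

  // An async dump thread may send an event ending at or before the last ACK.
  bool may_send(BinlogPos event_end) const { return !(last_ < event_end); }

 private:
  BinlogPos last_{0, 0};
};
```

Comparing by file number before offset is what makes an ACK in a newer binlog file outrank any offset in an older one.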

Originally Reviewed By: hermanlee

fbshipit-source-id: f8917a9

TO DO: Because of conflicts I removed changes in mysql-test/t/all_persisted_variables.test.
This test has to be modified at let $total_persistent_vars=XXX; (+1 increase)
and it should be re-recorded.


percona-ysorokin

LGTM

hermanlee

This approach is definitely simpler than the original one. I'd go with this.

hermanlee · 2019-04-15T16:52:55Z

sql/binlog.cc

@@ -8167,6 +8168,13 @@ void MYSQL_BIN_LOG::process_after_commit_stage_queue(THD *thd, THD *first) {
      Thd_backup_and_restore switch_thd(thd, head);
      bool all = head->get_transaction()->m_flags.real_commit;
      (void)RUN_HOOK(transaction, after_commit, (head, all));
+
+      if (rpl_wait_for_semi_sync_ack) {


If signal_semi_sync_ack doesn't get called on every process_after_commit, I think there's a potential problem here. If no further writes come into the system, but rpl_wait_for_semi_sync_ack is enabled, wouldn't async slaves be held back at the last_acked checkpoint that is much much older than the latest semi-sync acked checkpoint?

It seems like we should update the last_acked position for each commit.

It also seems like the original code might have this issue too.

inikep · Apr 25, 2019

You are right. I followed the original code and assumed that async slaves would catch up when writes come into the system. I think the reason for checking rpl_wait_for_semi_sync_ack was to avoid calling signal_semi_sync_ack() unnecessarily when rpl_wait_for_semi_sync_ack is disabled and may never be enabled.
Should I remove if (rpl_wait_for_semi_sync_ack)?

abhinav04sharma · May 8, 2019

@hermanlee I'm not sure I understand this statement: "last_acked checkpoint that is much much older than the latest semi-sync acked checkpoint"

hermanlee · May 8, 2019

I think the scenario is this:

1. rpl_wait_for_semi_sync_ack is disabled
2. master processes writes and generates 1GB worth of binlog entries
3. the last_acked update is not called, so the last_acked checkpoint seen by async slaves stays before the new binlog entries
4. writes stop
5. rpl_wait_for_semi_sync_ack is enabled
6. async slaves connect, but the last_acked checkpoint is before the latest 1GB of binlog entries
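
The scenario can be replayed on a toy model to see how far async slaves would be held back. This is only an illustration under simplifying assumptions: `ToyMaster`, its fields and `held_back_bytes` are hypothetical, every commit is treated as already acked, and the model deliberately ignores whatever update_rpl_wait_for_semi_sync_ack() does when the variable is switched on.

```cpp
#include <cassert>
#include <cstdint>

// Toy model of the master's bookkeeping; not server code.
struct ToyMaster {
  bool wait_for_semi_sync_ack = false;
  bool signal_unconditionally = false;  // the alternative raised in review
  std::uint64_t binlog_end = 0;
  std::uint64_t last_acked = 0;

  // One committed (and, in this model, acked) transaction of `bytes` bytes.
  void commit(std::uint64_t bytes) {
    binlog_end += bytes;
    if (signal_unconditionally || wait_for_semi_sync_ack) {
      last_acked = binlog_end;
    }
  }

  // Highest position an async dump thread is allowed to send up to.
  std::uint64_t async_send_limit() const {
    return wait_for_semi_sync_ack ? last_acked : binlog_end;
  }
};

// Replays the scenario and returns how many bytes async slaves cannot receive
// until another write arrives.
std::uint64_t held_back_bytes(bool signal_unconditionally) {
  ToyMaster m;
  m.signal_unconditionally = signal_unconditionally;
  m.commit(std::uint64_t{1} << 30);  // 1 GB written while the variable is off
  m.wait_for_semi_sync_ack = true;   // writes stop, then the variable is on
  assert(m.async_send_limit() <= m.binlog_end);
  return m.binlog_end - m.async_send_limit();
}
```

With conditional signalling the model reports the full 1 GB as held back; with unconditional signalling it reports zero.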

abhinav04sharma · May 8, 2019

I think that case is handled in update_rpl_wait_for_semi_sync_ack()?

abhinav04sharma · May 13, 2019

@inikep even if we remove the rpl_wait_for_semi_sync_ack check, the signalling will only happen when something is written to the binlog, correct? So will the async slaves not wait until something new is written to the binlog (in @hermanlee's example above)?

abhinav04sharma · May 13, 2019

Never mind, it looks like last_acked is maintained in mysql_bin_log, which is a global singleton, so this should work. I'll let @hermanlee comment on the proposed solution.

inikep · May 14, 2019

I have a new idea. What do you think about calling signal_semi_sync_ack() only when semisync_master.so is loaded (installed)? It can be easily done using plugin_dl_find() from sql_plugin.cc.

hermanlee · May 14, 2019

I think updating unconditionally would be fine.

@abhinav04sharma is the recent problem found with ABS an issue with this port to 8.0?

abhinav04sharma · May 14, 2019

That issue was around group commit; it should not be a problem in this patch since we're signalling inside ordered_commit().

Regarding when to call signal_semi_sync_ack(), I think even if we always call it (i.e. even if rpl_wait_for_semi_sync_ack = 0) we'll still have the same problem if the master restarts, since last_acked will be reset and we'll need at least one write to start sending events to lagging async slaves... Not sure how we can solve this.


inikep · 2019-05-15T15:53:34Z

PR was updated.
In the event reading loop (i.e. send_events) I added a check for whether a given event was already executed (included in executed_gtids). If it was executed, we may call signal_semi_sync_ack.
Currently, last_ack is updated only while rpl_wait_for_semi_sync_ack is turned on, and only when:

- new writes come into the system
- a new semi-sync slave starts asking for binlogs

hermanlee · 2019-05-15T17:17:21Z

After talking to @abhinav04sharma, there is probably an issue if RESET MASTER is called. The binlog files/offsets get reset, but last_ack does not, so last_ack could be very far in the "future". I think we may also need to reset last_ack when RESET MASTER is performed.

abhinav04sharma · 2019-05-15T19:16:37Z

sql/rpl_binlog_sender.cc

+
+    if (rpl_wait_for_semi_sync_ack && m_is_semi_sync_slave &&
+        is_event_executed(event_ptr))
+      mysql_bin_log.signal_semi_sync_ack(log_file, log_pos);


AFAIK (at least in 5.6) gtid_executed can contain un-acked transactions because it's updated after the flush stage in group commit. So this might not be correct.

We're signalling on the GTID event, so the rest of the transaction will not be signaled, correct?

inikep · May 16, 2019

This code calls signal_semi_sync_ack() only for the following events:

SET @@SESSION.GTID_NEXT=`...`

This means that after a server restart, when a semi-sync slave starts asking for binlogs, async slaves will be updated only up to the start of the last transaction. The rest of that transaction will be sent when new writes come into the system.
Another issue is that when new writes come into the system, async slaves can be updated up to the start of the last transaction, which may not have been acked yet.

The first issue can be fixed, but the second (updating up to the start of the last transaction, which may not have been acked yet) is hard to fix.

inikep · 2019-05-17T14:51:49Z

I have 3 proposals for solving the problem that, when the master restarts, last_acked is reset and we need at least one write before events are sent to lagging async slaves:

1. assume that all GTIDs in gtid_executed are committed except the last GTID (it should be true if semi-sync replication was enabled) - disadvantage: slaves will not get the last transaction until new writes come into the system
2. keep last_acked in a configuration file with all variables set with "SET PERSIST" (it can be updated when master stops or with every call to signal_semi_sync_ack)
3. switch back to a fully synchronized wait_for_semi_sync_ack which is already implemented at WIP FB8-144: Send events to async slaves after ACKed by at least one semi-sync slave #986

hermanlee · 2019-05-22T21:45:18Z

I think in the case of the master restarting, it would be fine to set last_acked to the gtid_executed value once server recovery completes (prepared transactions in the storage engine have been either rolled forward or rolled back). @abhinav04sharma, do you think this would work? Since the storage engine and binlog will be in sync, and the executed set is just the binlog entries in the binlog, it would be fine to send those entries to the async slaves.

If a semi-sync client never acked the binlog entries, our plan is to trim the binlog entries on the server so that they match the storage engine position, at which point, the gtid_executed set would still be the set of acked binlogs.

abhinav04sharma · 2019-05-23T22:17:21Z

I think that should be fine. I have a patch up for review which does this along with handling RESET MASTER (in 5.6). I'll mention the patch in the PR as soon as it lands.

inikep · 2019-06-28T14:47:50Z

Last changes in the PR:

- Ported "Fix rpl_wait_for_semi_sync_ack feature" from b42b911
- Resetting last_acked to the current binlog with a position of 0 for both a restart/crash and "RESET MASTER".
- Signalling/updating last_acked only in process_after_commit_stage_queue


…-sync slave (facebook#1006) (facebook#1006)

Summary:
Jira issue: https://jira.percona.com/browse/FB8-144

Reference Patch: facebook@cc4e803
Reference Patch: facebook@aad3ce2a54e
Reference Patch: facebook@b42b911fa15

---------- facebook@cc4e803 ----------

Send events to async slaves after ACKed by at least one semi-sync slave

Added a new mysql variable `wait_for_semi_sync_ack` to control sending binlog events to async slaves. When this variable is set, events are sent to async slaves only after they are ACKed by at least one semi-sync slave (if any).

Originally Reviewed By: Tema

---------- facebook@aad3ce2a54e ----------

Fixed rpl_wait_for_semi_sync_ack

Following changes are made to fix the feature:

- Condition variable is registered in dump thread's THD before waiting so that the thread can respond to kill command
- rpl_wait_for_semi_sync_ack is respected only when semi-sync master is enabled (i.e. rpl_semi_sync_master_enabled = 1)
- Added a status variable Rpl_semi_sync_master_ack_waits which counts the number of times we waited for an ACK (useful for benchmarking)

Originally Reviewed By: hermanlee

---------- facebook@b42b911fa15 ----------

Fix rpl_wait_for_semi_sync_ack feature

Fixed the following:

1. Initializing last acked position to what is retrieved from engine during server startup. This makes sure that lagging async slaves are able to catch up to the last acked position after master restarts.
2. Resetting last acked position when `RESET MASTER` is issued. This makes sure that after the binlogs are reset we wait for acks.
3. Signalling/updating last acked positions only on events that were actually acked by the semi-sync slave (like the Xid event of the last trx in a group commit). This is done by signalling inside of the plugin (ReplSemiSyncMaster::reportReplyBinlog).
4. Signalling/updating on trxs skipped on semi-sync slave connection while searching for first gtid connection

Originally Reviewed By: hermanlee

--------------------------------------------------------------------

Wrapping last semi-sync acked pos in std::atomic to avoid locking in some scenarios

We'll now lock the mutex only if we need to wait for last acked pos to update. Otherwise, we just check the current dump thread pos against the last acked pos atomically, and if current is less than last acked we send the event without locking the mutex.

--------------------------------------------------------------------

Pull Request resolved: facebook#1006

Differential Revision: D16267709

Pulled By: abhinav04sharma

---------------------------------------------------------------------

Fix LLVM codegen for struct st_filenum_pos atomic operations (facebook#1183)

Summary:
LLVM has an issue where atomic operations on a struct with 32-bit fields are compiled using libatomic library calls instead of direct assembly, as if the whole struct were 32-bit aligned, i.e. its objects could cross machine word boundary: https://bugs.llvm.org/show_bug.cgi?id=45055. Work around this issue by aligning the first 32-bit field at 64 bits. This allows not linking mysys with libatomic.

Pull Request resolved: facebook#1183

Reviewed By: abhinav04sharma

Differential Revision: D34379183

Pulled By: hermanlee
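
The last two parts of this commit message, the lock-free fast path and the alignment workaround, can be sketched together. This is an illustration under assumptions rather than the merged code: only the struct name `st_filenum_pos` comes from the message, while its field names and the `before`, `AckedPos`, `signal` and `wait_until_sendable` names are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstdint>
#include <mutex>

// Two 32-bit fields, with the first one aligned at 64 bits so the struct can
// never straddle a machine word. Field names are assumptions.
struct st_filenum_pos {
  alignas(8) std::uint32_t file_num;
  std::uint32_t pos;
};
static_assert(sizeof(st_filenum_pos) == 8, "fits in one 64-bit word");
static_assert(alignof(st_filenum_pos) == 8, "64-bit aligned");

// Strict ordering: file number first, then offset within the file.
inline bool before(st_filenum_pos a, st_filenum_pos b) {
  return a.file_num != b.file_num ? a.file_num < b.file_num : a.pos < b.pos;
}

class AckedPos {
 public:
  // Publish a newly acked position. The store happens under the mutex so a
  // waiter that has just re-checked under the lock cannot miss the wakeup.
  void signal(st_filenum_pos acked) {
    {
      std::lock_guard<std::mutex> guard(mutex_);
      if (before(last_.load(), acked)) last_.store(acked);
    }
    cond_.notify_all();
  }

  // Returns true if the atomic check alone was enough (no mutex taken),
  // false if the caller had to block until the position advanced.
  bool wait_until_sendable(st_filenum_pos cur) {
    if (before(cur, last_.load())) return true;  // fast path
    std::unique_lock<std::mutex> lock(mutex_);
    cond_.wait(lock, [&] { return before(cur, last_.load()); });
    assert(before(cur, last_.load()));
    return false;
  }

 private:
  std::atomic<st_filenum_pos> last_{st_filenum_pos{0, 0}};
  std::mutex mutex_;
  std::condition_variable cond_;
};
```

A dump thread that is already behind the last acked position pays only for one atomic load per event; the mutex and condition variable are touched only by threads that have caught up.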


facebook-github-bot added the CLA Signed label Mar 28, 2019

laurynas-biveinis suggested changes Mar 29, 2019

percona-ysorokin suggested changes Mar 29, 2019

inikep force-pushed the FB8-144-commited branch from 2b7a5a6 to 4da7f7d April 4, 2019 08:36

percona-ysorokin approved these changes Apr 4, 2019

hermanlee reviewed Apr 15, 2019

abhinav04sharma reviewed Apr 15, 2019

inikep force-pushed the FB8-144-commited branch 2 times, most recently from 2e76810 to aa7e8f3 April 26, 2019 09:51

abhinav04sharma reviewed May 8, 2019

inikep force-pushed the FB8-144-commited branch 2 times, most recently from 08f8521 to 535ff2d May 13, 2019 11:53

hermanlee reviewed May 14, 2019

inikep force-pushed the FB8-144-commited branch 2 times, most recently from 603e2f7 to 8c7f5c0 May 15, 2019 15:46

abhinav04sharma reviewed May 15, 2019

inikep force-pushed the FB8-144-commited branch 3 times, most recently from edb23e6 to 01b5191 June 28, 2019 14:37

inikep force-pushed the FB8-144-commited branch 3 times, most recently from 091963d to 29937b7 July 1, 2019 12:46

abhinav04sharma reviewed Jul 1, 2019

inikep force-pushed the FB8-144-commited branch from 29937b7 to 8f13031 July 3, 2019 13:28
