[mysql] fix replication service check. #2603
@@ -535,17 +535,17 @@ def _collect_metrics(self, host, db, tags, options, queries):
# slaves will only be collected iff user has PROCESS privileges.
slaves = self._collect_scalar('Slaves_connected', results)

if slave_running is not None:
if slave_running and not slaves:  # slave
I think `slaves` can only be a `float`; wouldn't it be better to do a `slaves == 0.`?
It can also be `None`, so I'll fix that, good catch.
👍
[mysql] fixing comparisons, slaves will be a float.
ffddf47 to e614fa0
slave_running_status = AgentCheck.OK
else:
slave_running_status = AgentCheck.WARNING
else:
slave_running_status = AgentCheck.CRITICAL
If on standalone, they should obviously disable `replication` in the YAML. But this would still make sense.
Way easier to read, thanks @truthbk!
Thank you @degemer!
The `performance_schema.threads` table is not available on MySQL < `5.6.0`. As a result, retrieving the slaves' statuses triggers the following error:
```
ProgrammingError: (1146, u"Table 'performance_schema.threads' doesn't exist")
```
This is a regression introduced by #2603.
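One way to guard against this kind of regression is to gate the query on the server version. The following is only a minimal sketch under stated assumptions, not the fix that was eventually merged: `parse_version`, `pick_slave_query`, and the text of `THREADS_QUERY` are hypothetical and exist purely for illustration. Only the `INFORMATION_SCHEMA.PROCESSLIST` query and the `5.6.0` threshold come from this thread.

```python
# Query quoted in this PR's description.
PROCESSLIST_QUERY = (
    "SELECT * FROM INFORMATION_SCHEMA.PROCESSLIST "
    "WHERE COMMAND LIKE '%Binlog dump%'"
)

# Assumed query text against performance_schema.threads (illustration only).
THREADS_QUERY = (
    "SELECT THREAD_ID, NAME FROM performance_schema.threads "
    "WHERE NAME LIKE '%worker'"
)


def parse_version(raw):
    """Turn a version string such as '5.5.40-log' into a tuple (5, 5, 40)."""
    numeric = raw.split("-")[0]
    return tuple(int(part) for part in numeric.split(".")[:3])


def pick_slave_query(raw_version):
    """Hypothetical helper: avoid performance_schema.threads on MySQL < 5.6.0."""
    if parse_version(raw_version) >= (5, 6, 0):
        return THREADS_QUERY
    return PROCESSLIST_QUERY
```

With this sketch, a server reporting `5.5.40-log` would get the `PROCESSLIST` query and never touch the missing table.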
Why
The service check wasn't doing the right thing when running on the master. Should address issue #2596.
What
A couple of things:
- The binlog dump threads are looked up with `SELECT * FROM INFORMATION_SCHEMA.PROCESSLIST WHERE COMMAND LIKE '%Binlog dump%'`, because looking at the worker threads could be inaccurate (it's blocking though, but we should be fine).
- `Slave_running` will also be available on the master node, so we also have to check that the number of slaves is 0 (or `None`). That should guarantee we are on a slave, and not falling through that logic branch on a master (and thus reporting CRITICAL when we're fine).
- For now we're keeping the replication check on the master in case a customer would like to run the `dd-agent` just on that node and still get replication insights.
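The slave/master distinction described above can be sketched roughly as follows. This is an illustrative simplification, not the code in this PR: `is_slave` and `replication_status` are hypothetical names, the integer constants stand in for `AgentCheck.OK`, `AgentCheck.WARNING` and `AgentCheck.CRITICAL`, and the exact status chosen in each branch is an assumption. The point it shows is that `not slaves` covers both `None` (no `PROCESS` privilege) and `0.0` (no slave connected).

```python
# Stand-ins for AgentCheck.OK / WARNING / CRITICAL (assumed values).
OK, WARNING, CRITICAL = 0, 1, 2


def is_slave(slave_running, slaves):
    """Hypothetical helper: Slave_running is known and no slave is connected.

    `not slaves` is True for both None and 0.0, so one test covers both.
    """
    return slave_running is not None and not slaves


def replication_status(slave_running, slaves):
    """Assumed mapping from the collected values to a service check status.

    slave_running: bool or None; slaves: float or None.
    """
    if is_slave(slave_running, slaves):
        # On a slave, a stopped slave thread is a real problem.
        return OK if slave_running else CRITICAL
    if slaves:
        # On a master with connected slaves, Slave_running being off is fine.
        return OK
    # Neither signal is available, so nothing can be asserted either way.
    return WARNING
```

For example, a master with `Slave_running` off and two connected slaves (`replication_status(False, 2.0)`) no longer falls into the slave branch, which is the false CRITICAL the description refers to.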