Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[mysql] fix replication service check. #2603

Merged
merged 2 commits into from
Jun 15, 2016
Merged

[mysql] fix replication service check. #2603

merged 2 commits into from
Jun 15, 2016

Conversation

truthbk
Copy link
Member

@truthbk truthbk commented Jun 15, 2016

Why

The service check wasn't doing the right thing when running on the master. Should address issue #2596.

What

Couple things:

  • We now default to collecting the number of slaves by looking at SELECT * FROM INFORMATION_SCHEMA.PROCESSLIST WHERE COMMAND LIKE '%Binlog dump%' because looking at the worker threads could be inaccurate (it's blocking though, but we should be fine).
  • Slave_running will also be available on the master node, so we have to also check the number of slaves is 0 (or None). That should guarantee being on a slave, and not falling through that logic branch on a master (and thus report CRITICAL, when we're fine).

For now we're keeping the replication check on the master in case a customer would like to run the dd-agent just on that node and still get replication insights.

@@ -535,17 +535,17 @@ def _collect_metrics(self, host, db, tags, options, queries):
# slaves will only be collected iff user has PROCESS privileges.
slaves = self._collect_scalar('Slaves_connected', results)

if slave_running is not None:
if slave_running and not slaves: # slave
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It think slaves can only be a float, wouldn't it be better to do a slaves == 0. ?

Copy link
Member Author

@truthbk truthbk Jun 15, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It can also be a None, so I'll fix that, good catch.

@degemer
Copy link
Member

degemer commented Jun 15, 2016

👍

[mysql] fixing comparisons, slaves will be a float.
@truthbk truthbk force-pushed the jaime/mysql_replfix branch from ffddf47 to e614fa0 Compare June 15, 2016 21:14
slave_running_status = AgentCheck.OK
else:
slave_running_status = AgentCheck.WARNING
else:
slave_running_status = AgentCheck.CRITICAL
Copy link
Member Author

@truthbk truthbk Jun 15, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if on standalone, they should obviously disable replication on the YAML. But this would still make sense.

@degemer
Copy link
Member

degemer commented Jun 15, 2016

Way easier to read, thanks @truthbk!
:shipit:

@truthbk
Copy link
Member Author

truthbk commented Jun 15, 2016

Thank you @degemer !

@truthbk truthbk merged commit 283fe30 into master Jun 15, 2016
@truthbk truthbk deleted the jaime/mysql_replfix branch June 15, 2016 22:32
@truthbk truthbk modified the milestone: 5.9.0 Aug 10, 2016
yannmh added a commit that referenced this pull request Feb 15, 2017
The `performance_schema.threads` table is not available on MySQL <
`5.6.0`. As a result, retrieving the slaves' statuses triggers the
following error:

```
ProgrammingError: (1146, u"Table 'performance_schema.threads' doesn't
exist")
```

This is a regression introduced by
#2603
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants