-
Notifications
You must be signed in to change notification settings - Fork 814
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[mysql] enhancing the catalog of metrics reported. #2116
Conversation
c25c519
to
092afc8
Compare
e16020d
to
c02acf7
Compare
@@ -6,6 +6,7 @@ | |||
|
|||
# 3p | |||
import pymysql | |||
import psutil # permissions issues? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you should add a try except ImportError here. psutil might not be available on source installs that don't have a compiler
@truthbk can you describe what metrics have been removed (if any) and added ? This PR is huge and can potentially break a lot of things for users as it's one of the most used integrations. So the more details we have about what it does the better :) |
|
||
def __init__(self, name, init_config, agentConfig, instances=None): | ||
AgentCheck.__init__(self, name, init_config, agentConfig, instances) | ||
self.mysql_version = {} | ||
self.greater_502 = {} | ||
self.host = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You shouldn't store instance level variables at the AgentCheck level. You might end up breaking multi instances configuration
@remh updated PR description with new metric list. Let me know if you want me to put it somewhere else too. |
@leucos I agree that to monitor query cache fragmentation you need to be able to monitor the ratio of Qcache_free_blocks and Qcache_free_memory so it would be beneficial to report Qcache_free_blocks for users of MySQL query cache. |
I've only started reviewing the PR, some great work here :) I've added a bunch of comments on metrics that are reported by MySql as "counters since the server started", so that should be rates instead of gauges. That said a lot of these metrics are useful as gauges when computed in ratios with other gauges, so we need to be clear on the type we actually want to send these metrics as. Related to the discussion in #2173. |
'Innodb_os_file_reads': ('mysql.innodb.os_file_reads', RATE), | ||
'Innodb_os_file_writes': ('mysql.innodb.os_file_writes', RATE), | ||
'Innodb_os_log_pending_fsyncs': ('mysql.innodb.os_log_pending_fsyncs', RATE), | ||
'Innodb_os_log_pending_writes': ('mysql.innodb.os_log_pending_writes', RATE), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Innodb_os_log_pending_fsyncs
and Innodb_os_log_pending_writes
are probably gauges (http://dev.mysql.com/doc/refman/5.7/en/server-status-variables.html#statvar_Innodb_os_log_pending_fsyncs)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed.
One last thing: could you document the There are still a 2 comments from @remh that are not addressed, but apart from that it looks great! |
5ba8134
to
43a2fbf
Compare
@olivielpeau comments have been addressed, and ssl section added to config file. Thanks a ton for the long review! |
|
||
# 3p | ||
import pymysql | ||
try: | ||
import psutil # permissions issues? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you mean exactly by this comment? :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nothing - will be removed. psutil
has like two different levels of usage stats - not all stats are available if you don't have privileges, but we should handle that case just fine.
f7f502b
to
a39e38c
Compare
Awesome, looks good! Could you squash your commits? Let's merge once the CI passes 🎉 |
0f23096
to
1b72f70
Compare
…cation, caching, ssl. The following changes have been made to the mysql check: a. Its heavily inspired by MySQL cacti template provided by Percona which contains an extensive set of checks that are important to measure the performance and health of a MySQL server. For reference please look at this URL: http://www.percona.com/doc/percona-monitoring-plugins/1.1/cacti/mysql-templates.html#sample-graphs b. Many of the important metrics exposed by Percona Server / MariaDB Server are not available in MySQL Server as status variables, these have been added by parsing the output of "SHOW ENGINE INNODB STATUS" Collect Binary log stats only when binlogging is enabled mysql.net.max_connections_available was improperly set to be a RATE instead of a GAUGE The metric mysql.innodb.row_lock_waits was incorrectly being sent as GAUGE while its a RATE Float values in InnoDB status were not being parsed correctly [mysql] Cleaned up, rebased and added testing to PR for Innodb Status parsing for Percona metrics Check for mysql version 5.6.6 metrics Add a new check mysql_ps which collects metrics available with performance_schema. Currrently it only collects 95th percentile query run time - Added a new check that reports avg MySQL query response time per schema. This will be really helpful for those applications that are multi-tenant to know which schema makes up for most of the query response time. The average query response time per schema was not being computed correctly [mysql] Added Galera specific gauges. [mysql] Consolidating multiple efforts into revamped MySQL check. Close connection if anything happens. Try to report usage stats using psutil. Cleaning up collection of query_run_time_avg - putting it into PERFORMANCE_VARS. Add mysql.replication.slave_running service check [mysql] Multiple fixes, clean-up, resilience for different versions. Don't check db if db is None. More characters we need to strip of rows. Minor improvements to mysql parser. Actually collect extra innodb stats. Fixing generator. Add some more logging. Metrics unavailable in 5.5 - considering case. Support multiple MySQL instances on same host. Older MySQL's will not support some performance queries. Performance metrics may be unavailable in an inactive DB. Service check should no loger send a 'host' tag, but a 'server' tag instead. We can grab OSX stats from PS - but only total cpu_time no break down. Adding extra_innodb_metrics to test. Updating ruby script to create performance_schema stats. test already in main mysql.py test - removing. Adding support for per schema size reporting. Add test support for schema size metrics. Adding some resilience for None metrics [mysql_test] Both performance metrics will be missing if ver<5.6.0. We also get metrics for information and performance schemas for schema size. [mysql] Fixing minor issues. [mysql] handle with care: psutil could be unavaiable in source installs. [mysql] Adds new query cache metrics This commit adds 7 new query cache (QCache) metrics to the mysql monitor: - Qcache_free_blocks - Qcache_free_memory - Qcache_inserts - Qcache_lowmem_prunes - Qcache_not_cached - Qcache_queries_in_cache - Qcache_total_blocks These metrics are very helpful tuning the query cache. [mysql] Adding handle_prepare metrics. [mysql] adding qcache utilization stats - agent computed. [mysql] adding synthetic vars to test. [mysql] Adding SSL support. [mysql] removing mysql_ps.yaml.example, never used, just debris from PR merge. [mysql] some clean-up. [mysql] adding context managers for cursors, check perf schema is enabled. Touch up.. [mysql] removing unused schema dict flakes. [mysql] convert strings to ints and floats. [mysql] make sure performance schema is enabled before grabbing performance metrics. [mysql] if we've got no hits, also report instant usage as 0. [mysql] adding variables and performance note to mysql yaml sample config. [mysql] use _is_affirmative, use defaultdict instead of whitelist. [mysql] removing whitelist [mysql] qcache stats are per instance. [mysql] helper functions to keep track of instance specific vars and stats. [mysql] cleaning up mysql SSL connecting code (ssl param can be None). [mysql] fixing types for MySQL metrics. [mysql] adding count and monotonic_count as possible metric types. [mysql] adding configuration note. [mysql] use contextmanager to ensure close() of db connection. [mysql] adding ssl section to mysql sample config file. [mysql] do not compute metrics if source metrics unavailable. [mysq] use correct service_check tags for Replication check. [mysql] use psutil to grab pid instead of spawning a subprocess. Nitpicks. [mysql] Ensure db conenction ready. Fix test. Always send replica status gauge. [mysql] fix test - replication service_check tags. [mysql] always send deprecated gauge for slave status. [mysql] some more nitpicking.
1b72f70
to
5268a8b
Compare
Let's 🎉 🚀 🎆 |
[mysql] enhancing the catalog of metrics reported.
MySQL Check Enhancement
This PR combines multiple efforts by @ovaistariq (huge effort!) @garnermccloud @c960657 @zdannar @polynomial @leucos to greatly improve the catalog of MySQL metrics reported. Metrics have been added to provide additional insight into:
Additional work has gone into improving the test and attempt to test the entire catalog.
This overhaul will now allow us to provide the following new metrics (hang tight):
New Status Metrics
MyISAM Metrics
Global Variable Metrics
InnoDB Metrics
BINLog Metrics
Additional (and optional) Status Metrics
Optional Status metrics available for MySQL >= 5.6.6
Optional InnoDB Metrics
Galera Cluster Metrics
Performance Metrics
Schema Metrics
Replication Metrics
Qcache Utilization Metrics
PRs referenced: