Adding a generic performance counter checker and metric retriever #5
@magmax Can you take a look at the Travis output, please? Thanks! |
I'm afraid not. I have no idea what is wrong with:
Can you help me with this? |
Using my Google-fu, I found a recent, similar complaint filed against Rubocop (which is the test that is breaking, by the way). The outcome of that issue seems to be to leave the check in Rubocop and just disable it in .rubocop.yml on a case-by-case basis. There is an example of disabling the next check. The example in the issue is not 100% like yours... if I had to guess, it's the use of
next unless v && k
next unless data
You could try switching those two to if statements or something. |
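As a minimal sketch of that suggestion: the two loops below do the same filtering, once with a `next unless` guard and once with a plain `if` modifier. The `row` hash and its contents are made up for illustration, and whether the second form actually satisfies the Rubocop cop that fails in this PR is an assumption, not something verified here.

```ruby
# Hypothetical sample data: header => value pairs, some of them unusable.
row = { 'cpu' => '12.5', nil => 'x', 'mem' => nil }

# Variant 1: guard clause with `next unless`, as in the PR.
with_next = []
row.each do |k, v|
  next unless v && k
  with_next << [k, v.to_f]
end

# Variant 2: the same filter written as a conditional.
with_if = []
row.each do |k, v|
  with_if << [k, v.to_f] if v && k
end
```

Both variants keep only the pairs where key and value are present, so the change is purely stylistic.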
I really like the idea of grouping metrics together. I also like the idea of grouping a metric and a check (I was wondering whether that was doable but had not investigated yet). Combining the two (errr, four)... I am not so crazy about. In the example in your first comment, if you got a critical, would it be clear which measure(s) caused it? Maybe put the scheme in the output? If there were multiple failures, would there be multiple lines saying what errored? I would love to hear some examples of how you are using this and the output of the messages you would see in Sensu. Of course, even if my concerns are valid, I think this is really cool: one could skip the check part, or have logical groupings such as all CPU measures in one file, disk in another, memory in another, and so on. Also, for the kicker, I was just looking at all the stuff you can get out of typeperf. Something like this would save us from having 20 different typeperf-related checks, each passing in a different counter/object. Great idea @magmax |
I really, really tried to fix the problem in several ways, but no luck. I do not want to add any style exception without others' permission. Thank you @hurrycaine. I really hate querying a counter twice, because sometimes I see alerts that my graphs don't show, and other times the opposite happens. In addition, some performance counters are hard to retrieve and require a lot of time or resources. Why ask for them twice? I know about the mess you can see in the alerts. But:
IMHO, the minimum/maximum pattern is so common that it should be part of the configuration. Indeed, I see the need for just taking the metric and deciding about the alert in the Sensu core, configuring it in the configuration XML instead. I really hated adding a new file to configure my plugin... but I thought of it as just "one" solution. Maybe you can think of a better one :) |
@magmax that is frustrating! I will see if I can find some time today to reproduce it and try to fix it. I just skimmed the style guide and I think I gave bad advice; looks like we both lack style :O) It did give me some more ideas. I agree that running check X and metric X separately seemed inefficient to me as well. I think some would moan about the YAML file, and we could add some command-line arguments if we really wanted to. I have some odd cases where some servers' mins/maxes may need to differ, and from the Sensu docs it looks like a check definition on an agent/client overrides the one on the server. So this YAML file might be helpful for me. Let me try using this check for a bit instead of throwing out theoretical suggestions. |
@magmax If we can't figure out the issue, we can add an exception and then circle back to it when time allows. I think overall the idea is certainly interesting and maybe even useful ;) I will try to find some time this weekend to take a look at it as well. |
I think I am close to figuring out the style issue, though IMO the result might be harder to follow. But style is secondary; the important point is the check actually working, and I can't get the check to work. A little belatedly, I ran the check so I could compare it against my refactor and make sure the logic was the same. @magmax I am getting an "undefined method 'shift'" error when I run the original (and mine). I am curious whether you are using this check as it's currently written. If so, on which version of Windows? The shift error is on the row object; it appears it should be a hash or an array, but it's just one line.
CSV.parse(io.read, headers: true) do |row|
  puts " row = #{row}"
  # row.shift
  # row.each do |k, v|
It looks like you want to skip the first line since it is blank. You could do something like this (or use a counter):
CSV.parse(io.read, headers: true).each_with_index do |row, i|
  next if i == 0 |
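A self-contained sketch of the `each_with_index` idea, using made-up CSV text rather than real typeperf output. Note that with `headers: true` the header line is consumed before iteration starts, so index 0 is the first data row; whether that is the right line to drop for actual typeperf output is an assumption not checked here.

```ruby
require 'csv'

# Made-up sample; real typeperf output looks different.
sample = "name,value\nskipped,0\ncpu,12.5\nmem,2048\n"

kept = []
CSV.parse(sample, headers: true).each_with_index do |row, i|
  next if i == 0 # drop the first data row (the header line is already consumed)

  # Each row is a CSV::Row, so fields can be read by header name.
  kept << [row['name'], row['value'].to_f]
end
```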
Here is what I did yesterday while trying to fix the syntax issue, before I tested the actual check. I am not sure it is as clean as it was before. First, download the Rubocop gem and run it... see their site. These come from the Rubocop style guide:
This is all good; however, our problem is that we had multiple checks, so what I had to do was write two small methods, each with a guard clause. I also had to add two new variables, min_ok and max_ok, because with this new approach is_ok could be set to false by the min method and then set back to true by the max method. As I write this, I guess we could call next in the loop as soon as min trips the status to false, because no value can be both below min and above max. |
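The refactor itself is not shown in this thread, so the following is only a sketch of the shape described above: two small predicate methods with guard clauses, and a combined check that stops as soon as the minimum fails. The method names and the `limits` hash layout are hypothetical.

```ruby
# Hypothetical helper: true when no minimum is configured or the value meets it.
def min_ok?(value, min)
  return true if min.nil? # guard clause: nothing to check

  value >= min
end

# Hypothetical helper: true when no maximum is configured or the value meets it.
def max_ok?(value, max)
  return true if max.nil? # guard clause: nothing to check

  value <= max
end

# Combined check: once the minimum fails, the maximum is never evaluated,
# so a later "ok" from the max check cannot overwrite the failure.
def within_limits?(value, limits)
  return false unless min_ok?(value, limits[:min])

  max_ok?(value, limits[:max])
end
```

Returning early keeps a single source of truth for the status, which avoids the problem of one check resetting the result of the other.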
If you find a version that has no style-check problems and does the same thing... go ahead. About replacing the configuration file with command-line arguments: I thought about that, but it is a mess. Think about adding PF, the prefix for Graphite, and their mins and maxes. I already think it is a mess with just one counter! |
I agree: the command-line options would only work for a single metric/check combo, not for a list of metric/checks. I did not think that through when I said it. There does not appear to be a shift method on your row. I dug a little deeper into the CSV class, and it looks like you would use either shift or parse, not both together. When you have CSV.parse(io.read, headers: true) do |row|, that is already looping through the "CSV", so there is nothing to shift; shift grabs the next "row". I ran some checks to make sure the checks fire when a value is below the min or above the max. I also made the decision not to check the max if the value is below the min. |
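To illustrate the difference described above, here is a small sketch with made-up data: `CSV.parse` with a block iterates over the rows by itself, while `shift` is called on a `CSV` instance to pull rows one at a time. The sample text is an assumption and not real typeperf output.

```ruby
require 'csv'

sample = "name,value\ncpu,12.5\nmem,2048\n"

# Style 1: CSV.parse with a block already walks every data row.
parsed = []
CSV.parse(sample, headers: true) { |row| parsed << row['name'] }

# Style 2: a CSV instance hands out rows on demand via shift.
csv = CSV.new(sample, headers: true)
first = csv.shift                    # first data row
rest = csv.map { |row| row['name'] } # iteration continues after the shifted row
```

In the first style there is no reader object inside the block to call `shift` on; the block only receives the current row.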
I'd love to get this merged. Can we have a bit more documentation and some testing? |
I do not know why it is failing in Rubocop. Maybe it is the "break" I'm using, but it is exactly what I want. By the way... in the end I moved to https://github.com/carllindelof/sensu-client and wrote an extension so that performance counters run in a thread instead of a process. Result: less CPU and RAM consumption. Anyway... I think this branch is ready to be merged except for the Rubocop warning. |
Closing due to inactivity; please feel free to re-open if you plan on getting this over the line. |
Hello.
I've been writing a lot of separate plugins to query performance counters, but I had a lot of performance problems: the metrics required nearly 80% CPU.
So I decided to create a single checker/metric retriever that can do both in just one step. And here it is.
The idea is to have a YAML file that configures which metrics to retrieve, how they should be stored, and the limits for a check. Everything but the performance counter is optional.
So, you can do things like these:
which will print all the performance counters, plus errors if there are any, and will return an error code if necessary.
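The plugin's actual file format is not reproduced in this thread. Purely as a hypothetical sketch of the idea (one YAML entry per counter, where only the counter is required), the key names `counter`, `scheme`, `min` and `max` below are assumptions, and the readings are made-up numbers standing in for real typeperf output.

```ruby
require 'yaml'

# Hypothetical configuration text; key names are illustrative only.
config_text = <<~'YAML'
  - counter: '\Processor(_Total)\% Processor Time'
    scheme: cpu.total
    max: 90
  - counter: '\Memory\Available MBytes'
    min: 512
  - counter: '\System\Processes'
YAML

# Only the counter is mandatory; the other keys fall back to defaults.
metrics = YAML.safe_load(config_text).map do |entry|
  {
    counter: entry.fetch('counter'),
    scheme: entry.fetch('scheme', entry['counter']),
    min: entry['min'],
    max: entry['max']
  }
end

# Made-up readings, keyed by counter name.
readings = {
  '\Processor(_Total)\% Processor Time' => 97.0,
  '\Memory\Available MBytes' => 2048.0,
  '\System\Processes' => 120.0
}

# A metric fails when its reading is outside a configured limit.
failures = metrics.select do |m|
  value = readings[m[:counter]]
  (m[:min] && value < m[:min]) || (m[:max] && value > m[:max])
end

failed_schemes = failures.map { |m| m[:scheme] }
```

Every metric is read once, and the same reading feeds both the metric output and the limit check, which is the point of combining the two steps.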
I'm not a good Ruby programmer, but I did my best. Please feel free to improve it. Honestly, I do not know what you will think about this idea.