-
Notifications
You must be signed in to change notification settings - Fork 0
Conversation
check_graphite.gemspec
Outdated
@@ -20,6 +20,7 @@ Gem::Specification.new do |s| | |||
|
|||
# specify any dependencies here; for example: | |||
# s.add_development_dependency "rspec" | |||
s.add_runtime_dependency "linear-regression" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How much code would it add if the linear regression code was inlined (or rewritten in this gem)? I'm asking mostly so that we don't add dependencies unless absolutely necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, philosophy. Define absolutely necessary. (If it added significant code, I would not have used the external module.) You are right however that the module needs to have its version fixed in the gemspec.
lib/attime.rb
Outdated
raise "Unparseable time period #{text}" unless match | ||
sign, scalar, unit = match[1..3] | ||
raise "Missing scalar in time #{text}" if scalar == "" | ||
raise "Missing unit in #{text}" if unit == "" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use scalar.empty?
instead of comparing to an empty string.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
lib/attime.rb
Outdated
lookup(x) | ||
else | ||
x | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would rewrite like this:
case x
when String
lookup(x)
when nil
raise "Bad unit #{unit}"
else
x
end
to avoid a very easy-to-overlook trailing unless
on the line just before the if
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Absolutely. Thx.
lib/attime.rb
Outdated
"year" => 365 * DAY, | ||
"years" => "year", | ||
} | ||
EXPR = /([+-])?([0-9]*)(.*)/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this be /\A([+-])?([0-9]*)(.*)\Z/
(i.e. anchored at the start and end of the string so that "1s 2d 3y" doesn't match?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current implementation will yield "Bad unit s 2d 3y" which I feel is a good enough error message for this. .*
at end is already anchored to end of input (unless there are newlines). However, the current implementation does not handle floats: 1.5days
=> Bad unit .5days
which are easier to support than fail intelligently. Fixed.
lib/attime.rb
Outdated
raise "Missing scalar in time #{text}" if scalar == "" | ||
raise "Missing unit in #{text}" if unit == "" | ||
time + Integer(scalar) * lookup(unit) * (sign == "-" ? -1 : 1) | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm on the fence about trailing if
s and unless
s. I think they have their place and can add readability, but using them too much in a method, especially with calculations in between, IMO makes the code noisy.
Why does the expression even allow an empty scalar and/or unit? Does it really add that much to the error message to differentiate between the cases? Could it be enough to just say that the input is malformed when the regex doesn't match?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, giving intelligent error messages on option parsing for data where it is not trivial to know what is possible is critical.
p = Regression::CorrelationCoefficient.new(xs, ys).pearson | ||
store_value options.name, value | ||
# Unfortunately, nagios_check converts both primary and | ||
# secondary values to float. Hence manual assignment instead: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this explanation belongs in a test and/or a commit message. Comments usually lie because they remain after the code is changed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I disagree. While comments should be used sparingly, If I break or circumvent a contract, I need to explain why. If you try to change this to store_value 'p-value', p.abs
you will get an exception from the undefined
test. Creating a test with the title "make sure the contract is circumvented" is just guessing on the reason for a future breakage; there is no way a test can be that surgical.
If a developer disregards an immediately adjacent comment, that is no different from disregarding an immediately adjacent line of code. I don't really see how a review can objectively object to a useful comment when the review could equally have objected to that comment if it was orphaned.
lib/check_graphite/projection.rb
Outdated
def self.included(base) | ||
base.on '--projection FUTURE_TIMEFRAME', :default => '2days' do |timeframe| | ||
options.send('processor=', method(:projected_value)) | ||
options.send('timeframe=', timeframe) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Ruby-ish thing to do is to use :processor=
and :timeframe=
(i.e. symbols not strings when referring to methods).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed. Fixed.
spec/attime_spec.rb
Outdated
end | ||
|
||
def timedelta(n) | ||
Time.now + n |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's hope there's no GC pauses during the test run... since .attime
takes the current time as second argument I think you should have a let :now
and use that here and in the .attime
call to make sure that time doesn't elapse during the test run.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It hardly takes 0.9 secs to GC a test execution? But you are right, it makes for better specs. In fact, attime() should probably require the time since that is how it is used in production code.
spec/attime_spec.rb
Outdated
@@ -0,0 +1,31 @@ | |||
require 'attime' | |||
|
|||
describe '#attime' do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should be:
describe CheckGraphite do
describe '.attime' do
or
describe 'CheckGraphite.attime' do
I.e. the description should include the module/class name, and in Ruby you refer to class/module methods as .foo
and only instance methods as #foo
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Survived from a previous incarnation; changed to second version.
spec/projection_spec.rb
Outdated
}) | ||
c = CheckGraphite::Command.new | ||
STDOUT.should_receive(:puts).with(match(/ze-name.*in 5min.*p-value=0.9/)) | ||
lambda { c.run }.should raise_error SystemExit |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I kind of get why this is needed, but boy is this a brittle test... both the stub of stdout and then rescuing SystemExit
... It's not very much less ugly, but perhaps Kernel.should_receive(:exit).with(0)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is copied from check_graphite_spec.rb. I do not approve of the style, but solving the i/o problem would require to introduce a way to inject a StringIO. raise_error SystemExit
is not so fragile because you will probably notice if it is not satisfied (but not_to raise_error SystemExit
would be essentially worthless).
In general this is great 👍 , but I had a few comments. |
945b80d
to
1bb2015
Compare
- support float timeframes e.g. --projection 1.5days - style improvements as suggested by iconara on PR #1
Thanks for taking the time to review this. It was concieved in relative haste and is somewhat shoddy. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm missing a test.
if v.nan? | ||
'undefined' | ||
else | ||
BigDecimal.new(v, 3).to_s('F') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This thing needs a test. It needs to be documented why the formatting uses three significant digits (and not five, three or 99). Also, why not for example sprintf('%.3f', v)
(because I assume this method only ever expects numbers in the 0.0-1.0 range)?
This line has actually been bugging me for a few days, but it was not until today when it just popped into my head that I realised why. It's one of those things that you kind of gloss over, like "yeah, sure and then there's some formatting". But it's not just formatting, it's very deliberate formatting, but with no test to motivate it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sprintf() can only do number of decimals, but here we deal with numbers that can be any float, so there is no way to know how many decimals are relevant, if any. Thus we instead do number of significant digits. Why 3? It works for our use case; it will need to be configurable eventually, but I think it would be a rare use case that cares about changes < 1/1000. I don't think the helper merits being named a test unit in its own right. However, this means that the "integration" tests that tests against the full output should check that the number is formatted correctly. Fixed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It turned out to make for messy rebases, so this is actually fixed on b2b1738 not in this PR.
1bb2015
to
3e6219e
Compare
- support float timeframes e.g. --projection 1.5days - style improvements as suggested by iconara on PR #1
- support float timeframes e.g. --projection 1.5days - style improvements as suggested by iconara on PR #1
3e6219e
to
a42d93b
Compare
Support alerting against a linear projection into the future of the time series retrieved from Graphite. E.g.
--projection 30min
means that a future value will be calculated using linear regression and that value will be offered to nagios_check for evaluation against warning/critical thresholds. The check also presents the p-value of that projection so that ops people has a chance of judging whether the alarm is the result of noisy data.Using a simple projection is superior in some use cases where the rate of change is relatively stable, because it saves us from having to estimate the rate of change. Instead we can just say e.g.: "Give me 6 hours advance notice on the disk filling up."