-
Notifications
You must be signed in to change notification settings - Fork 107
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be possible to add to the tests that this function doesn't modify the inputs?
expr/func_aspercent.go
Outdated
if len(totals) == 1 { | ||
totalsSerie = totals[0] | ||
} else if len(totals) == len(series) { | ||
sort.Slice(series, func(i, j int) bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe a comment about why there are sorted.
return nil, err | ||
} | ||
|
||
var outSeries []models.Series |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you do a
defer cache[Req{}] = append(cache[Req{}], outSeries...)
here instead of all the individual cache steps?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
very nice, I like it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
defer args are evaluated at call time, so this won't work as expected.
edit: looks like i'm wrong: https://play.golang.org/p/t8RZ1v8fLXT
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You forgot to call foo() in your snippet.
see https://play.golang.org/p/Qh4rp1p2q73
expr/func_aspercent.go
Outdated
serie1.Target = fmt.Sprintf("asPercent(%s,%s)", serie1.Target, serie2.Target) | ||
serie1.Tags = map[string]string{"name": serie1.Target} | ||
for i := range serie1.Datapoints { | ||
serie1.Datapoints[i].Val = computeAsPercent(serie1.Datapoints[i].Val, serie2.Datapoints[i].Val) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this violates https://github.com/grafana/metrictank/blob/master/expr/NOTES#L35
serie1.Target = fmt.Sprintf("asPercent(%s,%s)", serie1.Target, serie2.Target) | ||
serie1.Tags = map[string]string{"name": serie1.Target} | ||
for i := range serie1.Datapoints { | ||
serie1.Datapoints[i].Val = computeAsPercent(serie1.Datapoints[i].Val, serie2.Datapoints[i].Val) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this violates https://github.com/grafana/metrictank/blob/master/expr/NOTES#L35
} else { | ||
totalVal = s.totalFloat | ||
} | ||
serie.Datapoints[i].Val = computeAsPercent(serie.Datapoints[i].Val, totalVal) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this violates https://github.com/grafana/metrictank/blob/master/expr/NOTES#L35
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed all and added a test
expr/func_aspercent_test.go
Outdated
// Test if original series was modified | ||
for i, orig := range originalSeries { | ||
inSerie := in[i] | ||
if orig.Target != inSerie.Target { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you use something like
Line 186 in 0428430
if !reflect.DeepEqual(err, c.expErr) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, because the original series does get modified by metrictank (a "name" tag gets added). Makes sense to compare relevant values only.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Scratch that. the real problem was that reflect.DeepEqual(math.NaN(), math.NaN()) == false
, which is not the case when comparing series
expr/func_aspercent_test.go
Outdated
originalSeries[i].Interval = serie.Interval | ||
originalSeries[i].QueryPatt = serie.QueryPatt | ||
originalSeries[i].Target = serie.Target | ||
originalSeries[i].Datapoints = getCopy(serie.Datapoints) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe just
originalSeries[i] = serie
originalSeries[i].Datapoints = getCopy(serie.Datapoints)
Gets all the things like tags etc.
expr/func_aspercent.go
Outdated
@@ -130,6 +130,8 @@ func (s *FuncAsPercent) Exec(cache map[Req][]models.Series) ([]models.Series, er | |||
if len(totals) == 1 { | |||
totalsSerie = totals[0] | |||
} else if len(totals) == len(series) { | |||
// Sorted to match the input series with the total series based on Target. | |||
// Mimicks Graphite's implementation |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit, but 'Mimic' the common modern spelling
expr/func_aspercent.go
Outdated
|
||
func copyDatapoints(serie *models.Series) { | ||
out := pointSlicePool.Get().([]schema.Point) | ||
for _, p := range serie.Datapoints { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
out = append(out, serie.Datapoints...)
Note: this requires Go version 1.10+ because it uses math.Round() @shanson7 any outstanding comments? |
Any chance this can get merged to master? |
…an optional argument
do you plan to make more changes or dyou consider this complete? |
This is good to go. I just rebased it to make the merge cleaner. And I didn't use much, just added an additional argument type (that you originally made). It would be a lot of work to reuse the old commits with little to no payoff imo, since I rewrote most of it. |
for _, argExp = range argsExp { | ||
if pos >= len(e.args) { | ||
break // no more args specified. we're done. | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the removal of cutoff
here is this safe?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. Before it assumed that series arguments can only be non-optional arguments and in the beginning. i.e function(serie, int, string, opt=string)
.
With my change the following is possible:
function(serie, int, serie, opt=serie)
Like before, this part only consumes series, and just skips over everything else.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good :) the argument consumption/iterating code is probably my least favorite code of MT. FWIW.
if you can handle this, you can handle anything else 👯♂️
expr/func_aspercent.go
Outdated
totalSeriesLists := groupSeriesByKey(totals, s.nodes, &keys) | ||
totalSeries = getTotalSeries(totalSeriesLists) | ||
} else { | ||
return nil, errors.New("total must be None or a seriesList") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seems like it's fairly easy to trigger this by specifying a serieslist pattern that doesn't match any series.
maybe in that case we can provide a better error message? (basically above at if len(totals) == 0 {
maybe directly return an error ? what does graphite do in this case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure what you mean here. This error message gets triggered if they pass in a number. In the case where they pass in a nodes argument, only a series or nothing should be passed in for the total
argument. First branch is triggered if neither is passed in. Second branch is triggered if a series is passed in. Which leaves the case where a number is passed in (i.e math.IsNaN(s.totalFloat) = false
)
Graphite throws that exact error message is that case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if you specify a serieslist pattern that doesn't match any series, then I believe this will happen:
if s.totalSeries != nil {
totals, err = s.totalSeries.Exec(cache)
if err != nil {
return nil, err
}
if len(totals) == 0 {
totals = nil <---- this right here
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW, Graphite crashes if you do that.
Let me see if I can handle it in some clean way
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, so this was just unnecessary:
if len(totals) == 0 {
totals = nil
}
So I removed it. Now if a series returns empty, it does not assume that there were no arguments
return context | ||
} | ||
|
||
func (s *FuncAsPercent) Exec(cache map[Req][]models.Series) ([]models.Series, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this function is rather complex. can we split it up in smaller functions?
expr/func_aspercent.go
Outdated
return outSeries, err | ||
} | ||
|
||
func (s *FuncAsPercent) execWithNodes(series []models.Series, totals []models.Series) ([]models.Series, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
series, totals []models.Series
. also the other function
expr/func_aspercent.go
Outdated
} | ||
} | ||
|
||
func deepCopySerieElements(serie *models.Series) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this API is a bit strange. would it not make more sense to return a new series that is a deep copy?
even though that is slightly more work, seems worth it.
in fact, maybe add a Copy()
method to the serie type in the models package
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also looking at sumSeries, the copying/newly-allocating seems a bit too eager.
we should only need to allocate point slices:
- for the totals slice: when we need to sum up values and we need a place to store the totals (not when totals results in a single series, then we can just read from it)
- for each output series.(if it is different from the input slice)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem with # 1 is that later I modify the total series in some cases such as this (line 99):
// No series for total series
if _, ok := metaSeries[key]; !ok {
serie2 := totalSeries[key]
serie2.QueryPatt = fmt.Sprintf("asPercent(MISSING,%s)", serie2.QueryPatt)
serie2.Target = fmt.Sprintf("asPercent(MISSING,%s)", serie2.Target)
serie2.Tags = map[string]string{"name": serie2.Target}
for i := range serie2.Datapoints {
serie2.Datapoints[i].Val = math.NaN()
}
outSeries = append(outSeries, serie2)
continue
}
I guess I could make a copy right there, not sure which one would be better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll be mostly afk until Thursday.
Please read expr/NOTES carefully if you have not already done so. All this complexity is to make sure we never overwrite data in the memory AggMetric, chunk cache etc.
Some of these fields are by value so harmless but the tags map is interesting as well as it might open an avenue to modify the tags in the MemoryIdx. Not sure if we currently take that into account everywhere or maybe we already have a provision for that. On phone so can't check right now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, well I added a Series.Copy(emptyDatapoints []schema.Point)
function (the argument is there so that pointSlicePool
can be used if needed). And I only copy the series if I modify values that shouldn't be modified.
Hopefully that's sufficient and what you were looking for 😃
some unit tests for ArgIn would be nice. |
{Val: float64(199) * 100, Ts: 30}, | ||
{Val: float64(29) / 2 * 100, Ts: 40}, | ||
{Val: float64(80) / 3.0 * 100, Ts: 50}, | ||
{Val: float64(250) / 4 * 100, Ts: 60}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't out2 be the same as in the previous function? because both cases have asPercent(d,a)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, because when totals is a seriesList with len(totals) == len(series)
, the series first get sorted by tag before getting matched. So, in this test case it would be asPercent(d,c) and asPercent(b,a). This behavior is same in Graphite
expr/func_aspercent.go
Outdated
totalVal = totalsSerie.Datapoints[i].Val | ||
} else { | ||
totalVal = s.totalFloat | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isn't totalFloat pretty much guaranteed to be NaN here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, because totalFloat is still a valid option here. (the else
part of the if statement right before this)
expr/func_aspercent.go
Outdated
for i := range serie.Datapoints { | ||
var totalVal float64 | ||
if len(totalsSerie.Datapoints) > i { | ||
totalVal = totalsSerie.Datapoints[i].Val |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we may want to always just assume this branch is taken and just write this code line directly without the if/else.
that way if we ever have a bug where len(totalSerie.Datapoints) != len(serie.Datapoints)
we can troubleshoot it instead of trying to hide such error case and returning incorrect data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is more of a check of whether there is a totalSeries at all. I'll change it to > 0
, that way we get an index out of range exception if there's a bug.
expr/func_aspercent.go
Outdated
sort.Slice(totals, func(i, j int) bool { | ||
return totals[i].Target < totals[j].Target | ||
}) | ||
for i, serie1 := range series { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lots of duplication here wrt the similar code block further down.
can't we do all the if/else stuff here to just set up the right totals and series variables,
and then do the same processing at the end irrespective of which scenario it was?
expr/func_aspercent.go
Outdated
} else if totals != nil { | ||
totalSeriesLists := groupSeriesByKey(totals, s.nodes, &keys) | ||
totalSeries = getTotalSeries(totalSeriesLists) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- we compute totals even for keys we won't need (e.g. keys not in the input series)
- for each missing case we repeatedly get a slice and fill it with NaN's. we could reuse the same slice in this case. my suggestion would be declare a
var nones []schema.Point
. then whenever you need it, if it's nil, instantiate it. if it's not, just reuse it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think reusing a nones
slice would play nicely with the pointSlicePool, would it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh right, because at the end the same slice would get added to the pool multiple times, which is bad. so nevermind that then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On phone now
best would be to just add the slice to the cache thing once I think (that way we can reuse them)
hi @stivenbb my first pass of review comments is now done :p let me know when you've addressed everything or if you have any questions. |
@Dieterbe ok, I think I've addressed all the requested changes. Let me know if I missed something |
|
math.Round |
I think the last things to do here are :
|
* easier to follow code * consistency with execWitNodes * bugfix: pool-obtained sumseries should be recycled later
Ok, just cherry picked. Will work on not computing totals when unnecessary. |
oh i realized my code |
cassandra test seems to be failing after I cherry-picked... Do you know what that could be about? |
looks like a flakey test. rerunning tests should fix it. but circleCI is not letting me. can you retrigger on https://circleci.com/workflow-run/ba5e3453-48a0-4731-9c34-22381695eb63 ? if not, your next push will. |
Ok, should be good to go now. LMK if my change looks good. |
great work @stivenbb, thank you very much for your work on this. |
Native implementation of asPercent() Graphite function. (http://graphite.readthedocs.io/en/latest/functions.html#graphite.render.functions.asPercent)
Added a new argument type
ArgIn
that allows multiple other argument types. This was necessary for the total argument. Some of the code borrowed from an abandoned PR: #672In terms of speed improvement:
On average, the native implementation was 11x faster, median was 4x faster, p95 was 14x faster, p99 was 14x faster and max was over 10x faster