This repository was archived by the owner on Aug 23, 2023. It is now read-only.

Duplicate Put in point slice pool from mergeSeries #1875

Closed
shanson7 opened this issue Aug 10, 2020 · 2 comments · Fixed by #1879
@shanson7

Describe the bug

In certain scenarios during a render request, a slice can be returned to the pool twice. We saw corrupted responses because of this, so we deployed a change that (within a single render) detects this case, skips the duplicate Put, and logs additional information. With this info I think I've found where it happens: in the (likely rare) case where identical time-series exist at 2 different intervals, mergeSeries will merge them together to form a single normalized value.

mergeSeries then puts all the other series back into the pool and keeps only the first one. However, normalize has already put the new normalized series into the datamap. So, in the scenario where series foo.bar exists with intervals 60 (let's call this foo60) and 30 (foo30), in that order, the process flow goes something like:

  1. mergeSeries calls normalize(datamap, [foo60, foo30])
  2. normalize keeps foo60 (nothing to do)
  3. normalize copies foo30, consolidates it to 60 seconds and adds it to the datamap (let's call this foo60_2)
    • Note: At this point we dropped the slice from foo30 to be GC'd. It did not go into the pool
  4. mergeSeries then iterates over the points in foo60 checking if it can fill NaN values from foo60_2
  5. Finally, mergeSeries adds foo60_2.Datapoints into the pool, even though it is already in the datamap!
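The steps above can be reduced to a minimal Go sketch. This uses a toy free-list rather than metrictank's actual pool (which is built on sync.Pool and not deterministic), so the corruption from the duplicate Put in step 5 is reproducible: two later Gets hand out the same backing array.

```go
package main

import "fmt"

// pointPool is a toy free-list standing in for metrictank's point
// slice pool; unlike sync.Pool it is deterministic, which makes the
// corruption reproducible.
type pointPool struct{ free [][]float64 }

func (p *pointPool) Get() []float64 {
	if n := len(p.free); n > 0 {
		s := p.free[n-1]
		p.free = p.free[:n-1]
		return s[:0]
	}
	return make([]float64, 0, 8)
}

func (p *pointPool) Put(s []float64) { p.free = append(p.free, s) }

// demoDoublePut models step 5: the merged slice is Put directly by
// mergeSeries AND reclaimed again via the datamap (second Put).
// It returns the first point of two supposedly independent series.
func demoDoublePut() (float64, float64) {
	var p pointPool
	merged := p.Get()
	merged = append(merged, 1, 2, 3)

	p.Put(merged) // mergeSeries puts it back ...
	p.Put(merged) // ... and the datamap cleanup puts it back again

	a := p.Get() // both Gets now hand out the same backing array
	b := p.Get()
	a = append(a, 10)
	b = append(b, 99) // silently overwrites a[0]
	return a[0], b[0]
}

func main() {
	a, b := demoDoublePut()
	fmt.Println(a, b) // 99 99: the first series' data was clobbered
}
```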

Helpful Information
Metrictank Version: v1.0-52 (custom build off of commit e085017)
Golang Version: go1.12.7

shanson7 added the bug label Aug 10, 2020

shanson7 commented Aug 10, 2020

For completeness, the reverse case ([foo30, foo60]) is also an issue:

  1. mergeSeries calls normalize(datamap, [foo30, foo60])
  2. normalize keeps foo60 (nothing to do)
  3. normalize copies foo30, consolidates it to 60 seconds and adds it to the datamap (let's call this foo60_2)
    • Note: At this point we dropped the slice from foo30 to be GC'd. It did not go into the pool
  4. mergeSeries then iterates over the points in foo60_2 checking if it can fill NaN values from foo60
  5. mergeSeries adds foo60.Datapoints into the pool (this is ok)
  6. executePlan adds foo60_2 to the datamap for a second time.

This reverse case is the one that the code I added to our rollout actually detects. The case in the original description won't be detected, and the pool will stay corrupted until the next clean or until it is detected.
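A guard of the kind described above could be sketched as follows (names like `renderGuard` are hypothetical and this is not the actual rollout patch): identify each slice by the address of its backing array, and skip, but count, any repeat Put within a single render.

```go
package main

import "fmt"

// simplePool is the same toy free-list stand-in as before.
type simplePool struct{ free [][]float64 }

func (p *simplePool) Get() []float64 {
	if n := len(p.free); n > 0 {
		s := p.free[n-1]
		p.free = p.free[:n-1]
		return s[:0]
	}
	return make([]float64, 0, 8)
}

func (p *simplePool) Put(s []float64) { p.free = append(p.free, s) }

// renderGuard is a hypothetical per-render wrapper: it identifies each
// slice by its backing array and drops (and counts) any repeat Put,
// instead of corrupting the shared pool.
type renderGuard struct {
	inner      *simplePool
	seen       map[*float64]struct{}
	duplicates int
}

func newRenderGuard(p *simplePool) *renderGuard {
	return &renderGuard{inner: p, seen: make(map[*float64]struct{})}
}

func (g *renderGuard) Put(s []float64) {
	if cap(s) == 0 {
		return // nothing poolable
	}
	key := &s[:cap(s)][0] // backing array identity survives reslicing
	if _, dup := g.seen[key]; dup {
		g.duplicates++ // here the real patch would also log details
		return
	}
	g.seen[key] = struct{}{}
	g.inner.Put(s)
}

// demoGuard returns (duplicate Puts detected, slices actually pooled).
func demoGuard() (int, int) {
	base := &simplePool{}
	g := newRenderGuard(base)
	s := base.Get()
	s = append(s, 1, 2)
	g.Put(s)
	g.Put(s) // the duplicate Put from mergeSeries is caught here
	return g.duplicates, len(base.free)
}

func main() {
	dups, pooled := demoGuard()
	fmt.Println(dups, pooled) // 1 1
}
```

Note this only protects within one render; the datamap-side double Put from the original description needs the ownership fix discussed below, since both Puts there come from cleanup paths the guard never sees together.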


Dieterbe commented Aug 11, 2020

Very nice catch!

The expr.Normalize functions were originally meant to be used within the expr package, inside of a plan.Run() call. That means they should follow this regimen:

  • any unused series should be left alone (may be referenced or read from later, e.g. different function processing chain)
  • any newly created series requires an entry in datamap, to make sure it'll be reclaimed later (after plan.Run, because the series might need to be included in the response body)
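The regimen above can be sketched like this (illustrative names, not metrictank's actual API): the input series is left alone, and the newly allocated copy is registered in the datamap so that a single post-Run cleanup pass reclaims it.

```go
package main

import "fmt"

type series struct {
	target string
	points []float64
}

// normalizeCopy leaves the input series alone (it may still be read by
// a different function processing chain) and registers the freshly
// allocated copy in the datamap, which one post-Run cleanup drains
// into the pool.
func normalizeCopy(datamap map[string]series, in series) series {
	out := series{
		target: in.target,
		points: append([]float64(nil), in.points...), // new allocation
	}
	datamap[out.target] = out // datamap now owns out.points
	return out
}

func main() {
	datamap := map[string]series{}
	foo30 := series{target: "foo.bar", points: []float64{1, 2, 3, 4}}
	norm := normalizeCopy(datamap, foo30)

	// After plan.Run: one cleanup pass reclaims every datamap entry.
	// Callers must NOT also Put norm.points back themselves -- that is
	// exactly the double Put this issue describes.
	reclaimed := 0
	for range datamap {
		reclaimed++ // stand-in for pool.Put(entry.points)
	}
	fmt.Println(len(foo30.points), len(norm.points), reclaimed) // 4 4 1
}
```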

When we added normalization into mergeSeries - in #1674 - we reused the same expr.Normalize functions, but from a different call site, prior to plan.Run(), where the expectations are different: there we still add every series we'll use to the datamap, or, if we don't need a series, put it in the pool. So that behavior no longer makes sense.

Let me dig a bit more into this and figure out a clean way to resolve it.
