introduce lastSave and adjust lastUpdate checks #571
Conversation
we now use:
- lastSave = last save TS
- lastUpdate = last point seen

using lastUpdate for everything like we did causes issues for backfills (#568) and other scenarios where the data TS is offset against wall clock.

index filtering -> compare lastUpdate to from
index pruning -> checks lastSave
cassandra updating -> checks lastSave
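As a rough sketch of that split (not the actual metrictank code; `archive`, `matches`, `prunable`, `needsSave`, and `saveInterval` are made-up names for illustration):

```go
package main

import "fmt"

// archive is an illustrative stand-in for an index entry; the real
// metrictank types differ.
type archive struct {
	LastUpdate int64 // timestamp of the last point seen (data time)
	LastSave   int64 // timestamp of the last save to the cassandra index
}

// matches: index filtering compares lastUpdate to the "from" of the query.
func matches(a archive, from int64) bool {
	return a.LastUpdate >= from
}

// prunable: index pruning checks lastSave, so a backfill whose data
// timestamps are far in the past is not pruned right after being written.
func prunable(a archive, cutoff int64) bool {
	return a.LastSave < cutoff
}

// needsSave: the cassandra index update is driven by how long ago we last
// saved, not by the data timestamp.
func needsSave(a archive, now, saveInterval int64) bool {
	return now-a.LastSave >= saveInterval
}

func main() {
	a := archive{LastUpdate: 100, LastSave: 950}
	fmt.Println(matches(a, 50), prunable(a, 900), needsSave(a, 1000, 60))
}
```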
This seems to rely on …
Do we actually need to save …
// build the definition from the incoming point and compare its LastUpdate
// against the entry already held in the in-memory index
def := schema.MetricDefinitionFromMetricData(data)
if inMemory && existing.LastUpdate > def.LastUpdate {
@Dieterbe If we already know at this point that the metric will be rejected once we try adding it, shouldn't we just return an error here and then make input.DefaultHandler.Process() abort instead of letting it continue?
that's a good suggestion. though note that right now there's an important metric tank.metrics_too_old being incremented in the AggMetric.Add code. it's important we communicate this back to the user. i suppose we could increment that metric from here and then do as you say. (this would also bypass the usage tracking for these points - I think we should account for all points even if we reject them - though IIRC we plan to move the usage tracking to tsdb-gw so I think that's ok)
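A minimal sketch of that idea, using placeholder names (`metricDefinition`, `counter`, `metricsTooOld`, `errMetricTooOld`, and `checkUpdate` are not the real metrictank identifiers): bump the too-old counter at the check site, then return an error so the caller, such as input.DefaultHandler.Process(), can abort. As the rest of the thread points out, this check must not reject backlog data, so treat this purely as an illustration of the suggestion:

```go
package main

import (
	"errors"
	"fmt"
)

// Placeholder stand-ins; the real metrictank types and counters differ.
type metricDefinition struct{ LastUpdate int64 }

type counter struct{ n int }

func (c *counter) Inc() { c.n++ }

var (
	metricsTooOld   counter // stand-in for the tank.metrics_too_old counter
	errMetricTooOld = errors.New("metric is older than the existing LastUpdate")
)

// checkUpdate sketches the suggested early reject: increment the too-old
// counter at the check site, then return an error so the caller
// (e.g. input.DefaultHandler.Process()) can abort instead of continuing.
func checkUpdate(existing, def metricDefinition, inMemory bool) error {
	if inMemory && existing.LastUpdate > def.LastUpdate {
		metricsTooOld.Inc()
		return errMetricTooOld
	}
	return nil
}

func main() {
	err := checkUpdate(metricDefinition{LastUpdate: 10}, metricDefinition{LastUpdate: 0}, true)
	fmt.Println(err, metricsTooOld.n)
}
```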
existing.LastUpdate will be greater than def.LastUpdate when processing a backlog at startup. We need to accept those metrics to refill the aggMetric chunks.
Just to make sure I understand that right... if we're processing the backlog then MT will come up, read the index from cassandra, and then read all the insertions from kafka and apply them. If other nodes have already updated the cassandra index then the LastUpdate property will have been set by them, and that's why it will be ahead of the data coming in from Kafka, right?
Good point...
- metrictank writes update to cassandra with LastUpdate = 10
- metrictank is restarted
- metrictank loads the index from cassandra, lastUpdate=10
- metrictank starts ingesting metrics from 6 hours ago.
- first point seen has Time=0, so MetricDefinitionFromMetricData(data) will set LastUpdate to 0.
- existing.LastUpdate=10, def.LastUpdate=0. 10 > 0, so the check fires even though the point must be accepted to refill the chunks.
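A tiny self-contained illustration of that timeline (the variable names are made up for the example):

```go
package main

import "fmt"

func main() {
	// before the restart, the cassandra index row was written with LastUpdate=10
	existingLastUpdate := int64(10)

	// after the restart, the first point replayed from kafka is hours old,
	// so the definition built from it gets LastUpdate=0
	defLastUpdate := int64(0)

	// the guard "existing.LastUpdate > def.LastUpdate" (10 > 0) fires,
	// even though we need this point to refill the aggMetric chunks
	if existingLastUpdate > defLastUpdate {
		fmt.Println("point would be treated as too old:", existingLastUpdate, ">", defLastUpdate)
	}
}
```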
K, so I'll just leave that as it is. Then the metric should still be processed even though its LastUpdate is < existing.LastUpdate.
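For completeness, a hedged sketch of the behavior the thread settles on: keep processing the point so backlog replay can refill chunks, and only move LastUpdate forward when the incoming timestamp is newer (`processPoint` is an illustrative name, not the actual code):

```go
package main

import "fmt"

// processPoint sketches the agreed behavior: the point is always accepted so
// that backlog replay can refill chunks; LastUpdate only ever moves forward.
func processPoint(existingLastUpdate, pointTime int64) int64 {
	// the point itself is added regardless of its timestamp
	// (in metrictank that happens on the AggMetric add path)
	if pointTime > existingLastUpdate {
		return pointTime
	}
	return existingLastUpdate
}

func main() {
	fmt.Println(processPoint(10, 0))  // backlog point: still processed, LastUpdate stays 10
	fmt.Println(processPoint(10, 42)) // newer point: LastUpdate advances to 42
}
```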