introduce lastSave and adjust lastUpdate checks #571
Conversation
we now use:
- lastSave = last save TS
- lastUpdate = last point seen

using lastUpdate for everything like we did causes issues for backfills (#568) and other scenarios where the data TS is offset against wall clock.

index filtering -> compare lastUpdate to from
index pruning -> checks lastSave
cassandra updating -> checks lastSave
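As a rough sketch of that split (not the actual metrictank code; `archive`, `matches`, `prunable`, `needsSave`, and `saveInterval` are made-up names for illustration):

```go
package main

import "fmt"

// archive is an illustrative stand-in for an index entry; the real
// metrictank types differ.
type archive struct {
	LastUpdate int64 // timestamp of the last point seen (data time)
	LastSave   int64 // timestamp of the last save to the cassandra index
}

// matches: index filtering compares lastUpdate to the "from" of the query.
func matches(a archive, from int64) bool {
	return a.LastUpdate >= from
}

// prunable: index pruning checks lastSave, so a backfill whose data
// timestamps are far in the past is not pruned right after being written.
func prunable(a archive, cutoff int64) bool {
	return a.LastSave < cutoff
}

// needsSave: the cassandra index update is driven by how long ago we last
// saved, not by the data timestamp.
func needsSave(a archive, now, saveInterval int64) bool {
	return now-a.LastSave >= saveInterval
}

func main() {
	a := archive{LastUpdate: 100, LastSave: 950}
	fmt.Println(matches(a, 50), prunable(a, 900), needsSave(a, 1000, 60))
}
```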
This seems to rely on …
Do we actually need to save …
// build the definition from the incoming point and compare its LastUpdate
// against the entry already held in the in-memory index
def := schema.MetricDefinitionFromMetricData(data)
if inMemory && existing.LastUpdate > def.LastUpdate {
@Dieterbe If we already know at this point that the metric will be rejected once we try adding it, shouldn't we just return an error here and then make input.DefaultHandler.Process() abort instead of letting it continue?
that's a good suggestion. though note that right now there's an important metric tank.metrics_too_old being incremented in the AggMetric.Add code. it's important we communicate this back to the user. i suppose we could increment that metric from here and then do as you say. (this would also bypass the usage tracking for these points - I think we should account for all points even if we reject them - though IIRC we plan to move the usage tracking to tsdb-gw so I think that's ok)
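A minimal sketch of that idea, using placeholder names (`metricDefinition`, `counter`, `metricsTooOld`, `errMetricTooOld`, and `checkUpdate` are not the real metrictank identifiers): bump the too-old counter at the check site, then return an error so the caller, such as input.DefaultHandler.Process(), can abort. As the rest of the thread points out, this check must not reject backlog data, so treat this purely as an illustration of the suggestion:

```go
package main

import (
	"errors"
	"fmt"
)

// Placeholder stand-ins; the real metrictank types and counters differ.
type metricDefinition struct{ LastUpdate int64 }

type counter struct{ n int }

func (c *counter) Inc() { c.n++ }

var (
	metricsTooOld   counter // stand-in for the tank.metrics_too_old counter
	errMetricTooOld = errors.New("metric is older than the existing LastUpdate")
)

// checkUpdate sketches the suggested early reject: increment the too-old
// counter at the check site, then return an error so the caller
// (e.g. input.DefaultHandler.Process()) can abort instead of continuing.
func checkUpdate(existing, def metricDefinition, inMemory bool) error {
	if inMemory && existing.LastUpdate > def.LastUpdate {
		metricsTooOld.Inc()
		return errMetricTooOld
	}
	return nil
}

func main() {
	err := checkUpdate(metricDefinition{LastUpdate: 10}, metricDefinition{LastUpdate: 0}, true)
	fmt.Println(err, metricsTooOld.n)
}
```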
existing.LastUpdate will be greater than def.LastUpdate when processing a backlog at startup. We need to accept those metrics to refill the aggMetric chunks.
Just to make sure I understand that right... if we're processing the backlog then MT will come up, read the index from cassandra, and then read all the insertions from kafka and apply them. If other nodes have already updated the cassandra index then the LastUpdate property will have been set by them, and that's why it will be ahead of the data coming in from Kafka, right?
Good point...
- metrictank writes update to cassandra with LastUpdate = 10
- metrictank is restarted
- metrictank loads the index from cassandra, lastUpdate=10
- metrictank starts ingesting metrics from 6 hours ago.
- first point seen has Time=0, so MetricDefinitionFromMetricData(data) will set LastUpdate to 0.
- existing.LastUpdate=10, def.LastUpdate=0. 10 > 0, so the check fires even though the point must be accepted to refill the chunks.
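A tiny self-contained illustration of that timeline (the variable names are made up for the example):

```go
package main

import "fmt"

func main() {
	// before the restart, the cassandra index row was written with LastUpdate=10
	existingLastUpdate := int64(10)

	// after the restart, the first point replayed from kafka is hours old,
	// so the definition built from it gets LastUpdate=0
	defLastUpdate := int64(0)

	// the guard "existing.LastUpdate > def.LastUpdate" (10 > 0) fires,
	// even though we need this point to refill the aggMetric chunks
	if existingLastUpdate > defLastUpdate {
		fmt.Println("point would be treated as too old:", existingLastUpdate, ">", defLastUpdate)
	}
}
```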
K, so I'll just leave that as it is. Then the metric should still be processed even though its LastUpdate is < existing.LastUpdate.
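For completeness, a hedged sketch of the behavior the thread settles on: keep processing the point so backlog replay can refill chunks, and only move LastUpdate forward when the incoming timestamp is newer (`processPoint` is an illustrative name, not the actual code):

```go
package main

import "fmt"

// processPoint sketches the agreed behavior: the point is always accepted so
// that backlog replay can refill chunks; LastUpdate only ever moves forward.
func processPoint(existingLastUpdate, pointTime int64) int64 {
	// the point itself is added regardless of its timestamp
	// (in metrictank that happens on the AggMetric add path)
	if pointTime > existingLastUpdate {
		return pointTime
	}
	return existingLastUpdate
}

func main() {
	fmt.Println(processPoint(10, 0))  // backlog point: still processed, LastUpdate stays 10
	fmt.Println(processPoint(10, 42)) // newer point: LastUpdate advances to 42
}
```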