Try to align runs with frequency #765

mem · 2024-07-03T21:41:48Z

When the agent restarts, we might end up running a check too often. For example, if a check is configured to run once every 10 minutes and it last ran 1 minute ago, running it immediately results in two executions within the same 10-minute window.

Since we have to publish samples once every two minutes, we cannot wait for 9 minutes, because we would end up with a gap. Instead, run the check immediately to avoid that gap.

But if the check ran 9 minutes ago, we can align with the expected schedule by waiting for 1 minute instead of a random amount of time. Presumably the check ran 9 minutes ago, and its sample was replicated 7, 5, 3 and 1 minutes ago. If we wait for 1 minute, we end up running the check when it was expected to run.

In order to actually fix this issue, the agents would have to persist data across runs. It might be possible to do this by offloading publishing to another service.

Fixes: #739

Signed-off-by: Marcelo E. Magallon <marcelo.magallon@grafana.com>

roobre

Looking good, mostly a couple of naming nits.

My head hurts from doing time arithmetic.

roobre · 2024-07-09T12:42:08Z

internal/scraper/scraper.go

+func timeFromNs(ns float64) time.Time {
+	sec := int64(math.Floor(ns / 1e9))
+	nsec := int64(math.Mod(ns, 1e9))
+	return time.Unix(sec, nsec)
+}


Isn't this redundant? I thought you could just call time.Unix(0, int64(ns)):
https://go.dev/play/p/zIkG3JFBBvL

The docs state you can:

// It is valid to pass nsec outside the range [0, 999999999].

roobre · 2024-07-12T13:40:13Z

internal/scraper/scraper.go

+	return time.Unix(sec, nsec)
+}
+
+func computeOffset(offset, frequency time.Duration, t0, now time.Time) time.Duration {


Small nit, but I'd suggest naming t0 created instead. t0 suggests some beginning of time, but it isn't immediately obvious which one.

Same about offset: I'm not able to figure out what that is 🤔

roobre · 2024-07-12T13:45:17Z

internal/scraper/scraper.go

+	// The check was created more than the frequency ago, so we need to
+	// compute the time until the next time the check should run.
+	//
+	// Compute the number of runs since t0, add one for the next run and
+	// multiply by the frequency in order to obtain its timestamp. Finally,
+	// compute the remaining time until that timestamp.
+
+	runs := (now.UnixMilli() - t0.UnixMilli()) / frequency.Milliseconds()
+
+	timeUntilNextRun := t0.Add(time.Duration(runs+1) * frequency).Sub(now)


I think I barely grasp what we're doing here, but not enough to try and find gaps in the logic. Looks right but I'm not positive about that.

mem requested a review from a team as a code owner July 3, 2024 21:41

roobre reviewed Jul 12, 2024
