This repository has been archived by the owner on Nov 13, 2021. It is now read-only.

Fixes/for upstream #69

Open
wants to merge 2 commits into
base: master


randakar

This is a renewed pull request for the 'millisecond' precision fixes.

I added a fix for the delta_sigma bug to the previous branch, but it appears somebody else beat me to it in #59. I have nothing to add to that fix, so I'm leaving it out to avoid polluting this branch.

This pull request contains two commits. Both deal with the fact that our data has ms-precision timestamps but contains measurements that arrive only every 5 minutes. The measurements come from a set of monitoring stations, so they arrive in bunches.

Granularity detection assumes that data with 'X' precision also contains one measurement per X, but that is clearly not always the case, as the test case above demonstrates. Worse, the code doesn't handle the possibility that granularity detection returns "ms": the switch in question has no case for it, so the code blows up with an error. Ditto for seconds.

The basic "it blows up" thing is easily fixed, but the deeper issue of handling measurement intervals that differ from the measurement's precision is not. That requires a rethink of the code.
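To make the mismatch concrete, here is a minimal Python sketch. It is an illustration only, not the library's actual granularity detection; `timestamp_precision` and `median_interval` are hypothetical helper names, and the sample data is synthetic.

```python
from datetime import datetime, timedelta


def timestamp_precision(timestamps):
    """Return the finest time unit that any timestamp actually uses."""
    if any(t.microsecond for t in timestamps):
        return "ms"
    if any(t.second for t in timestamps):
        return "sec"
    if any(t.minute for t in timestamps):
        return "min"
    if any(t.hour for t in timestamps):
        return "hr"
    return "day"


def median_interval(timestamps):
    """Return the median gap between consecutive (sorted) timestamps."""
    ordered = sorted(timestamps)
    deltas = sorted(b - a for a, b in zip(ordered, ordered[1:]))
    return deltas[len(deltas) // 2]


# Synthetic data: ms-precision timestamps, one measurement every 5 minutes.
start = datetime(2019, 7, 18, 0, 0, 0, 123000)
stamps = [start + timedelta(minutes=5 * i) for i in range(12)]

print(timestamp_precision(stamps))  # precision of the timestamps
print(median_interval(stamps))      # actual spacing of the measurements
```

The precision says "ms" while the measurements are five minutes apart, so any period derived from the precision alone would be off by several orders of magnitude.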

…y are missing.

If the granularity detection results in "sec" or "ms" granularity,
detect_anons blows up with a message stating that 'period' was not
provided, which is correct. This patch provides at least some defaults.

This is a bit tricky, since the number of samples really isn't guaranteed
to be 1 measurement per whatever the 'gran' variable says it is.
However, this code appears to assume that, which leads to "fun" when the
precision is in ms but the measurements fire a lot less often, say once
per five minutes (that's the test case in question).

There are two ways to deal with this, neither of them implemented by this patch:
- Look at the delta between the first and the last timestamp and the number
of records and make an educated guess from there.
- Add 'period' as a function argument and let the caller decide, using the
hardcoded values simply as educated guesses.

These approaches can be combined, but that is left as an exercise for the reader.
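
One way the two approaches might be combined is sketched below in Python. This is not part of the patch. `guess_period`, the `season` parameter (assumed here to be one day), and the `DEFAULT_PERIODS` table are all hypothetical, and the numbers in the table are placeholders rather than the defaults this patch adds.

```python
from datetime import datetime, timedelta

# Hypothetical fallback table (observations per seasonal cycle).
# Placeholder values for illustration, not the defaults from this patch.
DEFAULT_PERIODS = {"day": 7, "hr": 24, "min": 1440, "sec": 3600, "ms": 1000}


def guess_period(timestamps, gran, period=None, season=timedelta(days=1)):
    """Pick a period: caller's value, else a data-based guess, else a default."""
    # Approach 2: the caller knows best, so an explicit period always wins.
    if period is not None:
        return period
    # Approach 1: estimate the mean interval from the first and last
    # timestamp and the record count, then count intervals per season.
    if len(timestamps) >= 2:
        span = max(timestamps) - min(timestamps)
        if span > timedelta(0):
            mean_interval = span / (len(timestamps) - 1)
            estimate = round(season / mean_interval)
            if estimate >= 2:
                return estimate
    # Last resort: the hardcoded value for the detected granularity.
    return DEFAULT_PERIODS[gran]


# Synthetic data: three days of ms-precision timestamps, 5 minutes apart.
start = datetime(2019, 7, 18, 0, 0, 0, 123000)
stamps = [start + timedelta(minutes=5 * i) for i in range(865)]

print(guess_period(stamps, "ms"))             # data-based guess
print(guess_period(stamps, "ms", period=12))  # caller override
```

Note that the data-based guess counts records, not stations: when several stations report in the same bunch, the mean interval shrinks accordingly, which is one more reason to let the caller override it.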
@CLAassistant

CLAassistant commented Jul 18, 2019

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

2 participants