Continuous Query with derivative fails silently #3247
Possible dupe of #3221
For me (running 0.9.1 installed via brew on MacOS 10.10.4 w/ go 1.4.2), continuous queries work except for the derivative query.
@kruckenb if you run the derivative query ad hoc (not as a CQ), does it return the expected results?
@beckettsean Yes
Re: #3000 I repeated the steps above but inserted 1.1, 2.2, 3.3, 4.4, 5.5. Results look the same, so hard to say whether float data is the cause:
Thanks, @kruckenb, I think we can rule out #3000 as a contributor to the behavior you're seeing. I wonder if the first point written had an integer value. What happens if you insert data directly into the target series?
@beckettsean I tried the following with rate value=1 and rate value=1.0, same results either way:
Doesn't look like int vs. float is the issue.
@beckettsean This issue (at least, the one I reported in the comment above, #3247 (comment)) appears to be fixed in 0.9.2. I'm not sure which commit fixes it, but it works.
@beckettsean I spoke too soon. In 0.9.2 the CQ with derivative still fails.
@beckettsean After more fiddling, I think the derivative CQ is still not working correctly.
I can confirm that CQs with derivative were not working at all in 0.9.1. I just installed 0.9.2-rc1 and they are executed now. However, all values are 0, so they are still broken:
@gucki @beckettsean Yes, I ran into the same thing. The problem is the way the CQ service sets the time range for the CQ (here: https://github.com/influxdb/influxdb/blob/master/services/continuous_querier/service.go#L233). You'll end up with 0/nil measurements periodically, because the CQ setTimeRange window won't line up with every sampling interval, so you end up with empty buckets. I think the fix is to get rid of the recomputed intervals.
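To make the empty-bucket behavior concrete, here is a minimal sketch (not InfluxDB's actual code): derivative() needs at least two aggregated points, so a CQ window that covers only a single GROUP BY interval yields one bucket and nothing to difference against.

```python
def derivative(points, unit=1):
    """points: list of (time, value) pairs; returns the per-`unit` rate
    between consecutive points, like InfluxQL's derivative(..., unit)."""
    return [(t2, (v2 - v1) / ((t2 - t1) / unit))
            for (t1, v1), (t2, v2) in zip(points, points[1:])]

# A CQ run that sees a single 1m bucket produces no derivative rows:
assert derivative([(60, 100)]) == []

# Widening the window by one interval (two buckets) produces a value:
assert derivative([(0, 100), (60, 160)], unit=1) == [(60, 1.0)]
```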
I did all my previous tests with the default configuration:
I changed that too and restarted influx:
However it didn't change anything, still only 0 values:
There are even some duplicate timestamps now; I reported that already in other issues.
@gucki I found that I had to adjust a configuration setting. That fixed most of the problem, but I still had buckets that were empty because of how the intervals are computed. #3383 fixes that.
@gucki I noticed that dups happen when influx is restarted. Not sure if that explains all of your dups.
I constantly get invalid/dup values (#3373) without restarting influx. I just tried to use an aggregated table for the derivative CQ in order to avoid the sample-rate inaccuracies (the aggregated table only contains well-aligned/rounded timestamps).
But now the CQ doesn't seem to run at all again. In the logs I can see:
Using your idea of an increased GROUP BY interval indeed works:
Besides fixing any actual bugs, what people generally want for fixing derivative calculation (CQ or not) is probably #3273.
Not fixed yet? We have this problem in 0.9.3 through 0.9.4.2.

Continuous query definition (switch_traffic_1m_mean):

CREATE CONTINUOUS QUERY switch_traffic_1m_mean ON livecloud BEGIN SELECT derivative(last(counter), 1s) AS "rate" INTO "livecloud"."rp_1d".switch_traffic_1m_mean FROM "livecloud"."default".switch_traffic_raw GROUP BY time(1m), * END

The ad hoc CLI result is correct, but all 'rate' values in the CQ result are zero:

[raw data]
QL: SELECT * FROM switch_traffic_raw WHERE time >= 1444400400s AND time <= 1444400580s AND switch_id = '2' AND instance_id = '5' AND instance_type='rx_bytes'
name: switch_traffic_raw
------------------------
time counter instance_id instance_name instance_type switch_id switch_mip
2015-10-09T14:20:29Z 568090710 5 MEth0/0/0 rx_bytes 2 69.28.56.252
2015-10-09T14:21:29Z 568101438 5 MEth0/0/0 rx_bytes 2 69.28.56.252
2015-10-09T14:22:28Z 568119778 5 MEth0/0/0 rx_bytes 2 69.28.56.252
[LAST result]
QL: SELECT last(counter) AS rate FROM switch_traffic_raw WHERE time >= 1444400400s AND time <= 1444400580s AND switch_id = '2' AND instance_id = '5' AND instance_type='rx_bytes' GROUP BY time(1m), *
name: switch_traffic_raw
tags: instance_id=5, instance_name=MEth0/0/0, instance_type=rx_bytes, switch_id=2, switch_mip=69.28.56.252
time rate
---- ----
2015-10-09T14:20:00Z 568090710
2015-10-09T14:21:00Z 568101438
2015-10-09T14:22:00Z 568119778
2015-10-09T14:23:00Z 568130654
[DERIVATIVE result]
QL: SELECT derivative(last(counter),1s) AS rate FROM switch_traffic_raw WHERE time >= 1444400400s AND time <= 1444400580s AND switch_id = '2' AND instance_id = '5' AND instance_type='rx_bytes' GROUP BY time(1m), *
name: switch_traffic_raw
tags: instance_id=5, instance_name=MEth0/0/0, instance_type=rx_bytes, switch_id=2, switch_mip=69.28.56.252
time rate
---- ----
2015-10-09T14:21:00Z 178.8
2015-10-09T14:22:00Z 305.6666666666667
2015-10-09T14:23:00Z 181.26666666666668
[CQ result]
QL: SELECT * FROM livecloud.rp_1d.switch_traffic_1m_mean WHERE time >= 1444400400s AND time <= 1444400580s AND switch_id = '2' AND instance_id = '5' AND instance_type='rx_bytes'
name: switch_traffic_1m_mean
----------------------------
time instance_id instance_name instance_type rate switch_id switch_mip
2015-10-09T14:20:00Z 5 MEth0/0/0 rx_bytes 0 2 69.28.56.252
2015-10-09T14:21:00Z 5 MEth0/0/0 rx_bytes 0 2 69.28.56.252
2015-10-09T14:22:00Z 5 MEth0/0/0 rx_bytes 0 2 69.28.56.252
@sharang please fix your formatting and/or post large chunks as gists - scrolling through pages of cruft makes the issue hard to read.
@pdf formatting fixed |
@jwilder still an issue in 0.9.5 nightlies. Can you investigate?
Seems to still be broken as of 0.9.5-nightly-6ecb62e.
whereas:
The odd thing is that derivative points calculated by testcq1 include plugin_instance vda1 and vdb1, which are no longer being inserted into the source measurement - as if testcq1 has warped back in time.
Workaround

Here is a small hack to circumvent the problem: using cron instead of CQ to calculate the derivatives, over a larger time window.
Using the script:
--> One needs to use date modulo 1m, otherwise the window will be shifted somewhat and will not show the correct values.

Log
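The "date modulo 1m" trick from the workaround above can be sketched as follows: truncate the query window to minute boundaries so it matches the GROUP BY buckets instead of drifting with the cron start time.

```python
def align_to_minute(epoch_seconds):
    # Truncate a Unix timestamp to the start of its minute, as the
    # `date` modulo-1m step in the script does.
    return epoch_seconds - (epoch_seconds % 60)

# 2015-10-09T14:20:29Z -> 2015-10-09T14:20:00Z (timestamps from the
# raw data shown earlier in this thread)
assert align_to_minute(1444400429) == 1444400400
assert align_to_minute(1444400400) == 1444400400
```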
@bbczeuz I see nothing wrong with your setup.
I think the following code can fix this bug (note: `err` must be declared before the `if`/`else` so the shared `if err != nil` check compiles):

diff --git a/services/continuous_querier/service.go b/services/continuous_querier/service.go
index 9e218bd..8694bfc 100644
--- a/services/continuous_querier/service.go
+++ b/services/continuous_querier/service.go
@@ -274,7 +274,13 @@ func (s *Service) ExecuteContinuousQuery(dbi *meta.DatabaseInfo, cqi *meta.Conti
 		startTime = startTime.Add(-interval)
 	}
-	if err := cq.q.SetTimeRange(startTime, startTime.Add(interval)); err != nil {
+	var err error
+	if cq.q.HasDerivative() {
+		err = cq.q.SetTimeRange(startTime.Add(-interval), startTime.Add(interval))
+	} else {
+		err = cq.q.SetTimeRange(startTime, startTime.Add(interval))
+	}
+	if err != nil {
 		s.Logger.Printf("error setting time range: %s\n", err)
 	}
@@ -297,7 +303,13 @@ func (s *Service) ExecuteContinuousQuery(dbi *meta.DatabaseInfo, cqi *meta.Conti
 	}
 	newStartTime := startTime.Add(-interval)
-	if err := cq.q.SetTimeRange(newStartTime, startTime); err != nil {
+	var err error
+	if cq.q.HasDerivative() {
+		err = cq.q.SetTimeRange(newStartTime.Add(-interval), startTime)
+	} else {
+		err = cq.q.SetTimeRange(newStartTime, startTime)
+	}
+	if err != nil {
 		s.Logger.Printf("error setting time range: %s\n", err)
 		return err
 	}
FWIW, issue is still present in 0.10.0-nightly-6ccc416. CQ appears to run, but all values are zero.
Raw data is sampled every minute, so I also tried a CQ with a larger GROUP BY interval.
Not fixed yet? This bug is still present in almost all branches. @beckettsean @otoolep
@sharang it is not fixed yet, that's why the issue is still open.
@beckettsean Fixed in v0.10.0; submitted pull request #5698.
Sadly still not fixed, discussion in #5733 |
For aggregate queries, derivatives will now alter the start time to one interval behind and will use that interval to find the derivative of the first point instead of giving no value for that interval. Null values will still be discarded so if the interval before the one you are querying is null, then it will be discarded like if it were in the middle of the query. You can use `fill(0)` to fill in these values. This does not apply to raw queries yet. Also modified the derivative and difference aggregates to use the stream iterator instead of the reduce slice iterator for space efficiency. Fixes #3247. Contributes to #5943.
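The null-handling described in the fix above can be sketched as follows (assumed semantics): null buckets are discarded before differencing unless fill(0) substitutes zeros for them first.

```python
def bucket_derivative(buckets, fill_zero=False):
    """buckets: list of (time, value-or-None) aggregated per interval.
    Mimics derivative over GROUP BY buckets: None buckets are dropped,
    unless fill_zero replaces them with 0 first (like fill(0))."""
    if fill_zero:
        buckets = [(t, 0 if v is None else v) for t, v in buckets]
    pts = [(t, v) for t, v in buckets if v is not None]
    return [(t2, v2 - v1) for (t1, v1), (t2, v2) in zip(pts, pts[1:])]

data = [(0, 10), (60, None), (120, 40)]
# Null bucket discarded: the derivative spans the gap.
assert bucket_derivative(data) == [(120, 30)]
# With fill(0), the null bucket becomes 0 and contributes values.
assert bucket_derivative(data, fill_zero=True) == [(60, -10), (120, 40)]
```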
Is this in the nightly build? I don't see this resolved in 0.13.0~n201604180800. A newly created CQ with a derivative still has a time range generated for the grouping:

[query] 2016/04/18 09:11:00 SELECT derivative(mean(value)) AS value INTO network."default".btest01 FROM network."default".vsvrTotalRequests WHERE time >= '2016-04-18T16:10:00Z' AND time < '2016-04-18T16:11:00Z' GROUP BY time(1m), *
@paulstuart the actual query won't be modified to change the times. The time change is done inside of the query engine itself and doesn't cause the condition to be rewritten.
Created this continuous query on my database running 0.9.1:
create continuous query deriv on increase_test begin select derivative(value) into derivative_value from increasing_value group by time(1m) end
The series increasing_value is a monotonically increasing value incremented at 5-second intervals via a shell script. The result is that the series derivative_value is never populated, and no errors are recorded in the log.
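A sketch of the kind of writer the reporter describes: a monotonically increasing counter for the `increasing_value` measurement, emitted as InfluxDB line protocol with nanosecond timestamps every 5 seconds (the base epoch here is arbitrary, chosen for illustration).

```python
def line_protocol_points(n, step=5, base=1444400400):
    # Each point's value is its index, so the series increases
    # monotonically; timestamps are Unix nanoseconds, `step` seconds
    # apart, ready to POST to InfluxDB's /write endpoint.
    return ["increasing_value value=%d %d" % (i, (base + i * step) * 10**9)
            for i in range(n)]

assert line_protocol_points(2) == [
    "increasing_value value=0 1444400400000000000",
    "increasing_value value=1 1444400405000000000",
]
```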