Flux stops reading results on first shard with no data #20035

Closed · Techmaster21 opened this issue Nov 13, 2020 · 9 comments · Fixed by #20064
Labels: area/2.x (OSS 2.0 related issues and PRs), kind/bug, team/query

@Techmaster21 commented Nov 13, 2020

Steps to reproduce:

  1. Download tests.zip; it contains two annotated CSV files, test1.csv and test4.csv.
  2. Add test1.csv to InfluxDB using influx write (see the sketch after this list). Note that there are 80 records in this file spanning a year. Confirm that there are now 80 records in this measurement using the following query:
from(bucket: "test")
  |> range(start: -1y)
  |> filter(fn: (r) => r["_measurement"] == "Measurement1")
  |> drop(columns: ["team", "scope"])
  |> count()
  3. Now add test4.csv to InfluxDB and run the query above again. Note that there are now only 18 records, even though test4.csv only adds records to Measurement4.
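For anyone scripting the repro, the write steps presumably look something like this (a sketch; the bucket name test matches the query above, and the file paths are wherever you extracted tests.zip):

# Step 2: write the first annotated CSV export into the "test" bucket
influx write --bucket test --format csv --file test1.csv

# Step 3: later, write the second file the same way
influx write --bucket test --format csv --file test4.csv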

Expected behavior:
When two records have the same timestamp and tags, I expect one to overwrite the other without affecting other measurements.

Actual behavior:
One overwrites the other, but queries against other measurements then return incomplete results.

Environment info:

  • System info: Linux 4.19.76-linuxkit x86_64
  • InfluxDB version: InfluxDB 2.0.1 (git: 37cc047) build_date: 2020-11-11T03:53:31Z
  • Other relevant environment details: Running using the docker image on Windows 10 using HyperV containers
@Techmaster21 Techmaster21 changed the title Adding a record with the same timestamp and tags corrupts the database Adding a record with the same timestamp and tags corrupts the bucket Nov 14, 2020
@danxmoran danxmoran added the area/2.x OSS 2.0 related issues and PRs label Nov 16, 2020
@danxmoran (Contributor)

Reproduced, investigating now...

@danxmoran (Contributor)

Strange data point: if I remove drop(columns: ["team", "scope"]) from the query, the output after writing test4 is:

Result: _result
Table: keys: [_start, _stop, _field, _measurement]
                   _start:time                      _stop:time           _field:string     _measurement:string                  _value:int
------------------------------  ------------------------------  ----------------------  ----------------------  --------------------------
2019-11-17T11:43:41.599734000Z  2020-11-16T17:43:41.599734000Z                  timing            Measurement1                          18
Table: keys: [_start, _stop, _field, _measurement]
                   _start:time                      _stop:time           _field:string     _measurement:string                  _value:int
------------------------------  ------------------------------  ----------------------  ----------------------  --------------------------
2019-11-17T11:43:41.599734000Z  2020-11-16T17:43:41.599734000Z                  timing            Measurement1                          33
Table: keys: [_start, _stop, _field, _measurement]
                   _start:time                      _stop:time           _field:string     _measurement:string                  _value:int
------------------------------  ------------------------------  ----------------------  ----------------------  --------------------------
2019-11-17T11:43:41.599734000Z  2020-11-16T17:43:41.599734000Z                  timing            Measurement1                          29

The counts sum to 80.
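(As a cross-check, Flux itself can merge those per-table counts; a sketch, assuming the same bucket and range as above, that should report the expected 80:)

from(bucket: "test")
  |> range(start: -1y)
  |> filter(fn: (r) => r["_measurement"] == "Measurement1")
  |> count()
  |> group(columns: ["_measurement", "_field"])
  |> sum()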

@danxmoran (Contributor)

@Techmaster21 I'm not sure if this is the only error, but I think part of the problem is that the headers in your CSVs are malformed, so the CSV-to-line-protocol converter is dropping a lot of data before sending it to influxd. Could you try replacing the initial commas in your headers with spaces? i.e.

- #group,FALSE,FALSE,TRUE,TRUE,FALSE,FALSE,TRUE,TRUE,TRUE,TRUE,TRUE
+ #group FALSE,FALSE,TRUE,TRUE,FALSE,FALSE,TRUE,TRUE,TRUE,TRUE,TRUE
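(Side note on why the space might matter: influx write accepts two annotation flavors. Flux annotated CSV, which the UI exports, keeps the annotation name in its own comma-separated column, while the CLI's extended annotations put a space after the annotation name. A sketch from memory of the docs, so treat the exact column lists as illustrative:)

Flux annotated CSV (UI export style; the annotation name is its own column):
#group,false,false,true,true,false,false,true,true,true,true,true
,result,table,_start,_stop,_time,_value,_field,_measurement,closed,scope,team

Extended annotation style (space after the annotation name):
#datatype measurement,tag,tag,double,dateTime:RFC3339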

@Techmaster21 (Author)

@danxmoran I tried that and got the following error when attempting to add the CSV to influx:

 info    WARNING: at most one dateTime column is expected, 'table' column is ignored
Error: Failed to write data: line 5: column '_time': strconv.ParseFloat: parsing "2020-03-12T18:04:08.95Z": invalid syntax.

I'd like to point out that the CSV files I included came directly from the "download csv" button in the UI, with some simplification to find a more minimal case, but I never touched the headers.

Furthermore, I have some new information suggesting this is unrelated to CSV issues or the duplicate timestamp.
If you add the following to a bucket (I used the UI -> buckets -> add data -> line protocol -> set precision to milliseconds -> enter manually):

Measurement1 timing=56.96377314814815 1597944509870
Measurement1 timing=0.0 1598992534397
Measurement1 timing=12.91412037037037 1597862094673
Measurement1 timing=33.96895833333333 1601412291807
Measurement1 timing=0.013784722222222223 1598991442593
Measurement1 timing=21.070949074074075 1598373708313
Measurement1 timing=9.086516203703704 1597869159047
Measurement1 timing=13.835648148148149 1597944002557
Measurement1 timing=42.023587962962964 1592323982273
Measurement1 timing=33.82604166666667 1598373617570
Measurement1 timing=50.13054398148148 1592324004273
Measurement1 timing=113.10400462962963 1604520205927
Measurement1 timing=0.999988425925926 1603806142050

You get all thirteen datapoints when running the following:

from(bucket: "test")
  |> range(start: -1y)
  |> filter(fn: (r) => r["_measurement"] == "Measurement1")
  |> filter(fn: (r) => r["_field"] == "timing")

However, if you add the following two datapoints to the same bucket:

Measurement2 timing=0.9970833333333333 1593012434807
Measurement2 timing=124.21572916666666 1594822596920

Then re-running the previous query results in only 2 datapoints.

Also of note: my original issue occurred via the Java client, not line protocol or CSV files, but in the interest of providing a minimal example (and since the results are consistent regardless of how the data is entered), I converted it to a CSV.

@danxmoran (Contributor)

Thanks for all the extra info. I'd better file a separate issue to either fix the CSV docs or fix the download logic 😓. I'll keep digging on this weird behavior.

@danxmoran (Contributor) commented Nov 16, 2020

I'm unable to reproduce using influx write and influx query; will try via the UI next. Edit: I was able to reproduce by using --precision ms.

@danxmoran (Contributor)

Reproduced in the UI. I was also able to reproduce via the CLI once I used the right precision (see the sketch below).
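Presumably the working CLI repro looks something like this (a sketch; points1.lp and points2.lp are hypothetical files holding the two line-protocol batches from the earlier comment):

# The timestamps are millisecond epoch values, so the write precision must be
# set to ms explicitly (the CLI default is ns).
influx write --bucket test --precision ms --file points1.lp

# All 13 Measurement1 points come back at this stage:
influx query 'from(bucket: "test") |> range(start: -1y) |> filter(fn: (r) => r["_measurement"] == "Measurement1") |> filter(fn: (r) => r["_field"] == "timing")'

# After writing the two Measurement2 points, the same query returns only 2 points:
influx write --bucket test --precision ms --file points2.lp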

@stuartcarnie (Contributor)

Hi @Techmaster21, thanks for the report!

For completeness, I want to clarify some details based on the tests.zip file.

  • The two CSV files import data into separate measurements.

  • test1.csv imports 80 unique records into Measurement1

  • test4.csv imports 38 records; however, lines 19 and 20 share the same timestamp and series:

    ,result,table,_start,_stop,_time,_value,_field,_measurement,closed,scope,team
    ,,0,2019-11-14T12:01:16.341288Z,2020-11-13T18:01:16.341288Z,2020-08-27T17:49:59.56Z,7.235046296,timing,Measurement4,FALSE,MyScope,MyTeam
    ,,0,2019-11-14T12:01:16.341288Z,2020-11-13T18:01:16.341288Z,2020-08-27T17:49:59.56Z,6.945694444,timing,Measurement4,FALSE,MyScope,MyTeam
    

    The result is 37 rows in Measurement4.

I queried the data with Flux and then InfluxQL to reduce the likelihood that ingest is the issue.

Flux query of Measurement1:

Returned a sum of 18. As @danxmoran mentioned earlier (and I can confirm), removing the drop returns three separate tables whose counts sum to the correct total.

curl -H 'content-type:application/vnd.flux' -H "Authorization: Token ${INFLUX_TOKEN}" ${INFLUX_HOST}/api/v2/query\?org\=my-org --data-binary 'from(bucket: "my-bucket") |> range(start: -1y) |> filter(fn: (r) => r._measurement == "Measurement1") |> drop(columns: ["team", "scope"]) |> count()'
,result,table,_start,_stop,_field,_measurement,_value
,_result,0,2019-11-17T21:48:16.04191Z,2020-11-17T03:48:16.04191Z,timing,Measurement1,18

Flux query of Measurement4:

Returned a sum of 1.

curl -H 'content-type:application/vnd.flux' -H "Authorization: Token ${INFLUX_TOKEN}" ${INFLUX_HOST}/api/v2/query\?org\=my-org --data-binary 'from(bucket: "my-bucket") |> range(start: -1y) |> filter(fn: (r) => r._measurement == "Measurement4") |> drop(columns: ["team", "scope"]) |> count()'
,result,table,_start,_stop,_field,_measurement,_value
,_result,0,2019-11-17T21:53:08.244168Z,2020-11-17T03:53:08.244168Z,timing,Measurement4,1

I re-ran the queries with InfluxQL and can confirm the data was imported and written correctly.

InfluxQL query of Measurement1:

curl -u ${INFLUXV1_USER}:${INFLUXV1_PASS} -H Content-Type:application/vnd.influxql -H Accept:text/plain ${INFLUX_HOST}/query\?db=my-bucket --data-binary "select count(timing) from Measurement1"
name: Measurement1
time                          count
----                          -----
1970-01-01 00:00:00 +0000 UTC 80

InfluxQL query of Measurement4

curl -u ${INFLUXV1_USER}:${INFLUXV1_PASS} -H "jaeger-debug-id: stuart" -H Content-Type:application/vnd.influxql -H Accept:text/plain ${INFLUX_HOST}/query\?db=my-bucket --data-binary "select count(timing) from Measurement4"
name: Measurement4
time                          count
----                          -----
1970-01-01 00:00:00 +0000 UTC 37

@stuartcarnie
Copy link
Contributor

@Techmaster21 a fix is coming in PR #20064.

@danxmoran danxmoran changed the title Adding a record with the same timestamp and tags corrupts the bucket Flux stops reading results on first shard with no data Nov 17, 2020