DISTINCT does not operate on string fields #6495
Comments
@jsternberg ^^ |
@jsternberg I created the data with the following:
|
@beckettsean I'm unable to reproduce this with 0.12.2. I'm using one of the pending Docker images to ensure I have a clean slate.
Can you confirm you're using the correct version? |
On checking, my system is actually 0.12.1. Is there reason to believe this is fixed on 0.12.2?
|
Still happens for me on 0.12.2:
|
Replicated on new data:
|
Not a syntax issue as far as I can tell:
|
I spun up a brand new instance, installed 0.12.2, and DISTINCT operates as expected:
I'm not sure what it is about my other installations that prevents this, but it appears that it cannot be reproduced on a brand new box. |
I looked at the underlying data set and I think it's caused by a side-effect of a bad decision in the underlying query engine. If a shard returns no iterators for a query, it returns a fake float iterator. When casting is done, floats are given priority and all non-float iterators are discarded, making it appear like there's no data. When selecting as a raw field, this doesn't happen since the work is done through auxiliary iterators which don't have this problem. I'll start working on a fix. |
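To make the mechanism above concrete, here is a minimal Python sketch of the behaviour as described in this comment. It is a toy model, not InfluxDB's actual Go code: `Iterator`, `shard_iterator_buggy`, `merge_with_float_priority`, and `distinct` are hypothetical names, and shards are modelled as plain dicts.

```python
from dataclasses import dataclass


@dataclass
class Iterator:
    kind: str      # "float", "integer", "string", or "boolean"
    values: list


def shard_iterator_buggy(shard, field):
    """Model of the old behaviour: a shard with no data for the field
    hands back a fake (empty) float iterator."""
    if field not in shard:
        return Iterator("float", [])
    kind, values = shard[field]
    return Iterator(kind, values)


def merge_with_float_priority(iterators):
    """Model of the casting step: if any float iterator is present,
    floats win, integers are cast, everything else is discarded."""
    if any(it.kind == "float" for it in iterators):
        merged = []
        for it in iterators:
            if it.kind == "float":
                merged.extend(it.values)
            elif it.kind == "integer":
                merged.extend(float(v) for v in it.values)
            # string and boolean iterators are silently dropped here
        return Iterator("float", merged)
    # Assumption for this sketch: otherwise all iterators share one kind.
    return Iterator(iterators[0].kind, [v for it in iterators for v in it.values])


def distinct(it):
    return sorted(set(it.values))


shards = [
    {"string": ("string", ["a", "b", "a"])},  # shard with string data
    {},                                       # shard with no data for the field
]
merged = merge_with_float_priority([shard_iterator_buggy(s, "string") for s in shards])
result = distinct(merged)
```

In this model the single empty shard is enough to turn the merged iterator into a float iterator, so the string values are dropped and `result` is empty, which matches the "no data" symptom reported in the thread.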
To be clear, this should happen with any aggregation function for anything lower than a float that has differing output types (it should not affect |
If a shard is empty for a specific field and the field type is something other than a float, a nil iterator would get returned from one of the empty shards and cause the combined iterators to be cast to the float type and all other iterator types to be discarded (or for integers, to be cast).

This is rare since most aggregates don't accept strings or booleans, but for queries like:

    SELECT distinct(string) FROM mydata

it would result in nothing getting returned if one of the shards didn't have a value for `string`.

This change modifies the query engine to return nil for the shards instead of a fake iterator and then to only use the fake iterator if the final aggregate iterator is nil (meaning that no iterators could be constructed for the field from any shard).

Fixes #6495.
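The shape of the fix described in this commit message can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the actual Go change: `shard_iterator`, `merge_iterators`, and `distinct` are hypothetical names, iterators are modelled as `(kind, values)` tuples, and all non-empty shards are assumed to agree on the field's type.

```python
def shard_iterator(shard, field):
    """Fixed behaviour (modelled): a shard with no data for the field
    returns None instead of a fake float iterator."""
    if field not in shard:
        return None
    return shard[field]  # a (kind, values) tuple


def merge_iterators(iterators):
    """Skip None entries; fall back to the fake float iterator only when
    no shard produced a real iterator for the field."""
    real = [it for it in iterators if it is not None]
    if not real:
        return ("float", [])  # fake iterator, used only as a last resort
    # Assumption for this sketch: every real iterator has the same kind.
    kind = real[0][0]
    return (kind, [v for _, values in real for v in values])


def distinct(iterator):
    _, values = iterator
    return sorted(set(values))


shards = [
    {"string": ("string", ["a", "b", "a"])},  # shard with string data
    {},                                       # shard with no data for the field
]
merged = merge_iterators([shard_iterator(s, "string") for s in shards])
result = distinct(merged)

# When every shard is empty, the fake float iterator is still produced.
all_empty = merge_iterators([shard_iterator(s, "string") for s in [{}, {}]])
```

Because the empty shard now contributes nothing rather than a float iterator, the merged iterator keeps its string type and `distinct` sees the stored values.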
Bug report
System info: [Include InfluxDB version, operating system name, and other relevant details]
InfluxDB 0.12.2 on Ubuntu 14.04, fresh install
Steps to reproduce:
DISTINCT
Expected behavior: [What you expected to happen]
Return from DISTINCT(<string>) would be the number of distinct strings.
Actual behavior: [What actually happened]
Return was null.
Additional info: [Include gist of relevant config, logs, etc.]