JSON ints sometimes being mapped to floats #27935

Closed
Fluxx opened this issue Dec 20, 2017 · 4 comments
Labels: >bug, feedback_needed, :Search Foundations/Mapping, Team:Search Foundations


Fluxx commented Dec 20, 2017

Elasticsearch version (bin/elasticsearch --version): Version: 5.6.2, Build: 57e20f3/2017-09-23T13:16:45.703Z, JVM: 1.8.0_131

Plugins installed: []

JVM version (java -version):

openjdk version "1.8.0_131"
OpenJDK Runtime Environment (build 1.8.0_131-8u131-b11-2ubuntu1.16.04.3-b11)
OpenJDK 64-Bit Server VM (build 25.131-b11, mixed mode)

OS version (uname -a if on a Unix-like system): Linux ip-10-0-16-177 4.4.0-1035-aws #44-Ubuntu SMP Tue Sep 12 17:27:47 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

Description of the problem including expected versus actual behavior:

We have an index which is created daily at midnight UTC. In that index there is a field which is logged as a JSON integer, which is sometimes correctly dynamically mapped in the index to a long, and other times is mapped incorrectly as a float. There is only one log message logged by the service logging to this index, and that field is always populated with a long value.

Other fields from the same structure logged at the same time are also logged as JSON integers, but have their index fields created as floats. I was unable to find a recent day-based index where these other fields mapped to anything other than a float.

It's not clear to me why this is the case. From my understanding, the first document indexed when the index is created sets the types of the fields, and the first document should have set that field as a long. A test where I create a new test index and then POST a document into it maps all fields as long, as expected.

Steps to reproduce:

I cannot reliably reproduce this, as it seems to happen only in our production environment. I'm mostly looking for guidance as to how a logged JSON int could be mapped to a float, or whether the only way that is possible is that we actually logged a JSON float somehow, in which case I should perhaps dig into our production system.

jpountz added the feedback_needed, :Search Foundations/Mapping, and >bug labels Dec 21, 2017
Contributor

jpountz commented Dec 21, 2017

Can you try to identify the document that triggers this behaviour? I suspect it might have e.g. 3.0 instead of 3 as a value, which makes Elasticsearch index it as a float. Maybe check whether you have a proxy between your client and Elasticsearch that might rewrite the JSON document and transform something that is an integer when it leaves the client side into a float when it reaches Elasticsearch.
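To illustrate the suspicion above: dynamic mapping keys off the JSON token, so `3` and `3.0` end up as different field types. The sketch below mimics that rule in Python (whose `json` module also distinguishes the two tokens) and scans a batch of raw JSON lines for the first one that carries a float where an integer was expected. The field name `count`, the sample lines, and both helper names are hypothetical, and this is a simplified model rather than Elasticsearch's actual implementation.

```python
import json


def infer_dynamic_type(value):
    """Simplified model of how a JSON value is dynamically mapped.

    Assumption: integers map to "long" and floating point numbers to "float".
    """
    # bool is a subclass of int in Python, so it has to be checked first.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "float"
    return "other"


def find_first_float(raw_lines, field):
    """Return (index, raw_line) for the first document whose `field`
    parses as a float, or None if every document carries an integer."""
    for i, line in enumerate(raw_lines):
        doc = json.loads(line)
        if infer_dynamic_type(doc.get(field)) == "float":
            return i, line
    return None


# Hypothetical raw log lines as they would leave the client.
raw_logs = ['{"count": 3}', '{"count": 3.0}', '{"count": 7}']
culprit = find_first_float(raw_logs, "count")
```

Running a check like this on the raw payloads (before they reach Elasticsearch) matters because the stored `_source` only helps if the offending document is still in the index being inspected.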

cc @elastic/es-search-aggs

Author

Fluxx commented Dec 21, 2017

@jpountz thanks for the reply. I did spend some time trying to find documents in the index which were indexed with a float value, and was unable to. I'm fairly confident this is not occurring, as the particular field is one extracted programmatically out of a Thrift struct, whose field is the i32 Thrift type. Though I do acknowledge that a rogue document being indexed with a float value there would be the likely culprit.

We do not have a proxy between our logging application and Elasticsearch either.

@polyfractal
Contributor

Do you have any index or dynamic templates configured?

Is there an application processing the Thrift data structure and converting it to JSON to send to ES? Or are you using something like Logstash?

I ask because various languages have weird edge cases. For example, PHP (everyone's favorite punching bag) will cast an integer to a double if it is greater than PHP_INT_MAX so as to prevent overflow. There are also oddities in how numerics are encoded to JSON in PHP: it implicitly truncates 3.0 to 3 unless you tell it otherwise.

Perhaps your language has similar oddities?
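As one illustration of such an oddity (the thread does not say which language the logging service uses, so Python here is only an example): true division always yields a float, even when both operands are integers and the result is whole, and the JSON encoder then emits `3.0` rather than `3`. The variable names below are hypothetical.

```python
import json

total, parts = 6, 2

# `/` always returns a float in Python 3, even for a whole result.
true_div = total / parts
# `//` keeps integer operands as an integer.
floor_div = total // parts

# The encoder preserves the distinction, so the two payloads differ.
as_float = json.dumps({"value": true_div})
as_int = json.dumps({"value": floor_div})
```

Here `as_float` is `{"value": 3.0}` and `as_int` is `{"value": 3}`, so a single code path that averages or scales a counter is enough to send a float token for a field that is conceptually an integer.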

If you know the mappings ahead of time (since you have the thrift schema) setting explicit mappings in ES would be the easiest solution. If you make the mappings strict, any future rogue documents will throw an exception and you might be able to identify what the culprit is.
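A minimal sketch of what such an explicit, strict mapping body could look like, built as a Python dict so it can be serialized and sent with any client. The mapping type name `log` and the field name `count` are hypothetical; the nesting under a type name reflects the 5.x series used in this report (later major versions dropped mapping types), and `integer` is assumed as the natural match for a Thrift i32.

```python
import json


def strict_mapping(doc_type, fields):
    """Build an index-creation body with explicit field types.

    "dynamic": "strict" makes Elasticsearch reject documents that
    contain fields not declared under "properties".
    """
    return {
        "mappings": {
            doc_type: {
                "dynamic": "strict",
                "properties": {
                    name: {"type": field_type}
                    for name, field_type in fields.items()
                },
            }
        }
    }


# Hypothetical type and field names; adjust to the real Thrift schema.
body = strict_mapping("log", {"count": "integer"})
payload = json.dumps(body, indent=2)
```

For an index that is recreated daily, a body like this would presumably go into an index template rather than a one-off index creation request, so that every new daily index picks it up.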

Member

javanna commented Mar 16, 2018

We were unable to identify the cause here, most likely a client problem. Closing, but feel free to reopen if you find out more.

javanna closed this as completed Mar 16, 2018
javanna added the Team:Search Foundations label Jul 16, 2024