JSON Implementation is not standard #46

Closed
stevvooe opened this issue Mar 1, 2010 · 8 comments

Comments

@stevvooe
stevvooe commented Mar 1, 2010

Among other problems, I have noticed the following:

  1. Floating-point values in quotes, such as "1.0", cause null pointer exceptions when sent as document fields. These should just be treated as regular strings. Here is the error message:
    "error" : "Index[...] Shard[1] ; nested: Failed to parse; nested: Current token (VALUE_STRING) not numeric, can not use numeric value accessors\n at [Source: {..., "f": "1.0",...}; line: 1, column: 32]; "
  2. Integers in the status response are placed in strings. This is more of an annoying detail than a bug.

I tested this with JSON generated by the standard Python json package, which is confirmed to be compatible with in-browser JSON and PHP JSON implementations. From the tracebacks I have seen, the problem is either in the JsonObectMapper or possibly the flexjson package, but I haven't looked into it in detail.
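
For concreteness, a minimal sketch of the distinction being described, using the standard json module (the field name "f" is only illustrative, taken from the error message above):

```
import json

# "1.0" in quotes serializes as a JSON string; the unquoted form is a number.
print(json.dumps({"f": "1.0"}))  # {"f": "1.0"}
print(json.dumps({"f": 1.0}))    # {"f": 1.0}
```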

Let me know if you need some more examples.

Thanks for the great project!

@kimchy
Member

kimchy commented Mar 1, 2010

Hi,

  1. Can you paste an example of an API where you use it? It might be API (or rather, parsing of an API) specific.
  2. Which status response? Can you post an example?

@stevvooe
Author

stevvooe commented Mar 1, 2010

  1. I haven't been able to reproduce this consistently. I use a Python script
    like this (although I have seen it with curl):

        import json
        import httplib2

        c = httplib2.Http()

        url = "http://localhost:9200/twitter/tweet/123413241234"
        # note the quoted float in the "f" field; "stuff" is a 100-character filler string
        body = json.dumps({"an_id": 123413241234, "f": "1.0", "stuff": 'a' * 100})
        headers = {'Content-Type': 'application/json; charset=UTF-8'}

        response, content = c.request(url, 'PUT', body=body, headers=headers)
        print content
        if response.status != 200:
            print "response:", response
            print "content:", content
            print "BODY:", repr(body)
            print "==================="

    It seems like the problem might lie in changing a field's type, but it
    doesn't happen until I have done a large number of updates. Currently, I am
    running a test to see if this is true. I'll get back to you on this.
  2. I made a mistake here. I was printing an httplib2 library response dictionary;
    it seems not to have used integers. My apologies.

Sorry about the shoddy bug report; I wrote this when I was way too tired.
Usually I am much more thorough.

@kimchy
Member

kimchy commented Mar 1, 2010

First of all, I prefer "shoddy" bug reports to no bug reports :). They usually uncover something that can be simplified. Regarding your first point, which version are you running against: vanilla 0.4 or master?

@stevvooe
Author

stevvooe commented Mar 1, 2010

I have high standards for my bug reports ;). But I do suspect something odd is
going on here.

I am running against 0.4.0. I'll look into upgrading to master. Do you have
a link to the build process?

These are probably unrelated, but here are various snippets of information that
might help:

  1. I can only get this issue after having the server run for a very long time. I
    don't know if this is based on time or on the number or nature of operations. I
    suspect the latter.
  2. At times, I can only reproduce the issue with python's httplib client and not
    using curl, although I have seen it with curl. A tcpdump of the packets shows that
    the python requests are valid and going out as expected.
  3. Python sends the body and headers of a PUT request in different packets,
    whereas curl sends them together. This is the default behaviour of the
    python standard httplib module, so urllib, urllib2 and httplib2 will all
    behave in the same way. I thought this might have been part of the problem,
    but I no longer see the exception after a full restart.
  4. I am using Ubuntu 9.10 with the Java interpreter from sun-java6-jdk. The
    executable is being run in console mode (-f). The first time I saw the
    problem was with logging level debug, although I think that I have seen it
    with logging set to info.
  5. I have increased the socket buffers with the following command:
    `sudo sysctl net.core.rmem_max=$((26*1024*1024)) net.core.wmem_max=$((25*1024*1024))`

@kimchy
Member

kimchy commented Mar 1, 2010

If you can, it would be great if you could download master and check. The download section has a link to master and instructions on how to build it. Some of the changes in master include much improved support for mappings and support for chunked HTTP requests (though, for performance reasons, you should probably do without them).

(Vizzini said) go back to the beginning. Do you define an explicit mapping with type string and then try to index a float number into it? Or do you rely on the first indexed document to set the field to type string, and then get the float parsing exception?
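
As a rough illustration of the second scenario, in the same httplib2 style as the reproduction script above (the index, type, and field names are just the placeholders already used in this thread, not a confirmed failing case):

```
import json
import httplib2

c = httplib2.Http()
headers = {'Content-Type': 'application/json; charset=UTF-8'}

# First document: with no explicit mapping, the quoted value would lead
# dynamic mapping to treat "f" as a string field.
c.request("http://localhost:9200/twitter/tweet/1", 'PUT',
          body=json.dumps({"f": "1.0"}), headers=headers)

# Second document: "f" now arrives as a number; whether this is accepted or
# rejected depends on the mapping established by the first document.
response, content = c.request("http://localhost:9200/twitter/tweet/2", 'PUT',
                              body=json.dumps({"f": 1.0}), headers=headers)
print(content)
```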

@stevvooe
Author

stevvooe commented Mar 1, 2010

Will do on the master version.

No mappings were defined. It's possible that the field type changed (i.e. I indexed it one way and then changed it), but I do not get a float parse exception.

@stevvooe
Author

stevvooe commented Mar 9, 2010

So, I took a little time to test master and I definitely don't see this issue. We'll go ahead and close this and I will keep an eye out.

Thanks for the time and keep up the good work.

@kimchy
Member

kimchy commented Mar 9, 2010

Cool, thanks for the effort!

dadoonet added a commit that referenced this issue Jun 5, 2015
I noticed the documentation recommends a non-durable local resource for the Elasticsearch data path. Although this is acceptable for some deployments, it might be worth warning people that the path is not durable and there is potential for data loss; even with replicas, data loss is theoretically possible.

```
# recommended
path.data: /mnt/resource/elasticsearch/data
```

Alternatively, the user could attach and use data disks, which do come with a significant performance tradeoff, but premium storage options with higher IOPS have been announced and are right around the corner.

Closes #46.
njlawton pushed a commit to njlawton/elasticsearch that referenced this issue Mar 15, 2017
ClaudioMFreitas pushed a commit to ClaudioMFreitas/elasticsearch-1 that referenced this issue Nov 12, 2019
Systemd consistent with Puppet
henningandersen pushed a commit to henningandersen/elasticsearch that referenced this issue Jun 4, 2020
With this commit we measure maximum throughput in a separate step. This ensures
that the system shows less fluctuation in throughput when in throttled mode.

Relates elastic#46
cbuescher pushed a commit to cbuescher/elasticsearch that referenced this issue Oct 2, 2023
Occasionally, executing night rally may fail because a user forgot a shell
session or a less command, resulting in open files under the races data
volume directory. In this case the Ansible role fails in the
subsequent task while attempting to unmount the disk volume [1].

Terminate processes with open file descriptors against the data disk
volume when running the initialize-data-disk/encryption-at-rest
fixtures.

[1]
https://groups.google.com/a/elastic.co/d/msg/build-reply/ZjV86sd4V5g/gYNum30BAgAJ

Relates elastic#46
This issue was closed.