This repository has been archived by the owner on Jun 28, 2022. It is now read-only.
I have tried several different options for --bulk-bytes, -w, -d and -q, but the result is always the same: a constant indexing speed of ~5 MB/s, which translates to 4 hours to import the file. While indexing, the Elasticsearch cluster is heavily under-utilized and the stream2es server has a single core at 100%. I have done extensive testing to rule out network or Elasticsearch performance issues.
Workaround
My final solution was to run several stream2es processes in parallel (rather than using -w) to see if that would help.
That helped a lot. Now all 6 cores and 12 threads are at 100% and the indexing time fell from 4 hours to 35 minutes, but the Elasticsearch cluster is still pretty much idle. It seems to me that something in stream2es uses far more CPU than it should.
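For anyone wanting to try the same workaround, here is a rough sketch of one way to do it, not necessarily how it was done above. It assumes the input is newline-delimited (one document per line); the helper names `split_lines` and `run_parallel`, the chunk file naming, and the example command in the trailing comment are all illustrative assumptions, so check `stream2es --help` for the real invocation.

```python
import subprocess
from pathlib import Path


def split_lines(src: Path, parts: int, out_dir: Path) -> list[Path]:
    """Split a newline-delimited file into `parts` chunk files, round-robin by line."""
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = [out_dir / f"chunk-{i:02d}.json" for i in range(parts)]
    outs = [p.open("wb") for p in paths]
    try:
        with src.open("rb") as f:
            for n, line in enumerate(f):
                outs[n % parts].write(line)
    finally:
        for o in outs:
            o.close()
    return paths


def run_parallel(chunks: list[Path], command: list[str]) -> list[int]:
    """Start one process per chunk, feeding the chunk on stdin; return the exit codes."""
    procs = []
    for chunk in chunks:
        # The child inherits its own copy of the file descriptor,
        # so closing our handle right after Popen is safe.
        with chunk.open("rb") as stdin:
            procs.append(subprocess.Popen(command, stdin=stdin))
    return [p.wait() for p in procs]


# Hypothetical usage (file name, index URL and flags are assumptions, not from this issue):
# chunks = split_lines(Path("data.json"), 12, Path("chunks"))
# run_parallel(chunks, ["stream2es", "stdin", "--target", "http://localhost:9200/myindex/mytype"])
```

Each process gets its own JVM and its own pipeline, which is why this can use every core even when a single stream2es process does not.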
Thanks for reporting this @diadistis, and sorry for the terrible response time. I've noticed similar behavior and have used similar workarounds. I haven't had a chance to profile the internal design to isolate the bottleneck, but I suspect that, at the very least, the single LinkedBlockingQueue that feeds the pipeline is part of it.
I did just push a fix for some extraneous string copying, but it won't speed anything up 8x. If you still have this environment available I'd love to know its effect.