OTel Cassandra/Elasticsearch Exporter queue defaults #2533
Conversation
Signed-off-by: Joe Elliott <number101010@gmail.com>
Codecov Report
@@            Coverage Diff             @@
##           master    #2533      +/-   ##
==========================================
+ Coverage   95.32%   95.33%   +0.01%
==========================================
  Files         208      208
  Lines        9248     9248
==========================================
+ Hits         8816     8817       +1
+ Misses        355      354       -1
  Partials       77       77
Continue to review full report at Codecov.
Worth noting that the size 50 is pulled completely out of thin air :-). There's probably some queueing theory approach one could use, but we haven't done any testing with that. It's actually a multi-variate optimization: while the worker count could be tuned for a single collector w.r.t. its CPU utilization, it's far less clear what impact this would have on the storage. For example, a Cassandra connection supports up to 128 (if not more) parallel streams, which could suggest a matching number of worker threads, but I am sure the Cassandra servers would not be happy with that many (times the number of collectors). Quite curious if someone has an idea what formula could be used here.
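Not something from this thread, but a common starting point for the formula question is Little's Law: the number of requests in flight equals the arrival rate times the average service time. A rough sizing sketch, where the throughput and latency numbers are purely illustrative:

$$
N_{\text{workers}} \approx \lambda \cdot W,
\qquad \text{e.g. } 5000\ \tfrac{\text{batches}}{\text{s}} \times 0.01\ \text{s} = 50
$$

This only bounds a single collector; the backend-side budget raised above would still cap the total, roughly the per-backend in-flight limit divided by the number of collectors.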
I am unsurprised that 50 isn't rigorously proven to give the best performance per collector :). The main reason I think this PR is important is that it makes the OTel collector/ingester layers perform roughly 1:1 with the current collector/ingester. Before, they were doing half the work or less, which may be a surprise to someone trying to swap to the new OTel-collector-based pipeline.
@joe-elliott Seems reasonable to have the same defaults as currently used in Jaeger. Just to confirm, these defaults can still be overridden using the CLI flags?
@objectiser These defaults can be overridden using the OTEL config, but not the Jaeger CLI. Currently the Jaeger CLI options
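For reference, a sketch of what such an override could look like in the OTel collector YAML. The exporter key (`jaeger_cassandra`) and the exact values shown are assumptions to be checked against the Jaeger OTel build in use; the `sending_queue` fields follow the collector exporterhelper convention:

```yaml
exporters:
  jaeger_cassandra:        # assumed exporter name; substitute the one from your build
    sending_queue:
      num_consumers: 50    # worker goroutines draining the queue (Jaeger collector's default worker count)
      queue_size: 2000     # max number of batches held in the queue (assumed to match Jaeger's default)
```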
Which problem is this PR solving?
While working on #2494 I discovered that the default OTel config was pushing far fewer spans per second to my Scylla backend. After some research it turned out this is because the default OTel collector config only has 10 workers, while the default Jaeger collector config has 50.
Short description of the changes
This PR applies the default Jaeger collector queue settings to both the Cassandra and Elasticsearch exporters. These are the two "production" exporters supported.
This does bring up another concern that might make sense to discuss as part of this issue. By default the ingester pulls from the Kafka queue and immediately pushes its spans into the configured backend, but an equivalent no-queue configuration of the OTel collector does not seem possible at the moment. The queue can technically be disabled, but performance is abysmal. This requires more research.
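For completeness, a hedged sketch of switching the exporterhelper queue off, which is the closest analogue to the ingester's pass-through behavior (same caveat about the exporter key name, and bearing in mind the poor throughput observed above):

```yaml
exporters:
  jaeger_cassandra:        # assumed exporter name
    sending_queue:
      enabled: false       # push spans synchronously instead of queueing them
```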
There are certainly other ways to approach "fixing" this, including leaving it alone and requiring the user to change the configuration.