apmbench: Define benchmark scenarios and topologies #7858

Closed · Tracked by #7540
marclop opened this issue Apr 12, 2022 · 7 comments

marclop (Contributor) commented Apr 12, 2022

Description

Benchmark scenarios

We currently have 4 benchmarks that leverage the new event handler piece in apmbench to load pre-recorded APM Agent events and replay them against a target APM Server. The current benchmarks are split by language agent, but that isn't necessarily the best way to benchmark the APM Server. The currently generated data has been gathered from the existing opbeans applications, but those aren't necessarily the best type of application to use for our benchmarks either.

We should discuss what kind of benchmark scenarios we'd like to include to be run on a daily basis and the purpose they serve.
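For context, the replay loop these benchmarks implement is roughly shaped like the sketch below. This is not the actual benchtest API: the events file path, server URL, and benchmark name are placeholders, and a real run would also handle authentication, compression, and result collection.

```go
// In a real suite this would live in a *_test.go file.
package benchsketch

import (
	"bytes"
	"net/http"
	"os"
	"testing"
)

// replayEvents POSTs a pre-recorded ND-JSON payload to the APM Server intake
// endpoint, the same way an agent would.
func replayEvents(b *testing.B, serverURL, eventsFile string) {
	payload, err := os.ReadFile(eventsFile)
	if err != nil {
		b.Fatal(err)
	}
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		resp, err := http.Post(serverURL+"/intake/v2/events",
			"application/x-ndjson", bytes.NewReader(payload))
		if err != nil {
			b.Fatal(err)
		}
		resp.Body.Close()
	}
}

func BenchmarkAgentGo(b *testing.B) {
	// Placeholder server URL and events file.
	replayEvents(b, "http://localhost:8200", "events/go.ndjson")
}
```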

Benchmark topologies

Additionally, we should look into the benchmark size matrix we'd like to support, for example:

Objectives / Outcomes:

  1. Max throughput: APM Server performance with a high number of APM Agents
  2. Undersized ES: the APM Server should not OOM

| APM Server size | Elasticsearch size | Agent #      | Objective / Outcome |
|-----------------|--------------------|--------------|---------------------|
| 1GB x 1 zone    | 16GB x 2 zones     | Medium (600) | 1                   |
| 8GB x 1 zone    | 16GB x 2 zones     | High (2400)  | 1                   |
| 8GB x 1 zone    | 1GB x 2 zones      | High (2400+) | 2                   |
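One way the matrix above could be consumed by the automation is as plain data driving parameterized benchmark jobs. The sketch below is purely illustrative; the type and field names are hypothetical and not part of any existing tooling.

```go
package main

import "fmt"

// topology describes one row of the benchmark size matrix.
// The field names here are illustrative only.
type topology struct {
	APMServerSize string // e.g. "1GB x 1 zone"
	ESSize        string // e.g. "16GB x 2 zones"
	Agents        int    // number of simulated agents
	Objective     int    // 1 = max throughput, 2 = undersized ES must not OOM
}

var matrix = []topology{
	{"1GB x 1 zone", "16GB x 2 zones", 600, 1},
	{"8GB x 1 zone", "16GB x 2 zones", 2400, 1},
	{"8GB x 1 zone", "1GB x 2 zones", 2400, 2},
}

func main() {
	for _, t := range matrix {
		fmt.Printf("provision %s APM Server + %s ES, run %d agents (objective %d)\n",
			t.APMServerSize, t.ESSize, t.Agents, t.Objective)
	}
}
```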
simitt (Contributor) commented Apr 19, 2022

Building on top of what you already suggested, I'd appreciate having a few more scenarios covered:

  • Test the APM Server's max throughput by sending more events/second than it can process.
    One goal of these tests is also to ensure linear scalability (a small scaling-efficiency check is sketched at the end of this comment).

| APM Server size | Elasticsearch size | Agent #        |
|-----------------|--------------------|----------------|
| 1GB x 1 zone    | 16GB x 2 zones     | Medium (1000)  |
| 4GB x 1 zone    | 16GB x 2 zones     | Large (2500)   |
| 8GB x 1 zone    | 16GB x 3 zones     | Large (2500)   |
| 8GB x 2 zones   | 32GB x 2 zones     | X-Large (5000) |
  • Test the APM Server's behavior at a specific throughput, observing resource usage.

| APM Server size | Elasticsearch size | # events/second |
|-----------------|--------------------|-----------------|
| 1GB x 1 zone    | 16GB x 2 zones     | TBD             |
| 4GB x 1 zone    | 16GB x 2 zones     | TBD             |
| 8GB x 1 zone    | 16GB x 3 zones     | TBD             |

The concrete events/second targets can only be set once the current per-size limits of the APM Server are known; each should be close to the maximum load that size can process.

  • Test the APM Server's behavior when Elasticsearch is undersized, expecting sensible resource usage and response patterns.

| APM Server size | Elasticsearch size | Agent #       |
|-----------------|--------------------|---------------|
| 1GB x 1 zone    | 1GB x 1 zone       | Medium (1000) |
| 8GB x 1 zone    | 1GB x 1 zone       | Large (2500)  |

We might need to tweak these concrete numbers eventually.
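As a side note on the linear-scalability goal above, the check itself is simple arithmetic: events/second per GB of APM Server memory should stay roughly constant across sizes. A minimal sketch, assuming made-up throughput numbers purely as placeholders:

```go
package main

import "fmt"

// result holds the measured peak throughput for one topology.
// The numbers below are placeholders, not real measurements.
type result struct {
	name         string
	serverGB     float64 // total APM Server memory across all zones
	eventsPerSec float64
}

func main() {
	results := []result{
		{"1GB x 1 zone", 1, 8000},
		{"4GB x 1 zone", 4, 30000},
		{"8GB x 1 zone", 8, 58000},
		{"8GB x 2 zones", 16, 110000},
	}
	base := results[0].eventsPerSec / results[0].serverGB
	for _, r := range results {
		perGB := r.eventsPerSec / r.serverGB
		// Efficiency 1.0 means perfectly linear scaling relative to the smallest size.
		fmt.Printf("%-14s %8.0f events/s  %8.0f events/s/GB  efficiency %.2f\n",
			r.name, r.eventsPerSec, perGB, perGB/base)
	}
}
```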

simitt added this to the 8.3 milestone Apr 19, 2022
simitt modified the milestone: 8.3 May 24, 2022
simitt added this to the 8.3 milestone May 24, 2022
simitt (Contributor) commented May 30, 2022

@marclop and @lahsivjar for finishing up the 8.3 work, let's set up the initial benchmarks with the numbers I suggested in #7858 (comment). For now, let's use the events defined in https://github.com/elastic/apm-server/tree/main/systemtest/benchtest/events for ruby, go, python, and nodejs (e.g. 1000 agents overall -> 250 sending ruby events, 250 sending go events, and so on).

Is there anything else that is required for finishing this task? I expect that these concrete cases need to be incorporated into the automation tooling that the engineering productivity team is currently building. Please note it here if any more effort is required on the APM Server team's side.

pazone (Contributor) commented May 31, 2022

For convenience, we could perhaps split the performance and hardware profiles into two .properties files.

simitt modified the milestones: 8.3, 8.4 Jun 3, 2022
simitt mentioned this issue Jun 3, 2022
marclop (Contributor, Author) commented Jun 6, 2022

#8275 added a new benchmark BenchmarkAgentAll which will simulate the scenario defined in #7858 (comment).

marclop (Contributor, Author) commented Jun 9, 2022

@simitt After my benchmarking efforts testing the gomaxprocs changes (#8278 (comment)), I think the number of agents we initially proposed in this issue may be too high. Using more than 64 agents against a 1GB APM Server may be more than it can handle and thus skew the benchmark results, particularly if no -max-rate is used (all agents would be sending at max throughput).

A good rule of thumb seemed to be doubling the agent count for each server size increment, starting at ~64 agents for the smallest size; this yielded close to optimal results with the default server settings. We could also try other base increments, such as 96 or 128. The table could look like:

| Size         | Agents | ES topology |
|--------------|--------|-------------|
| 1g           | 64     | 2x 16gb     |
| 2g           | 128    | 2x 16gb     |
| 4g           | 256    | 2x 32gb     |
| 8g           | 512    | 2x 32gb     |
| 16g (2x 8gb) | 1024   | 2x 64gb     |

Perhaps we can separate the objectives that different numbers of agents serve into different jobs, or at least analyze them differently, since many factors affect the server's throughput, not only the server size.

As we are all aware in the team, the ultimate bottleneck to APM Server performance isn't the APM Server itself, since it can only process events as fast as they flow out of it (with some capacity to absorb peaks in its modelindexer buffer); the real limit is the rate at which Elasticsearch can index the documents we send in our _bulk requests.

For that reason, expecting linear scalability out of the APM Server wouldn't be reasonable without tuning index.number_of_shards as we scale up and out. It is still acceptable to benchmark the performance out of the box with default settings, since many users will run with those, but it may be good to start documenting more openly how users should increase the number of primary shards for some data streams as they start to scale up and out.
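As a hedged illustration of that last point: in recent stack versions the primary shard count for a data stream's new backing indices can typically be raised through a `@custom` component template. Everything concrete below is an assumption made for this example (the traces-apm@custom template name, the shard count of 4, and an unauthenticated Elasticsearch on localhost:9200), not a documented recommendation from this issue.

```go
package main

import (
	"bytes"
	"fmt"
	"log"
	"net/http"
)

func main() {
	// Assumption: the traces data stream composes its settings from a
	// "traces-apm@custom" component template; 4 primary shards is an example value.
	body := []byte(`{
	  "template": {
	    "settings": {
	      "index": { "number_of_shards": 4 }
	    }
	  }
	}`)

	req, err := http.NewRequest(http.MethodPut,
		"http://localhost:9200/_component_template/traces-apm@custom",
		bytes.NewReader(body))
	if err != nil {
		log.Fatal(err)
	}
	req.Header.Set("Content-Type", "application/json")

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()
	// The new shard count only applies to backing indices created after this
	// change, e.g. on the next rollover.
	fmt.Println("status:", resp.Status)
}
```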

marclop (Contributor, Author) commented Jun 15, 2022

Quick update on this. We've moved forward with the different topologies defined in #7858 (comment) and have defined them in https://github.com/elastic/apm-server/tree/main/testing/benchmark/system-profiles. Different configurations outside of the topologies, such as index.number_of_shards, are not something we're looking to explore at the moment. Would it be better to open a new issue with the contents of the comment above and close this one?

The remaining automation work is tracked in #7846.

simitt (Contributor) commented Jun 15, 2022

@marclop your proposal makes sense; we can iterate on and fine-tune the scenarios over time. Please create a follow-up issue with the relevant config options that are currently out of scope of the tests, such as index.number_of_shards. It will be good to revisit these in the future.
This issue can be closed then.
