
# Ingestion Benchmarks

### Tunables tested:

##### Server-side:

- Metrics per batch write inside Blueflood: 100, 200, 300, 500
- Accept threads: 1, 5, 10, 20
- Write worker threads: 10, 25, 50, 75, 100

##### Client-side:

- Batches (i.e., concurrent requests): 1, 5, 10, 15, 20, 30, 40, 50, 75, 100, 125, 150, 200
- Batch size (distinct metrics per concurrent request): 1, 25, 50, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000 (the full sweep grid is sketched after the Results list below)

##### Results:

- metrics/second
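For reference, the two lists above define the sweep grid. A minimal sketch of the enumeration, assuming an exhaustive sweep over every combination (the variable names are illustrative, not taken from the benchmark code):

```python
from itertools import product

BATCH_WRITE_SIZES = [100, 200, 300, 500]        # metrics per batch write inside Blueflood
ACCEPT_THREADS = [1, 5, 10, 20]
WRITE_WORKER_THREADS = [10, 25, 50, 75, 100]
BATCHES = [1, 5, 10, 15, 20, 30, 40, 50, 75, 100, 125, 150, 200]             # concurrent requests
BATCH_SIZES = [1, 25, 50, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000]  # metrics per request

# Server-side settings plus a batch size fix one Cassandra/Blueflood startup;
# all 13 BATCHES values are then benchmarked against that instance.
for write_batch, accept, workers, batch_size in product(
        BATCH_WRITE_SIZES, ACCEPT_THREADS, WRITE_WORKER_THREADS, BATCH_SIZES):
    for concurrent in BATCHES:
        pass  # run one 180-second benchmark (see Procedure below)
```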

### Procedure:

For a given combination of server-side tunables plus a batch-size setting, Cassandra and Blueflood were started once. Blueflood was first fed a very short burst of metrics from a tenant ID used only for warm-up purposes. It was then sent metrics for each of the 13 values of "batches", which are effectively concurrent requests. Each benchmark of a given combination of server- and client-side settings ran for 180 seconds, with the first 60 seconds of data thrown away*. After each benchmark there was a 10-second pause before continuing with the next value of the "batches" setting. On each HTTP POST, one data point is sent for each metric included in the request. A driver implementing this procedure is sketched below.
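A minimal driver sketch, assuming Blueflood's default HTTP ingestion port (19000) and the v2.0 JSON ingest format; the `requests` client, tenant IDs, and metric names are illustrative choices, not part of the original harness:

```python
import time
from concurrent.futures import ThreadPoolExecutor
import requests  # third-party HTTP client; an assumption, not from the original setup

INGEST_URL = "http://localhost:19000/v2.0/{tenant}/ingest"
BATCHES = [1, 5, 10, 15, 20, 30, 40, 50, 75, 100, 125, 150, 200]
BATCH_SIZE = 100  # distinct metrics per request, one of the tested values

def post_once(tenant, metrics_per_request):
    """One HTTP POST carrying one data point for each metric in the request."""
    now_ms = int(time.time() * 1000)
    body = [{"metricName": "bench.metric.%d" % i,
             "metricValue": 1,
             "collectionTime": now_ms,
             "ttlInSeconds": 172800} for i in range(metrics_per_request)]
    requests.post(INGEST_URL.format(tenant=tenant), json=body)

# Warm-up: a very short burst from a tenant ID used only for warm-up.
post_once("warmup-tenant", BATCH_SIZE)

for concurrent_requests in BATCHES:
    with ThreadPoolExecutor(max_workers=concurrent_requests) as pool:
        start = time.time()
        while time.time() - start < 180:            # each benchmark runs 180 s
            futures = [pool.submit(post_once, "bench-tenant", BATCH_SIZE)
                       for _ in range(concurrent_requests)]
            for f in futures:                       # wait for the wave to finish
                f.result()
        # record metrics/second here, discarding the first 60 s of data
    time.sleep(10)                                  # 10 s pause before the next value
```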

Cassandra's commit log and data directory were blown away after each run through the 13 "batches" configurations.
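A sketch of that reset step, assuming stock Cassandra directory paths; the actual paths differ in this setup, where the data directory lived on the server's data disk and the commit log on a Cloud Block Storage SSD (see Setup below):

```python
import shutil

# Hypothetical paths; substitute the real data disk and SSD mount points.
CASSANDRA_DATA_DIR = "/var/lib/cassandra/data"
CASSANDRA_COMMITLOG_DIR = "/var/lib/cassandra/commitlog"

def reset_cassandra():
    """Blow away the commit log and data directory (with Cassandra stopped)."""
    for path in (CASSANDRA_COMMITLOG_DIR, CASSANDRA_DATA_DIR):
        shutil.rmtree(path, ignore_errors=True)
```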

Q: Why throw away the first 60 seconds? Doesn't that mean the results are not realistic?

A: Over time, the total rate (metrics/sec across the full 180 s) and the recent rate (metrics/sec over the last 120 s) converge. For certain configurations, however, 180 seconds is not long enough for that to happen. In more than 70% of runs, total and recent were within 3% of each other.
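In other words, given per-second ingest counts, the two rates and the 3% check look like this (a sketch; function names are illustrative):

```python
def total_and_recent(per_second_counts):
    """per_second_counts: metrics ingested in each of the 180 one-second buckets."""
    total = sum(per_second_counts) / len(per_second_counts)               # full 180 s
    recent = sum(per_second_counts[60:]) / (len(per_second_counts) - 60)  # last 120 s
    return total, recent

def converged(total, recent, tolerance=0.03):
    """True when the two rates agree within 3%, as they did in >70% of runs."""
    return abs(total - recent) <= tolerance * recent
```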

### Conclusions:

The number of accept threads inside Blueflood does not appear to have a strong impact on performance. It may matter more when clients are not using HTTP keep-alive.

As a general rule of thumb: send 100+ metrics per request (one data point per metric per request), have Blueflood do write batches of 200-500+ metrics per write, and keep the total number of metrics in flight (concurrent requests × metrics per request) above 1,000 and below 100,000. Under those conditions you will see more than 50k metrics written per second.
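A small checker for that rule of thumb (illustrative names; the thresholds come straight from the paragraph above):

```python
def fits_rule_of_thumb(concurrent_requests, metrics_per_request, write_batch):
    """Check a configuration against the rule of thumb above."""
    in_flight = concurrent_requests * metrics_per_request
    return (metrics_per_request >= 100
            and write_batch >= 200
            and 1000 < in_flight < 100000)

# The peak configuration reported below qualifies: 15 * 900 = 13,500 in flight.
assert fits_rule_of_thumb(15, 900, 500)
```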

The peak throughput observed was 56,813 metrics/second, with Blueflood write batches of 500, 900 metrics per request, 15 concurrent requests, and 75 write worker threads.

### Notes:

##### Setup:

Testing was done using:

- Rackspace 30GB Performance Cloud Server
- Ubuntu 12.04 LTS
- Cassandra 2.0.4 with JNA libraries available
- Java(TM) SE Runtime Environment (build 1.7.0-b147)
- Java HotSpot(TM) 64-Bit Server VM (build 21.0-b17, mixed mode)
- Cassandra data directory on the 300GB data disk provisioned with the server
- Commit log on a Cloud Block Storage SSD
- Blueflood run with -Xms1G -Xmx8G

Followed (in this order):

##### All the data:

Full data is available at: