Benchmark overview

Build and run

To download dependencies and build the executable jar-file, run:

mvn clean dependency:copy-dependencies package

Then run the benchmark with:

java -jar target/benchmark-1.0-SNAPSHOT.jar "path to benchmark config file"

or use the --default-config flag to generate the default config file in the current working directory:

java -jar target/benchmark-1.0-SNAPSHOT.jar --default-config

When loading the config file, the benchmark checks that the given configuration is valid. If it is not, the benchmark quits with a message describing why the config is invalid.

With 90+ individual configuration options, some invalid setups may have slipped past this validation. In that case the benchmark will hopefully crash rather than continue running in an invalid state, but to be certain you may want to enable assertions (pass the JVM flag -ea, as in java -ea -jar ...) for the first full run of any config file.

Sample benchmark setups

Self-contained run: Configure generation, ingestion, and querying as desired and run the benchmark.

Multiple query-variations on a static dataset: If you want to run multiple different query-variations on the same static dataset, you may want to avoid reinserting all your data between runs. To do this, you'll need 2+ configuration files. Config file #1 needs to do data-insertion and serialize the generated metadata. The query-configs (config #2+) can then deserialize the previously generated metadata and avoid repopulating the database.

If you use this approach, make sure that all the database data is fully compacted/optimized before the first query run, so that later runs are not favored. Database caching of data between runs may also affect the results and should be taken into consideration.
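
As a rough sketch of this multi-file setup, the fragments below show only the settings that would differ from the default config. The key names are taken from the default config listed further down, but how they combine is an assumption: in particular, that enabling serialization together with the generator writes the metadata, and that enabling serialization with the generator disabled reads it back. Check ConfigFile.java before relying on this.

```properties
# Config #1 (assumed): generate and insert the data, serialize the metadata, run no queries.
serialization.enabled = true
serialization.path    = ./bench-out
generator.enabled     = true
queries.enabled       = false
```

```properties
# Config #2+ (assumed): reuse the serialized metadata, skip generation, run queries only.
serialization.enabled = true
serialization.path    = ./bench-out
generator.enabled     = false
queries.enabled       = true
```

Both fragments point serialization.path at the same folder so that the later runs can find what the first run wrote. All other settings, including the query weights and durations that vary between query runs, are left to the individual config files.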

Multiple benchmark processes on different hosts: The benchmark supports running one ingestion-process and N query-processes concurrently on different hosts. Apart from sharing the serialized metadata from an initial generation between the hosts, the only requirement is that the queries.dateinformation setting is non-negative (which is true of the default setup). This ensures that all communication between ingestion and querying goes through the data stored in the target database.
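
A sketch of how the per-host configs might differ in this setup, again using only keys from the default config. The split of the enabled flags between hosts is an assumption rather than documented behavior, and 500 is simply the default value of queries.dateinformation; any non-negative value satisfies the requirement stated above.

```properties
# Ingestion host (assumed): load the shared metadata and ingest only.
serialization.enabled   = true
generator.enabled       = false
ingest.enabled          = true
queries.enabled         = false
```

```properties
# Each query host (assumed): load the shared metadata and query only.
serialization.enabled   = true
generator.enabled       = false
ingest.enabled          = false
queries.enabled         = true
queries.dateinformation = 500
```

The serialized metadata has to be copied to, or otherwise made reachable from, the serialization.path of every host before the processes start.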

Default config

The default config generated by the benchmark is shown below. For documentation of each setting, see ConfigFile.java.

File-/folder-paths and database-credentials must be specified before the config file can be used.

benchmark.rngseed         = 1234
benchmark.schema          = narrow
benchmark.output.csv      = false
benchmark.output.csv.path = ./bench-out
benchmark.output.csv.prefix = NONE
benchmark.output.csv.header = true

serialization.enabled = false
serialization.path    = ./bench-out

generator.enabled = true
generator.input.idmapfile     = FILE PATH
generator.input.datafolder    = FOLDER PATH
generator.input.floorinfofile = FILE PATH
generator.input.floormapfile  = FILE PATH
generator.input.ignorefile    = FILE PATH
generator.input.combinedfile  = FILE PATH
generator.input.separator     = ;

generator.data.generationsamplerate  = 60
generator.data.seedsamplerate        = 60
generator.data.jitter                = 5
generator.data.startdate             = 2019-01-01
generator.data.enddate               = 2019-02-01
generator.data.granularity           = millisecond
generator.data.scalefactor.floor     = 1.0
generator.data.scalefactor.sensors   = 1.0
generator.data.scalefactor.connectedclients = 1.0
generator.output.targets             = influx
generator.output.filepath            = ./bench-out/generator-out.csv

ingest.enabled               = true
ingest.threads               = 1
ingest.target                = influx
ingest.target.recreate       = false
ingest.target.sharedinstance = false
ingest.startdate             = 2019-02-01
ingest.speed                 = -1
ingest.reportfrequency       = -1
ingest.duration.time         = -1
ingest.duration.enddate      = 9999-12-31

queries.enabled               = true
queries.threads               = 1
queries.target.sharedinstance = false
queries.target                = influx
queries.duration.time         = 60
queries.duration.warmup       = -1
queries.duration.count        = -1
queries.weight.floortotals    = 1
queries.weight.totalclients   = 1
queries.weight.maxforap       = 2
queries.weight.avgoccupancy   = 1
queries.weight.kmeans         = 1
queries.earliestdate          = 2019-01-01
queries.range.day             = 0.4
queries.range.week            = 0.7
queries.range.month           = 0.9
queries.range.year            = 0.95
queries.reporting.summaryfrequency = -1
queries.interval.min          = 21600
queries.interval.max          = 604800
queries.interval.min.kmeans   = 21600
queries.interval.max.kmeans   = 86400
queries.kmeans.clusters       = 5
queries.kmeans.iterations     = 10
queries.dateinformation       = 500

influx.url             = localhost:8086
influx.dbname          = benchmark
influx.username        = USERNAME
influx.password        = PASSWORD
influx.table           = generated
influx.batch.flushtime = 1000
influx.batch.size      = 3000

timescale.host                  = localhost:5432
timescale.dbname                = benchmark
timescale.username              = USERNAME
timescale.password              = PASSWORD
timescale.table                 = generated
timescale.rewritebatchedinserts = true
timescale.batchsize             = 3000
timescale.createsecondaryindex  = false

kudu.host                               = localhost:7051,localhost:7151,localhost:7251
kudu.table                              = generated
kudu.maxcolumns                         = 300
kudu.batchsize                          = 3000
kudu.mutationbufferspace                = 6000
kudu.partitioning.type                  = none
kudu.partitioning.hash.buckets          = 4
kudu.partitioning.range.interval        = monthly
kudu.partitioning.range.precreatedyears = 4

debug.createprecomputedtables = false
debug.printallsettings        = false
debug.savequeryresults        = false
debug.savequeryresults.path   = ./bench-out
debug.synchronizerngstate     = false
debug.truncatequerytimestamps = true
debug.reportquerystatus       = false
debug.partitionlockstep       = false
debug.partitionlockstep.timescaledetailed = false