Benchmarking is hard. Benchmarking a process in a reproducible manner is even harder. This repository is a set of tools for benchmarking OpeNER components in such a way that the results can be reproduced without pulling your hair out.
Benchmarking components and generating reports are broken up into two separate stages: the benchmarks write their results to a SQLite3 database, and the reporting tools aggregate the data from that database.

Benchmarking data is versioned (based on the Gem versions) so you can easily see whether performance changes between Gem versions. Data is also stored separately for each run, so the more benchmarks you run, the more accurate the data becomes (in theory).
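If you want to poke at the raw data directly you can open the database with the `sqlite3` gem. The sketch below is only an illustration: the database path and the table/column names are assumptions, so check the migrations in this repository for the actual schema.

```ruby
require 'sqlite3'

# Hypothetical example: the path and schema below are assumptions, see the
# migrations for the real table and column names.
db = SQLite3::Database.new('benchmarks.db')

db.execute('SELECT name, version, language FROM benchmarks LIMIT 10') do |row|
  puts row.inspect
end
```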
- JRuby 1.7 or newer (some of the components require JRuby)
- SQLite3
- Bundler
- gnuplot (for generating graphs)
Assuming you have a local clone of this repository, install the Gems first:
```
bundle install
```
Then set up the database:
```
bundle exec rake db:migrate
```
You can now run a benchmark:
```
bundle exec ./bin/benchmark components/language_identifier_bench.rb
```
For generating graphs you'll need to have gnuplot installed. If you're on OS X you can install it as follows:
```
brew install gnuplot
```
For the various Linux flavours you can use the following:

```
sudo pacman -S gnuplot       # Arch Linux
sudo apt-get install gnuplot # Debian/Ubuntu
sudo yum install gnuplot     # CentOS/Fedora
```
Benchmarks are located in the `benchmark/` directory. In its most basic form a benchmark looks like the following:
```ruby
OpenerBenchmark.benchmark 'benchmark-name' do
  set :version, '...'
  set :language, '...'

  setup do
  end

  bench 'name' do
  end
end
```
The `OpenerBenchmark.benchmark` line registers a new benchmark group with the given name. When benchmarking a component the group name should match the component name ("language-identifier", "tree-tagger", etc).
The `set :version` and `set :language` lines are used to set the component version and the input language. These values are not used in the actual benchmarking loops; instead they are simply added to the database records for reporting purposes.
The `setup do ... end` block can be used to set up variables before any of the benchmarks are executed. This block is only called once, similar to RSpec's `before(:all)` block.
The `bench 'name' do ... end` block is a single benchmark that will be executed. The block's body should only contain benchmarking code, not any setup related code.
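Putting these pieces together, a minimal concrete benchmark might look like the sketch below. The component, fixture path and `run` call are assumptions for illustration rather than code taken from this repository:

```ruby
OpenerBenchmark.benchmark 'language-identifier' do
  set :version, Opener::LanguageIdentifier::VERSION
  set :language, 'en'

  setup do
    # Runs once, before any of the bench blocks are executed.
    @component = Opener::LanguageIdentifier.new
    @input     = File.read('fixtures/example.txt') # hypothetical fixture
  end

  bench 'identify language' do
    # Only the code being measured belongs here.
    @component.run(@input)
  end
end
```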
The block is executed many times, depending on how many iterations fit in the specified runtime (5 seconds by default). Before it is measured, a warmup is performed (2 seconds by default). You can change the runtime and warmup time as follows:
```ruby
set :runtime, 10 # run for 10 seconds
set :warmup, 5   # warm up for 5 seconds
```
You can also add extra job-specific metadata as follows:
```ruby
bench 'some benchmark', :words => 10 do
end
```
Because writing the above for every component and language can be a bit of a pain, there are some helper methods/DSLs that make it easier to write benchmarks for multiple languages. The first step is to use `benchmark_languages` instead of `benchmark`:
```ruby
OpenerBenchmark.benchmark_languages 'benchmark-name' do
end
```
Make sure you don't use `set :language, ...` as this will be done automatically for every language.
If you want to benchmark a component using different word sizes you can use the shared benchmark group `word_sizes`:
```ruby
OpenerBenchmark.benchmark_languages 'benchmark-name' do
  include_shared_benchmark :word_sizes
end
```
A full example (as taken from the tokenizer) looks as follows:
```ruby
require 'benchmark_helper'

OpenerBenchmark.benchmark_languages 'tokenizer' do
  set :version, Opener::Tokenizer::VERSION

  setup do
    steps = [:LanguageIdentifier]

    @component     = Opener::Tokenizer.new(:kaf => true)
    @small_review  = prepare_kaf(:small, steps)
    @medium_review = prepare_kaf(:medium, steps)
    @large_review  = prepare_kaf(:large, steps)
  end

  include_shared_benchmark :word_sizes
end
```
To generate a plain text report of the benchmarking data you can use the Rake tasks defined in the `report` namespace. For example, `rake report:iteration_time` will present average iteration times grouped per benchmark group/name. For more information run `rake -T`.
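Like the other commands in this README, these tasks are typically run through Bundler. For example:

```
bundle exec rake report:iteration_time
```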
Generating graphs is done using a set of Rake tasks and standalone gnuplot scripts. These Rake tasks are defined in the `plot` namespace. For example, to generate a graph of the average iteration times for a benchmark group you can run the following:

```
rake plot:iteration_time[language-identifier]
```
The resulting graph is saved as an SVG file in the `plots/` directory. If you don't have a dedicated SVG viewer you can usually open SVG files in your web browser (e.g. Chrome/Chromium).