Releases: sosy-lab/benchexec
Release 3.7
This is expected to be the last BenchExec release that supports Python 3.5; newer releases will require Python 3.6 or newer.
Please see issue #717 for our plan on dropping support for further Python versions.
We would like to note that Linux kernel 5.11 brings a major improvement for BenchExec users not on Ubuntu:
Now it should be possible to use the overlayfs feature as a regular user, so there is no need to pass --read-only-dir / and similar parameters.
We updated our installation instructions accordingly and also clarified that BenchExec requires x86 or ARM machines and that we recommend Linux kernel 4.14 or newer due to reduced cgroups overhead.
Changes in this release:
- In HTML tables, the following settings are now stored in the hash part of the URL:
  - Column sorting
  - Page size of the table, i.e., how many rows are shown
  - Filters for task names that are defined by entering text into the left-most input field of the filter row of the table.
  Previously this would only work for task-name filters defined in the filter sidebar.
This means that using the back/forward navigation of the browser will change these settings and that they can be present in shared links.
- Fix a few cases of printing of statistics information in HTML tables.
This affects corner cases like the number of visible decimal digits for 0 and trailing zeroes for the standard deviation in the tooltip.
- When a user requests rounding to a certain number of decimal digits, the filtering functionality of the HTML tables will now use the raw values, not the rounded values.
This is consistent with the behavior when rounding is not explicitly requested and BenchExec applies the default rounding rules.
- Fix harmless stack trace printed at the end of benchexec execution in cases of early termination, e.g., if the tool could not be found.
- Some improvements to tool-info modules.
- Several updates of JS libraries, but this should not bring user-visible changes.
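The difference between filtering on raw and on rounded values can be illustrated with a small sketch. This is not BenchExec's actual code; the function name and the sample measurements are hypothetical and only show why the two approaches can select different rows.

```python
def filter_by_range(values, lower, upper):
    """Keep the raw values v with lower <= v <= upper."""
    return [v for v in values if lower <= v <= upper]


# Hypothetical raw CPU-time measurements in seconds.
cpu_times = [0.94, 0.96, 1.04, 1.06]

# Behavior described in the release note: the filter sees the raw values,
# and rounding is applied only when the values are displayed.
selected = filter_by_range(cpu_times, 1.0, 2.0)
shown = [f"{v:.1f}" for v in selected]

# For contrast: filtering on values rounded to one decimal digit would also
# match 0.96, because it is displayed as 1.0.
rounded_match = [v for v in cpu_times if 1.0 <= round(v, 1) <= 2.0]
```

With raw-value filtering, a row is selected only if its measured value lies in the range, independent of how many digits the user chose to display.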
Release 3.6
- One tool-info module improved.
Release 3.5
- One tool-info module improved.
Release 3.4
- BenchExec is now available in a PPA for easy installation on Ubuntu. Just run the following commands:
    sudo add-apt-repository ppa:sosy-lab/benchmarking
    sudo apt install benchexec
- Column filters are now reflected in the URL of HTML tables.
This makes it possible to open a table, configure some filters, and share a link with others that will apply these filters on load.
Furthermore, using the back and forward buttons of the browser will now also update the applied filters.
- Add parameter --initial-table-state to table-generator, which allows defining the default state of HTML tables (e.g., filters, opened tab, etc.).
- Category-specific statistics are shown more often again on the first table tab.
Since BenchExec 3.0 these were removed in some cases where we cannot compute them, but this accidentally removed them in more cases than intended.
- Improved rounding in table-generator.
- SV-COMP scoring schema updated according to the rules of SV-COMP'21.
- Many tool-info modules updated to use the new API from BenchExec 3.3, along with improvements for SV-COMP'21 and Test-Comp'21.
- Improved warnings in certain cases where a benchmark definition does not make sense (e.g., <exclude> tags that do not match anything).
- HTML tables now show a proper error message if the browser is not supported and also a loading message.
- Several smaller bug fixes like avoiding crashes in corner cases.
Release 3.3
- New API for tool-info modules (needed by benchexec for getting information about the benchmarked tool). The new API is defined by class benchexec.tools.template.BaseTool2 and is similar to the old API, but more convenient to use and provides more useful information to the tool-info module.
The old API is still supported and will be removed no sooner than in BenchExec 4.0. We also provide a migration guide.
- A new parameter --tool-directory for benchexec allows specifying the installation directory of the benchmarked tool easily without having to modify PATH or change into the tool's directory.
Note that this only works if the respective tool-info module makes use of the new BaseTool2 API.
- New version 2.0 of the task-definition format for benchexec.
This format allows specifying arbitrary additional information in a key named options, and benchexec will pass everything in this key to the tool-info module, but note that this only works if the respective tool-info module makes use of the new BaseTool2 API.
This is useful to add domain-specific information about tasks, for example in the SV-Benchmarks repository it is used to declare the program language.
BenchExec also still supports version 1.0 of the format.
- table-generator is now defined to work on Windows and we test this in continuous integration.
Previously, it probably was working on Windows most of the time but we did not systematically test this.
- Fix a crash in benchexec for tasks with a property but without a task-definition file.
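As a rough illustration of how a tool-info module might use the new API together with the options key of task-definition format 2.0, consider the following minimal sketch. The tool name "example-verifier", its --language flag, and the option key "language" are hypothetical and chosen only for this example; the migration guide and the documentation of BaseTool2 remain the authoritative reference. The fallback class exists only so that the sketch also runs where BenchExec is not installed.

```python
from types import SimpleNamespace

try:
    from benchexec.tools.template import BaseTool2
except ImportError:
    # Stand-in so that this sketch also runs without BenchExec installed.
    class BaseTool2:
        pass


class Tool(BaseTool2):
    """Sketch of a tool-info module for a hypothetical tool."""

    def executable(self, tool_locator):
        # With the new API, the executable is looked up via a locator object.
        return tool_locator.find_executable("example-verifier")

    def name(self):
        return "Example Verifier"

    def cmdline(self, executable, options, task, rlimits):
        cmd = [executable, *options]
        # task.options holds the content of the "options" key of the
        # task-definition file (None if the key is absent).
        task_options = task.options or {}
        if "language" in task_options:
            # "--language" is a hypothetical flag of the hypothetical tool.
            cmd += ["--language", task_options["language"]]
        return cmd + list(task.input_files_or_identifier)


# Duck-typed stand-in for the task object that benchexec would pass in.
task = SimpleNamespace(
    input_files_or_identifier=["program.c"], options={"language": "C"}
)
command = Tool().cmdline(
    "/opt/example/bin/example-verifier", ["--timeout", "60"], task, None
)
```

In a real benchmark run, benchexec constructs the task object itself, so a module only needs to define the class.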
Release 3.2
- The HTML tables produced by table-generator now provide a score-based quantile plot in addition to the regular quantile plot if scores are used. If available, it is shown by default on the tab for quantile plots.
Score-based quantile plots are for example used by SV-COMP to visualize results.
- Better axis labels in scatter plot of HTML tables.
- More auxiliary lines available in scatter plot of HTML tables.
- New tool-info module added.
Bug fixes:
- Fix crash in benchexec if a non-SV-COMP property was used.
- Fix for empty property files being treated as SV-COMP properties.
- Fix unnecessarily large I/O for text file with results of benchexec during benchmarking. The .results.txt file is now written incrementally.
- Fix incorrect handling of <withoutfile> tasks if the tool-info module declared a non-standard working directory.
- Small fix for the new filter overlay in the HTML tables when the first run set has no filter.
Release 3.1
- Fix our benchexec.check_cgroups installation check, which showed invalid warnings since BenchExec 2.7.
- Improve handling of inaccessible mountpoints in containers.
This should make it possible to use nested containers on most systems using the default arguments (e.g., no need for --hidden-dir /sys).
- Improved row filters of HTML tables (thanks to @DennisSimon).
In addition to filtering via drop-down fields in the table header, it is now also possible to define filters on a separate overlay, which can be opened from all tabs via a button in the top-right corner (e.g., also while looking at plots).
The filters for status and category in the filter overlay are more flexible because several values can be selected for status and category. This allows defining filters like category = "correct" AND (status = "false" OR status = "false(unreach-call)").
Furthermore, the filter overlay allows filtering the parts of the task id (left-most column) individually and makes it easier to define filters with numeric ranges.
- Redesigned UI for changing the plot settings of quantile and scatter plots in the HTML tables (thanks to @lachnerm).
- Hiding columns in HTML tables is now reflected in the URL.
This makes it possible to create links to tables that hide columns.
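The semantics of the example filter from the row-filter item above (several values per field combined with OR, the fields themselves combined with AND) can be sketched as follows. The row data and the function name are hypothetical and serve only to illustrate the logic; this is not the code of the filter overlay.

```python
def matches(row, categories, statuses):
    """Return True if the row's category is among the selected categories
    AND its status is among the selected statuses."""
    return row["category"] in categories and row["status"] in statuses


# Hypothetical table rows; only the two filtered fields are modeled.
rows = [
    {"task": "a.yml", "category": "correct", "status": "false"},
    {"task": "b.yml", "category": "correct", "status": "true"},
    {"task": "c.yml", "category": "wrong", "status": "false(unreach-call)"},
    {"task": "d.yml", "category": "correct", "status": "false(unreach-call)"},
]

# category = "correct" AND (status = "false" OR status = "false(unreach-call)")
selected = [
    r["task"]
    for r in rows
    if matches(r, {"correct"}, {"false", "false(unreach-call)"})
]
```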
Release 3.0
This release contains only one new feature compared to BenchExec 2.7:
- Tables produced by table-generator now show the expected verdict of each task, if it is known and it is not the same for all rows.
However, there are several deprecated features removed and other backwards-incompatible changes to make BenchExec more consistent and user-friendly:
- Support for Python 2.7 and 3.4 is removed; the minimal Python version is now 3.5 for all components of BenchExec.
We plan to remove support for Python 3.5 after Ubuntu 16.04 goes out of support in 2021.
- If a tool-info module returns UNKNOWN for a run result, BenchExec will no longer overwrite that if it thinks the tool terminated abnormally. It will continue to do so if ERROR is returned.
- Result values named cpuenergy-pkg[0-9]+ are renamed to cpuenergy-pkg[0-9]+-package because these are not a sum of all the other CPU-energy measurements.
- Names of result files produced by benchexec now contain timestamps with seconds in order to avoid problems when starting benchexec in quick succession.
- Support for generating the old-style static HTML tables (with table-generator --static-table) is removed.
Only the modern tables that are available since BenchExec 2.3 and CSV tables can be generated.
- More metadata are stored in result files of benchexec, so table-generator no longer needs access to the task-definition files, and changes to the expected verdict that are made after benchmarking will not be reflected in tables.
- The Python library Tempita is no longer a dependency of BenchExec.
- We do not create and distribute .egg packages for BenchExec releases anymore, only the more modern .whl packages, as well as Debian/Ubuntu packages and Tar archives.
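Scripts that post-process results from before BenchExec 3.0 may need to account for the renamed CPU-energy values. A minimal sketch of such a mapping, using the name pattern from the list above, could look like this. The helper name and the sample column names are hypothetical, and the helper is not part of BenchExec.

```python
import re

# Pattern of the old result names as given in the release note.
OLD_NAME = re.compile(r"cpuenergy-pkg[0-9]+")


def migrate_name(name):
    """Append "-package" to names that exactly match the old pattern."""
    return name + "-package" if OLD_NAME.fullmatch(name) else name


# Hypothetical column names from an old result file.
old_columns = ["cputime", "cpuenergy-pkg0", "cpuenergy-pkg0-core", "cpuenergy-pkg12"]
new_columns = [migrate_name(c) for c in old_columns]
```

Using fullmatch instead of a prefix match leaves names that merely start with the old pattern untouched.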
Furthermore, BenchExec no longer contains hard-coded knowledge about any specific property, all properties are treated in the same way.
(The only exception is that score computation is enabled for SV-COMP properties.)
This simplification implies several more changes:
- For checking expected verdicts and computing scores it is now required that task-definition files are used.
Expected verdicts encoded in the task name are no longer supported.
- Tool-info modules need to return results true or false; the results sat and unsat are no longer supported (these were allowed only for the property SATISFIABLE).
- There is no special handling for composite properties like SV-COMP's property for memory safety anymore.
Previously this property would be represented as a collection of its subproperties; now it is treated as one property.
Task-definition files can still contain a violated subproperty, and benchexec will continue to use this information for checking the tool result, but this does not depend on which property is used.
- Score computation is fixed for tables where property files have uncommon names.
The name of property files is now no longer relevant (as it should have been).
Because of this, table-generator needs to have access to the property files that were used during benchmarking.
Release 2.7
- The supplied file benchexec-cgroup.service for cgroup configuration on systems with systemd now works with systemd 240 or newer (e.g., on Ubuntu 20.04).
This also affects the Debian package of BenchExec.
- Error messages about failed cgroup access were improved.
- Buttons below plots in the HTML table do not need to be clicked twice.
- Directly opening the quantile tab of HTML tables via the URL works now.
- First line of logs shown in overlay of HTML tables is selectable again.
Release 2.6
This release brings several improvements for the new kind of HTML tables produced by table-generator, in particular:
- Add hash routing, i.e., the possibility to navigate to certain parts of the application directly by adding a suffix to the URL. For example, opening ...table.html#/table will directly open the table. While navigating through the application, the URL automatically adjusts. This also means that it is possible to use the "Back" button of the browser for going back to previously opened tabs or for closing an overlay window.
Thanks @DennisSimon for this!
- Make references to files in task-definition files clickable.
When clicking on a cell in the first column of the table, it shows the task-definition file in an overlay.
Now the file's YAML content is parsed and links to input files are added.
Thanks @lachnerm for this!
- Fix filtering of negative values in half-open intervals.
- More tooltips and hover effects on table headers to improve usability.
- The table tab now appropriately adjusts if the browser window is resized.
- Fix legend of quantile plot if some columns are empty/missing, and show disabled columns in gray.
- Fix scatter plot if not all data points have valid values.
- Fix layout of column-selection dialog in case not all columns are present for all run sets.
- Fix scrolling behavior of close button of overlay windows.
- In case the property is the same for all tasks of a table, it was not shown so far in the table. Now we show it on the summary tab.
- Improve position of scroll bars across all tabs.
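Hash routing as described in the list above keeps the route in the fragment (hash part) of the URL. A script that generates or checks links to tables could extract the route as sketched below. The file path is hypothetical and the helper is not part of BenchExec.

```python
from urllib.parse import urlsplit


def route_of(url):
    """Return the hash part of a URL (without the leading '#')."""
    return urlsplit(url).fragment


# Hypothetical location of a generated table.
table_url = "file:///tmp/results.table.html"

route = route_of(table_url + "#/table")
no_route = route_of(table_url)
```

Because the route lives in the fragment, the same HTML file is loaded regardless of which route a link points to.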
There are also a few changes in other parts of BenchExec:
- Fix mount problems in container mode if mount points with unusual characters (like :) or bind mounts over files exist. The latter is for example relevant when nesting containers (inside another BenchExec or Docker container).
- Several new tool-info modules and small improvements to existing ones.
- runexec now creates parent directories of output files if necessary.
- table-generator now works if environment variable LANG is missing.
- table-generator should now work on Windows.
- It is possible to turn off colored output on stdout by setting the environment variable NO_COLOR (cf. https://no-color.org/).
- In the contrib folder, we now provide a script for generating task-definition files in YAML format for old-style tasks.
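For wrapper scripts that print their own colored messages around BenchExec, the NO_COLOR convention can be honored in a similar way. The following is a minimal sketch assuming the convention described at https://no-color.org/ (color is disabled if the variable is set to a non-empty value); the function names are hypothetical and BenchExec's own check may differ in detail.

```python
import os


def use_color(environ=None):
    """Return False if NO_COLOR is set to a non-empty value."""
    environ = os.environ if environ is None else environ
    return not environ.get("NO_COLOR")


def colorize(text, environ=None):
    """Wrap text in ANSI escape codes for red unless color is disabled."""
    if use_color(environ):
        return f"\033[31m{text}\033[0m"
    return text


plain = colorize("ERROR", {"NO_COLOR": "1"})
colored = colorize("ERROR", {})
```

Passing the environment as a parameter keeps the check easy to test without modifying os.environ.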