`opencrs_dataset` is the dataset produced by running OpenCRS's dataset module with all integrated test suites. It contains 54586 vulnerable ELF executables, compiled from C sources and targeting the 32-bit i386 architecture.
opencrs_dataset         root folder
├── executables         folder with all executables
│   └── ...
├── index.csv           labels for each executable, with
│                       parent dataset and its CWEs,
│                       separated by commas when there are several
└── README.md           this file
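To spot-check that a file in the executables folder really is a 32-bit i386 ELF, its header can be inspected directly. The following is a minimal sketch, not part of OpenCRS: the function name `is_i386_elf32` is hypothetical, and the little-endian check reflects the standard i386 ELF layout rather than anything stated in this README.

```python
import struct

ELF_MAGIC = b"\x7fELF"
ELFCLASS32 = 1   # e_ident[EI_CLASS] value for 32-bit objects
ELFDATA2LSB = 1  # e_ident[EI_DATA] value for little-endian encoding
EM_386 = 3       # e_machine value for Intel 80386


def is_i386_elf32(header: bytes) -> bool:
    """Return True if the first 20 bytes describe a 32-bit i386 ELF file."""
    if len(header) < 20 or header[:4] != ELF_MAGIC:
        return False
    if header[4] != ELFCLASS32 or header[5] != ELFDATA2LSB:
        return False
    # e_ident occupies 16 bytes; e_type follows at offset 16 and
    # e_machine at offset 18, both as 2-byte little-endian integers.
    (e_machine,) = struct.unpack_from("<H", header, 18)
    return e_machine == EM_386
```

To use it, read the first 20 bytes of an executable opened in binary mode and pass them to the function.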
Identifier | Test Suite Name | Creator | Initial Sources Count | Final Executables Count |
---|---|---|---|---|
`nist_juliet` | Juliet C/C++ 1.3 | National Security Agency's Center for Assured Software. National Institute of Standards and Technology | 64123 | 54531 |
`nist_c_test_suite` | C Test Suite for Source Code Analyzer v2 - Vulnerable | Alexander Hoole. National Institute of Standards and Technology | 54 | 50 |
`toy_test_suite` | Toy Test Suite | OpenCRS | 5 | 5 |
Weakness | Count |
---|---|
Stack-based Buffer Overflow | 13834 |
Heap-based Buffer Overflow | 11088 |
Integer Overflow or Wraparound | 3960 |
Mismatched Memory Management Routines | 3564 |
Integer Underflow | 2952 |
Free of Memory not on the Heap | 2680 |
Use of Externally-Controlled Format String | 2407 |
Buffer Underflow | 2048 |
Buffer Under-read | 2048 |
OS Command Injection | 1921 |
The columns present in the `index.csv` file are the following:
- `name`: Unique identifier of a vulnerable program. It is used to determine the executable file path, namely by using the format `executables/<name>.elf`;
- `cwes`: One or more CWEs that are present in the executable; and
- `parent_dataset`: Parent dataset's identifier.
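The columns described above could be consumed as in the following minimal sketch. It assumes that `index.csv` has a header row with exactly these column names and that multiple CWEs share one quoted, comma-separated field; neither detail is confirmed here. The helper names `load_index` and `count_cwes` are hypothetical.

```python
import csv
from collections import Counter
from pathlib import Path


def load_index(csv_file, root="."):
    """Parse index.csv rows into dicts with a resolved path and a CWE list.

    Assumes a header row naming the columns name, cwes and parent_dataset.
    """
    records = []
    for row in csv.DictReader(csv_file):
        records.append({
            "name": row["name"],
            # Documented path format: executables/<name>.elf
            "path": Path(root) / "executables" / f"{row['name']}.elf",
            # Assumption: several CWEs are comma-separated in one field.
            "cwes": [c.strip() for c in row["cwes"].split(",") if c.strip()],
            "parent_dataset": row["parent_dataset"],
        })
    return records


def count_cwes(records):
    """Count how many executables carry each CWE label."""
    return Counter(cwe for rec in records for cwe in rec["cwes"])
```

Calling `load_index` on a file object opened with `open("index.csv", newline="")` yields one record per executable, and `count_cwes(records).most_common()` gives a per-weakness tally comparable to the table above.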
- Set up OpenCRS's `dataset` module on an Ubuntu 20.04 host by following the guide.
- Build each test suite (identified by `<test_suite_id>`): `poetry run dataset build --testsuite <test_suite_id>`.
- The executables in this repository, under the `executables` folder, are those from `dataset`'s `executables`. The same relation applies for `index.csv`, which is `dataset`'s `vulnerables.csv` without the last column.
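The last two steps can be sketched in Python as follows. This is an illustration under assumptions, not OpenCRS's own tooling: `build_command` and `drop_last_column` are hypothetical helpers, the test suite identifiers are taken from the table above, and the command is only assembled here, not executed.

```python
import csv

# Identifiers listed in the test suite table of this README.
TEST_SUITE_IDS = ("nist_juliet", "nist_c_test_suite", "toy_test_suite")


def build_command(test_suite_id):
    """Return the documented build command as an argument list."""
    return ["poetry", "run", "dataset", "build", "--testsuite", test_suite_id]


def drop_last_column(src, dst):
    """Copy CSV rows from src to dst, omitting each row's final field."""
    writer = csv.writer(dst, lineterminator="\n")
    for row in csv.reader(src):
        writer.writerow(row[:-1])
```

Each argument list could be passed to `subprocess.run` from inside a configured OpenCRS checkout, and `drop_last_column` applied to `vulnerables.csv` would produce a file with the same columns as `index.csv`.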