A description of the dataset and its possible applications (beyond fueling the vulnerability assessment tool) can be found in:
Serena E. Ponta, Henrik Plate, Antonino Sabetta, Michele Bezzi, Cédric Dangremont, A Manually-Curated Dataset of Fixes to Vulnerabilities of Open-Source Software
If you use this dataset, please cite it as:
@inproceedings{ponta2019msr,
author={Serena E. Ponta and Henrik Plate and Antonino Sabetta and Michele Bezzi and C{\'e}dric Dangremont},
title={A Manually-Curated Dataset of Fixes to Vulnerabilities of Open-Source Software},
booktitle={Proceedings of the 16th International Conference on Mining Software Repositories},
year=2019,
month=May,
}
The Jupyter notebook used to analyze the dataset and to produce the statistics and the plots shown in the paper can be found here.
The paper A Practical Approach to the Automatic Classification of Security-Relevant Commits uses this dataset to train a classifier that detects security-relevant commits (i.e., commits that are likely to fix a vulnerability).