MSR 2019 Data Showcase

A description of the dataset and its possible applications (in addition to fueling the vulnerability assessment tool) can be found in:

Serena E. Ponta, Henrik Plate, Antonino Sabetta, Michele Bezzi, Cédric Dangremont, A Manually-Curated Dataset of Fixes to Vulnerabilities of Open-Source Software

If you use this dataset, please cite it as:

@inproceedings{ponta2019msr,
    author={Serena E. Ponta and Henrik Plate and Antonino Sabetta and Michele Bezzi and
    C{\'e}dric Dangremont},
    title={A Manually-Curated Dataset of Fixes to Vulnerabilities of Open-Source Software},
    booktitle={Proceedings of the 16th International Conference on Mining Software Repositories},
    year=2019,
    month=May,
}

The Jupyter notebook used to analyze the dataset and to produce the statistics and the plots shown in the paper can be found here.

Sample applications

Automated classification of security-relevant commits in open-source repositories

The paper A Practical Approach to the Automatic Classification of Security-Relevant Commits uses this dataset to train a classifier that detects security-relevant commits (i.e., commits that are likely to fix a vulnerability).