MSR 2019 Data Showcase

A description of the dataset and its possible applications (on top of fueling the vulnerability assessment tool) can be found in:

Serena E. Ponta, Henrik Plate, Antonino Sabetta, Michele Bezzi, Cédric Dangremont, A Manually-Curated Dataset of Fixes to Vulnerabilities of Open-Source Software

If you use this dataset, please cite it as:

@inproceedings{ponta2019msr,
    author={Serena E. Ponta and Henrik Plate and Antonino Sabetta and Michele Bezzi and C{\'e}dric Dangremont},
    title={A Manually-Curated Dataset of Fixes to Vulnerabilities of Open-Source Software},
    booktitle={Proceedings of the 16th International Conference on Mining Software Repositories},
    year=2019,
    month=May,
}

The Jupyter notebook used to analyze the dataset and to produce the statistics and the plots shown in the paper can be found here.

Sample applications

Automated classification of security-relevant commits in open-source repositories

The paper "A Practical Approach to the Automatic Classification of Security-Relevant Commits" uses this dataset to train a classifier that detects security-relevant commits (i.e., commits that are likely to fix a vulnerability).
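To give a rough idea of what classifying commits as security-relevant can look like, here is a minimal sketch. It is not the method from the paper: it swaps in a toy multinomial naive Bayes over commit-message tokens, written with the Python standard library only. The function names (`tokenize`, `train`, `predict`), the labels `security` and `other`, and the training messages are all hypothetical and are not taken from the dataset.

```python
import math
import re
from collections import Counter


def tokenize(message):
    """Lowercase a commit message and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", message.lower())


def train(labeled_messages):
    """Count tokens per class from a list of (message, label) pairs."""
    token_counts = {}
    class_counts = Counter()
    for message, label in labeled_messages:
        class_counts[label] += 1
        token_counts.setdefault(label, Counter()).update(tokenize(message))
    return token_counts, class_counts


def predict(model, message):
    """Return the label with the highest naive Bayes log-score."""
    token_counts, class_counts = model
    vocabulary = set()
    for counts in token_counts.values():
        vocabulary.update(counts)
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, counts in token_counts.items():
        # Class prior plus add-one smoothed token likelihoods.
        score = math.log(class_counts[label] / total_docs)
        denominator = sum(counts.values()) + len(vocabulary)
        for token in tokenize(message):
            if token in vocabulary:  # tokens never seen in training are ignored
                score += math.log((counts[token] + 1) / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Made-up training examples (assumption: NOT taken from the dataset).
TOY_COMMITS = [
    ("Fix XXE vulnerability in XML parser", "security"),
    ("Sanitize user input to prevent path traversal", "security"),
    ("Prevent deserialization of untrusted data", "security"),
    ("Update README with build instructions", "other"),
    ("Refactor logging configuration", "other"),
    ("Bump version to 2.1.0", "other"),
]

model = train(TOY_COMMITS)
print(predict(model, "Fix path traversal in file upload"))
print(predict(model, "Update build instructions"))
```

In a realistic setting the toy list would be replaced by commits labeled with the help of the dataset, and the features would likely go beyond the commit message alone.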
