Skip to content

Latest commit

 

History

History
 
 

data

Folders and files

NameName
Last commit message
Last commit date

parent directory

..
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

MUBench Logo

MUBench : Dataset

The MUBench Dataset references projects with known API misuses. Each subfolder of this directory identifies one project. For each project, the dataset references one or more project versions that contain the known misuses (usually the version immediately before a particular misuse was fixed). The dataset also specifies the misuses themselves and links misuses and project versions.

When running experiments you may use the qualified project, version, or misuse ids to select particular datapoints for your experiment. The project id is the name of the respective subfolder in this directory. The qualified version id has the form <project-id>.<version-id>, where the version id is the name of the respective directory in <project-id>/versions/. The qualified misuse id has the form <project-id>.<misuse-id>, where the misuse id is the name of the respective directory in <project-id>/misuses/.

You may also use qualified project, version, or misuse ids to specify sub-dataset in datasets.yml, to conveniently run experiments on certain subsets of the entire dataset.

Statistics

The current dataset contains 209 misuses from 108 project versions of 50 projects. It contains 162 misuses in 47 compilable project versions of 25 projects. It contains 64 misuses with corresponding crafted examples of correct usage, 39 from 29 compilable versions of 13 projects and 25 hand-crafted examples.

Manually Collected:

Identified by API-Misuse Detectors:

ICSE 18

Total: 132 misuses total, 85 misuses in compilable project versions

Experiment 1: 64 misuses, 39 from 29 versions of 13 projects and 25 hand-crafted examples

Experiment 2: 5 projects, 7 previously-unknown misuses identified in the MUDetect's top-20 findings

Experiment 3: 85 misuses, 60 from 29 versions of 13 projects and 25 hand-crafted examples

TSE 17

Total: 125 misuses total, 78 misuses in compilable project versions

Experiment 1: 64 misuses, 39 from 29 versions of 13 projects and 25 hand-crafted examples

Experiment 2: 5 projects, 14 previously-unknown misuses identified in the detectors' top-20 findings

Experiment 3: 53 misuses from 29 versions of 13 projects (no hand-crafted examples)

Contribute

To contribute to the MUBench Dataset, please contact Sven Amann with details about the misuses. For each misuse, please try to provide

  • A description of the misuse (and its fix).
  • A link to the website of the project you found the misuse in.
  • A link to the project's publicly-readable version-control system, and a commit id to a version with the misuse or, ideally, to the commit that fixes the misuse.
  • The misuse's location (file, method, and misused API).
  • Instructions on how to compile the project in the respective version.