Repository of the key data sets used in Reading The Bazaar: Comparing Approaches in the Classification of Linux C Function Parameter Mutability

This repository contains the two primary data sets used to investigate and compare approaches to labeling the immutability of
Linux C function parameters. Code and miscellaneous data associated with data set generation can be found at this GitHub repository. Code associated with machine learning modeling can be found at this GitHub repository. Both repositories are
unorganized. The human labeling interface can be found at this GitHub repository.

The General Data Set

The general data set is a scraped list of 125k Linux function parameters from the kernel data set. Each entry includes a range of information about the parameter,
including the body of its parent function. Because the data set is large (110MB), it is split into two files for storage.
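
Since the data set is stored in two parts, a reader will usually want to recombine them before analysis. The following is a minimal sketch, not the authors' tooling. It assumes that the two parts are the files `labeled_body_shuffled_ds_0.csv` and `labeled_body_shuffled_ds_1.csv` from this repository's file listing, that both files start with the same header row, and that they are UTF-8 encoded. The helper name `load_parts` is hypothetical.

```python
import csv
from pathlib import Path

# File names taken from the repository listing. Treating them as the two
# halves of the general data set is an assumption, not a documented fact.
GENERAL_PARTS = (
    "labeled_body_shuffled_ds_0.csv",
    "labeled_body_shuffled_ds_1.csv",
)

# Function bodies can be long, so raise the csv module's per-field size limit.
csv.field_size_limit(2**31 - 1)


def load_parts(directory, part_names=GENERAL_PARTS):
    """Read each part in order and return all rows as one list of dicts.

    Assumes every part begins with a header row (hypothetical layout).
    """
    rows = []
    for name in part_names:
        # newline="" lets the csv module handle line breaks inside quoted
        # fields, which matters for multi-line function bodies.
        with open(Path(directory) / name, newline="", encoding="utf-8") as handle:
            rows.extend(csv.DictReader(handle))
    return rows
```

Calling `load_parts(".")` from a checkout of the repository would return every row of both parts as dictionaries keyed by whatever column names the files actually use. The sketch deliberately makes no assumption about those names.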

The Human Data Set

This data set consists of the 48 parameters shown to human survey respondents for mutability classification.

Files

LICENSE
README.md
human_labeling_set.csv
labeled_body_shuffled_ds_0.csv
labeled_body_shuffled_ds_1.csv

Repository: mjgaughan/bz_data
