Analysis of the relationship between errors in code, using the Code4Bench dataset
Code4Bench is available for download at http://doi.org/10.5281/zenodo.2582968
We aim to perform fault localization with the help of a network. This requires first extracting the necessary information from the program, and then analyzing the resulting network with a network analysis tool such as Gephi.
The information to extract from the program for constructing the network is:
- Node type 1: Test cases
- Node type 2: Program building blocks
- Edge type 1: Program coverage (source: a test case; destination: a building block covered by that test case)
- Edge type 2: Data dependency among building blocks (both source and destination are building blocks)
- Edge type 3: Control dependency among building blocks (both source and destination are building blocks)
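
The node and edge types above can be sketched as a small typed edge list and written out as two CSV tables. This is a minimal illustration, not the project's actual extraction pipeline: the test case and building block names (`t1`, `b1`, ...) are made up, and the `node_type` / `edge_type` column names are assumptions. The `Id`, `Label`, `Source`, `Target` and `Type` columns follow the layout that Gephi's spreadsheet importer expects.

```python
import csv
import io

# Hypothetical example data; not taken from Code4Bench.
coverage = {                      # test case -> building blocks it covers
    "t1": ["b1", "b2"],
    "t2": ["b2", "b3"],
}
data_deps = [("b1", "b2")]        # (source block, destination block)
control_deps = [("b2", "b3")]     # (source block, destination block)


def build_network(coverage, data_deps, control_deps):
    """Return (nodes, edges) with the two node types and three edge types."""
    nodes = {}   # node id -> node type
    edges = []   # (source, target, edge type)
    for test, blocks in coverage.items():
        nodes[test] = "test_case"
        for block in blocks:
            nodes[block] = "building_block"
            edges.append((test, block, "coverage"))
    typed_deps = (("data_dependency", data_deps),
                  ("control_dependency", control_deps))
    for kind, deps in typed_deps:
        for src, dst in deps:
            nodes.setdefault(src, "building_block")
            nodes.setdefault(dst, "building_block")
            edges.append((src, dst, kind))
    return nodes, edges


def to_csv(header, rows):
    """Serialize a header and rows to CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buf.getvalue()


nodes, edges = build_network(coverage, data_deps, control_deps)
nodes_csv = to_csv(["Id", "Label", "node_type"],
                   [(n, n, t) for n, t in sorted(nodes.items())])
edges_csv = to_csv(["Source", "Target", "Type", "edge_type"],
                   [(s, d, "Directed", k) for s, d, k in edges])
print(nodes_csv)
print(edges_csv)
```

Keeping the edge type as a column means the three kinds of edges can be filtered or colored separately once the tables are loaded into a network tool.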
- For the given programs, obtain the necessary information, then analyze it using the available network metrics (such as degree centrality and betweenness centrality) to interpret what each metric means for fault localization and how the available data can be used for fault localization.
- Identify which of the three edge types is most valuable for fault localization.

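
As a starting point for comparing the edge types, one of the metrics mentioned above, degree centrality, can be computed separately for each edge type. The sketch below uses a made-up edge list and a hypothetical helper `in_degree_centrality`; it normalizes in-degree by `n - 1` (the usual degree centrality normalization) and says nothing by itself about which edge type is more useful, since that requires comparing the scores against known fault locations.

```python
from collections import defaultdict

# Hypothetical edge list (source, target, edge type); not real Code4Bench data.
edges = [
    ("t1", "b1", "coverage"),
    ("t1", "b2", "coverage"),
    ("t2", "b2", "coverage"),
    ("t2", "b3", "coverage"),
    ("b1", "b2", "data_dependency"),
    ("b2", "b3", "control_dependency"),
]


def in_degree_centrality(edges, edge_type=None):
    """In-degree of each node divided by (n - 1).

    If edge_type is given, only edges of that type are counted;
    n is always the number of nodes in the whole network.
    """
    nodes = {n for src, dst, _ in edges for n in (src, dst)}
    in_degree = defaultdict(int)
    for _, dst, kind in edges:
        if edge_type is None or kind == edge_type:
            in_degree[dst] += 1
    scale = max(len(nodes) - 1, 1)   # avoid division by zero for tiny graphs
    return {n: in_degree[n] / scale for n in sorted(nodes)}


for kind in ("coverage", "data_dependency", "control_dependency"):
    print(kind, in_degree_centrality(edges, kind))
print("all", in_degree_centrality(edges))
```

The same per-type filtering could be applied to other metrics, such as betweenness centrality, using a graph library or Gephi's built-in statistics.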
- Download and unzip the file from the URL above
- Install MySQL version 5.7
- Create a database named “code4bench”
- In MySQL Workbench:
  a. Select Server -> Data Import
  b. Select the extracted folder
  c. Click Start Import (this may take a while) and wait for the import to finish
The schema of Code4Bench is drawn below
The number of submissions for each programming language is listed below
ID | Language | Submission Count |
---|---|---|
1 | GNU C++ 14 | 604,155 |
2 | GNU C | 93,492 |
3 | MS C++ | 164,912 |
4 | GNU C++ 11 | 906,811 |
5 | FPC | 47,522 |
6 | GNU C++ | 1,167,214 |
7 | Java 8 | 154,087 |
8 | Python 3 | 52,433 |
9 | Go | 3,011 |
10 | D | 742 |
11 | MS C# | 14,896 |
12 | GNU C 11 | 18,574 |
13 | Python 2 | 36,469 |
14 | PyPy 2 | 4,507 |
15 | Ruby | 3,806 |
16 | PHP | 2,570 |
17 | PyPy 3 | 3,222 |
18 | Delphi | 9,698 |
19 | Kotlin | 4,739 |
20 | JavaScript | 3,020 |
21 | Haskell | 3,585 |
22 | OCaml | 543 |
23 | Scala | 2,131 |
24 | Mono C# | 5,199 |
25 | Java 7 | 27,931 |
26 | Rust | 599 |
27 | Perl | 784 |
28 | GNU C++ 11 | 1,083 |
29 | Java 8 ZIP | 107 |
30 | J | 2,673 |
31 | GNU C++ 0X | 34,746 |
32 | Java 6 | 22,988 |
33 | Pike | 4,076 |
34 | Befunge | 4,343 |
35 | Cobol | 2,114 |
36 | Factor | 2,606 |
37 | Secret-171 | 158 |
38 | Roco | 3,136 |
39 | Tcl | 3,752 |
40 | F# | 15 |
41 | Io | 2,908 |