Chaiyong Ragkhitwetsagul edited this page Mar 16, 2018 · 7 revisions

Welcome to the Siamese development wiki!

To find the optimal n-gram size

  • Use Bellon's data set to measure the mean reciprocal rank (MRR) for each n-gram size n in [1, 20], and pick the n that gives the best MRR.
  • Bellon's clone pairs are created by the BellonClonesReader project:
File: bellon_benchmark/eclipse-jdtcore.rcf
Version: 1
Pairs: 1345
File: bellon_benchmark/netbeans-javadoc.rcf
Version: 1
Pairs: 55
File: bellon_benchmark/java-swing.rcf
Version: 1
Pairs: 777
Done processing 2177 clone pairs.
Selected 1297 clone pairs.
  • Siamese then indexes the three projects (j2sdk1.4.0-javax-swing, eclipse-jdtcore, netbeans-javadoc).
  • The search uses 255 filtered type-3 clone pairs (>= 10 lines).
  • To start the experiment, run:
./script/bellon.sh
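
For reference, the selection step described above can be sketched as follows. This is a minimal illustration of how MRR is commonly computed and how the best n could be picked, not the code that `bellon.sh` actually runs. The function names (`reciprocal_rank`, `mean_reciprocal_rank`, `best_n`) and the input shapes are hypothetical, and it assumes each query has exactly one expected clone in the ranked results.

```python
from typing import Dict, List, Sequence, Tuple


def reciprocal_rank(ranked_results: Sequence[str], relevant: str) -> float:
    """Return 1/rank of the first occurrence of `relevant`, or 0.0 if absent."""
    for rank, result in enumerate(ranked_results, start=1):
        if result == relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(queries: Sequence[Tuple[Sequence[str], str]]) -> float:
    """Average the reciprocal rank over (ranked_results, relevant) pairs."""
    if not queries:
        return 0.0
    total = sum(reciprocal_rank(results, relevant) for results, relevant in queries)
    return total / len(queries)


def best_n(mrr_by_n: Dict[int, float]) -> int:
    """Return the n-gram size with the highest MRR (smallest n wins ties)."""
    if not mrr_by_n:
        raise ValueError("mrr_by_n must not be empty")
    # Sort key: highest MRR first, then smallest n.
    return min(mrr_by_n, key=lambda n: (-mrr_by_n[n], n))


if __name__ == "__main__":
    # Toy data only (hypothetical identifiers, not Bellon results).
    toy_queries: List[Tuple[List[str], str]] = [
        (["A.java", "B.java"], "A.java"),  # hit at rank 1
        (["A.java", "B.java"], "B.java"),  # hit at rank 2
        (["A.java"], "Z.java"),            # no hit
    ]
    print(mean_reciprocal_rank(toy_queries))
    print(best_n({1: 0.3, 2: 0.5, 3: 0.5}))
```

In an actual run, one MRR value would be computed per n from Siamese's search results for the filtered clone pairs, and the resulting mapping passed to `best_n`.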