Evaluation Artifact for "Give an Inch and Take a Mile? Effects of Adding Reliable Knowledge to Heuristic Feature Tracing"
This repository contains the artifact for our paper "Give an Inch and Take a Mile? Effects of Adding Reliable Knowledge to Heuristic Feature Tracing", which was accepted at the 28th ACM International Systems and Software Product Line Conference (SPLC 2024). The project comprises the source code for running the empirical evaluation of our boosted comparison-based feature tracing algorithm, which is implemented as a library in a separate open-source GitHub repository.
Our algorithm is designed to enhance the accuracy of retroactive heuristic feature tracing with proactively collected feature traces. In particular, the algorithm can be used for projects with multiple product variants, where it can improve the accuracy and efficiency of the tracing process by exploiting reliable manual knowledge.
In our experiment, we employed the feature traces computed as ground truth by VEVOS and used them to assign mappings to randomly chosen nodes in the artifact trees, which represent the variants' source code. We ran each experiment without proactive traces mapped onto the nodes (0%) and then increased the amount of available information. Furthermore, we performed this experiment with increasing numbers of compared variants; both dimensions are configured in the properties file (in data).
In each run, the variants are randomly generated (invocation of prepareVariants() in RQRunner) and evaluated with increasing amounts of proactive feature traces, which are randomly distributed and assigned to nodes of the artifact trees for each increase of available proactive trace annotations. After the proactive traces have been propagated for the boosting effect, the boosted algorithm determines feature annotations for each remaining node.
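For readers who prefer code over prose, the following Java sketch outlines this per-run flow. Apart from prepareVariants() and RQRunner, which are mentioned above, all type and method names are hypothetical placeholders and do not reflect the actual API of this repository or the tracing library.

```java
import java.util.List;
import java.util.Random;

// Hedged sketch of one experiment run; only prepareVariants()/RQRunner are real
// names from this artifact, every other identifier is an illustrative placeholder.
class ExperimentRunSketch {

    interface Variant { }   // stands in for the artifact tree of one product variant
    interface Mapping { }   // stands in for node-to-feature annotations

    List<Variant> prepareVariants() { return List.of(); }                      // mirrors RQRunner.prepareVariants()
    void distributeProactiveTraces(List<Variant> vs, int pct, Random rnd) { }  // assign ground-truth traces to random nodes
    Mapping runBoostedTracing(List<Variant> vs) { return new Mapping() { }; }  // propagate traces, annotate remaining nodes
    void evaluate(Mapping computed) { }                                        // compare against the ground truth (see below)

    void runOnce(int[] proactivePercentages, long seed) {
        List<Variant> variants = prepareVariants();             // variants are sampled randomly per run
        for (int pct : proactivePercentages) {                   // e.g., 0%, 10%, 20%, ...
            distributeProactiveTraces(variants, pct, new Random(seed));
            Mapping computed = runBoostedTracing(variants);
            evaluate(computed);
        }
    }
}
```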
Finally, in this experiment, we compare (compareMappings() in Evaluator) the ground truth data with the mapping determined by the boosted algorithm with respect to whether a node should be present in a variant. If ground truth and mapping align, we count the node as a true positive or a true negative (the latter if the node is removed). If the ground truth keeps the node while the mapping would remove it, we count it as a false negative, and vice versa for false positives.
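The counting rule can be written down in a few lines. The following is an illustrative Java sketch, not the actual compareMappings() implementation in Evaluator; the two booleans merely encode whether ground truth and computed mapping keep a node in a variant.

```java
// Illustrative confusion-matrix counting as described above; not the actual
// compareMappings() code in Evaluator.
class ConfusionCounts {
    long tp, tn, fp, fn;

    // groundTruthKeeps / mappingKeeps: whether the node is present in the variant
    // according to the ground truth / the mapping computed by the boosted algorithm.
    void count(boolean groundTruthKeeps, boolean mappingKeeps) {
        if (groundTruthKeeps && mappingKeeps) tp++;         // both keep the node
        else if (!groundTruthKeeps && !mappingKeeps) tn++;  // both remove the node
        else if (groundTruthKeeps) fn++;                    // mapping removes a node the ground truth keeps
        else fp++;                                          // mapping keeps a node the ground truth removes
    }
}
```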
Clone the repository to a location of your choice using git:
git clone https://github.com/VariantSync/trace-boosting-eval.git
This is done automatically by the Docker script and is only required if you plan on interacting directly with the data.
Open a terminal in the cloned directory and execute the setup script. The script downloads the required data files, which consist of the subject repositories and their ground truth, from Zenodo.
./setup.sh
The following paragraphs explain how to run the experiments from the paper using Docker. An explanation of how to run the Java implementation from source follows thereafter.
- Install Docker on your system and start the Docker Daemon.
Depending on your Docker installation, you might require elevated permission (i.e., sudo) to call the Docker daemon under Linux.
- Open a terminal and navigate to the project's root directory
- Build the docker image by calling the build script corresponding to your OS
Under macOS, the image's and the host's platforms might not match. Therefore, we added a separate build-on-mac script for macOS users in which the platform can be specified. If there is a mismatch, Docker will print a warning at the start of the build process that states the platforms in use. Please update the required platform in the build script in accordance with your host platform.
# Windows:
build.bat
# Linux:
./build.sh
# MacOS - you might have to change the platform in the script according to your hardware; see the troubleshooting section for more information
./build-on-mac.sh
- You can validate the installation by calling the validation script corresponding to your OS. The validation runs for about 30 minutes, depending on your system.
# Windows:
execute.bat validation
# Linux | MacOS:
./execute.sh validation
The script will generate figures and tables similar to the ones presented in our paper. They are automatically saved to ./results.
Please note: In the provided default configuration (data/validation.properties), the experiment is run only on one set of variants. If the variants are well-aligned, the comparison may result in perfect matches of the computed and actual feature traces (i.e., accuracy metrics of 1). For higher numbers of compared variants and higher amounts of proactive traces, highly accurate traces can be expected.
- All commands in this section are meant to be executed in a terminal whose working directory is the evaluation repository's project root.
- You can stop the execution of any experiment by running the following command in another terminal:
# Windows Command Prompt:
stop-execution.bat
# Windows PowerShell:
.\stop-execution.bat
# Linux | MacOS
./stop-execution.sh
Stopping the execution may take a moment.
You can repeat the experiments exactly as presented in our paper. The following command executes 30 runs of the experiments for RQ1 and RQ2.
Please note: Because potentially large files mapped to tree structures, as well as feature expressions, are compared, the entire experiment requires large amounts of RAM (> 24 GB) and takes several hours up to days, depending on the hardware. To obtain results faster, we recommend reducing the number of executions, for instance by examining only one subject at a time. Similarly, the lower the number of compared variants, the faster the experiment finishes.
# Windows Command Prompt:
execute.bat replication
# Windows PowerShell:
.\execute.bat replication
# Linux | MacOS
./execute.sh replication
Please note further: The variants for which feature traces are computed are determined randomly. Thus, even if you replicate our experiment with exactly the same setup, the results may not be exactly the same. However, the general trend of boosting the accuracy should be visible for all computed metrics in a very similar manner.
By default, the properties used by Docker are configured to run the experiments as presented in our paper. We offer the possibility to change the default configuration.
- Open the properties file in data that you want to adjust:
  - replication.properties configures the experiment execution of execute.(bat|sh) replication
  - validation.properties configures the experiment execution of execute.(bat|sh) validation
- Change the properties to your liking
- Rebuild the docker image as described above
- Delete old results in the ./results folder
- Start the experiment as described above
Finally, you can plot the results using Docker.
If you have not executed the replication or validation, there are no results to analyze and plot yet. However, we also provide the results reported in our paper, from which the plots can be generated. To generate the plots shown in our paper, you have to copy the result files (.json) under reported-results to the results directory.
# Windows Command Prompt:
execute.bat plotting
# Windows PowerShell:
.\execute.bat plotting
# Linux | MacOS
./execute.sh plotting
The more experiments you run, the more space will be required by Docker. The easiest way to clean up all Docker images and containers afterwards is to run the following command in your terminal. Note that this will remove all other containers and images as well:
docker system prune -a
Please refer to the official documentation on how to remove specific images and containers from your system.
To run the experiment from source, you need to have installed:
- Maven
- Java JDK 17 or a newer version
The entry point to the experiment is implemented in the class Main. To run the experiment from that class, you need to provide the experiment configuration as the first argument to the main method. As argument, you can specify
- "data/replication.properties" to conduct the entire experiment as reported in the paper, or
- "data/validation.properties" to conduct the validation experiment, which runs the experiment once with OpenVPN for a lower number of distinct percentages.
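As a rough orientation, the entry point essentially receives the path to one of these properties files as its first argument and loads it before starting the experiment. The following is a minimal, hedged sketch; the actual Main class in this repository and the property keys it reads may differ.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.util.Properties;

// Minimal sketch of such an entry point; the real Main class of this artifact may differ.
public class MainSketch {
    public static void main(String[] args) throws IOException {
        // First argument: "data/replication.properties" or "data/validation.properties"
        String configPath = args[0];

        Properties config = new Properties();
        try (FileInputStream in = new FileInputStream(configPath)) {
            config.load(in);  // standard java.util.Properties loading
        }
        // The experiment would then be configured and started from these properties.
        System.out.println("Loaded " + config.size() + " properties from " + configPath);
    }
}
```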
Intermediary results are written to the directory 'results/intermediary'; they persist the outcome of each execution run (i.e., for one percentage of applied proactive traces and one fixed set of variants). The final results are only written once all execution runs specified in the properties file have been executed. They can be found in the 'results' directory.
If you have not executed the replication or validation, there are no results to analyze and plot yet. However, the artifact also contains the results we obtained when conducting the experiment, which are reported in the paper. To generate the plots presented in our paper, copy the result files (.json) from the reported-results directory to the results directory:
cp reported-results/* results/
- Navigate to the python sources
cd python
- Install virtualenv (if you haven't already):
pip install virtualenv
- Create a virtual environment: Navigate to your project directory and run:
virtualenv venv
This will create a virtual environment named venv in your project directory.
- Activate the virtual environment:
- On Windows, run:
.\venv\Scripts\activate
- On macOS and Linux, run:
source venv/bin/activate
- Install the dependencies:
With the virtual environment activated, install the dependencies from requirements.txt by running:
pip install -r requirements.txt
- Run the script:
With the dependencies installed, you can now run the plotting script:
python generatePlots.py
- Deactivate the virtual environment (when you're done):
To exit the virtual environment, simply run:
deactivate
You might encounter the following warning (or a similar one) when building or running the Docker image:
➜ trace-boosting-eval git:(main) ✗ ./execute.sh validation
Starting validation
WARNING: The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested
Running validation
In this case, please update the host platform in the build-on-mac.sh
script and try to rerun the build step.
You might encounter a situation where the replication script hangs in an unresponsive state.
In this case, ensure that the ground truth and the subject repositories are available in your workspace. You can either execute setup.sh or copy them from Zenodo into the respective directories of your workspace.