Unparser #152

Open
wants to merge 24 commits into
base: develop
272cae5
Drop Me
Aug 23, 2024
a54cb7f
The DiffNode class was extended to store the line containing the endif…
Oct 26, 2024
02bbdb7
The VariationTreeNode class was changed to store the line with the en…
Oct 26, 2024
558bad1
The VariationNode and Projection classes were extended with a method for…
Oct 26, 2024
f4e3966
When creating a VariationTree from a variation diff, now…
Oct 26, 2024
f3945e7
Added an unparser for variation trees, plus a test class…
Oct 28, 2024
ba1d260
The parser was extended to store lines containing endif. The ty…
Oct 30, 2024
b484c49
Created the unparsers for VariationTrees and VariationDiffs.
Nov 1, 2024
a532fc4
Created a method for projecting text diffs
Nov 2, 2024
af652ca
Swapped two things in the function so that the algorit…
Nov 4, 2024
771651a
Added a test class and test cases for testing VariationUnparser
Nov 7, 2024
a7dd02d
Added dataset file
Nov 27, 2024
ac68c15
Created the unparser analysis and made it runnable in Docker
Nov 27, 2024
718866d
Fixed the unparser analysis
Nov 28, 2024
4b25def
Fixed the unparser analysis
Nov 28, 2024
06ebe21
Changed the data being checked
eugen-shulimov Dec 3, 2024
acf8956
Changes in the analysis
eugen-shulimov Dec 5, 2024
cd762c5
UnparseAnalysis: error reporting
pmbittner Dec 5, 2024
047db5f
Storing the endif now also takes Time into account
eugen-shulimov Dec 21, 2024
fe1319f
Cases that could not be unparsed correctly, from the evaluation
eugen-shulimov Dec 21, 2024
c8d5605
Reworked the undiff method, removed duplicate code
eugen-shulimov Dec 21, 2024
b178a04
Forgot to commit a case
eugen-shulimov Dec 21, 2024
3b2ff14
The class and files that implemented the evaluation
eugen-shulimov Dec 21, 2024
67f538f
Option to test semantic equality for individual files
eugen-shulimov Dec 21, 2024
1 change: 1 addition & 0 deletions Hallo.txt
@@ -0,0 +1 @@
Hallo
5 changes: 5 additions & 0 deletions docs/datasets/eugen-bachelor-thesis.md
@@ -0,0 +1,5 @@
Project name | Domain | Source code available (\*\*y\*\*es/\*\*n\*\*o)? | Is it a git repository (\*\*y\*\*es/\*\*n\*\*o)? | Repository URL | Clone URL | Estimated number of commits
-------------------|-------------------------|-------------------------------------------------|--------------------------------------------------|--------------------------------------------------------------|----------------------------------------------------|---------------------------------
berkeley-db-libdb | database system | y | y | https://github.com/berkeleydb/libdb | https://github.com/berkeleydb/libdb.git | 7
sylpheed | e-mail client | y | y | https://github.com/jan0sch/sylpheed | https://github.com/jan0sch/sylpheed.git | 2,682
vim | text editor | y | y | https://github.com/vim/vim | https://github.com/vim/vim.git | 17,109
57 changes: 57 additions & 0 deletions replication/unparse-views/Dockerfile
@@ -0,0 +1,57 @@
# syntax=docker/dockerfile:1

FROM alpine:3.15
# PACKAGE STAGE

# Prepare the compile environment. JDK is automatically installed
RUN apk add maven

# Create and navigate to a working directory
WORKDIR /home/user

COPY local-maven-repo ./local-maven-repo

# Copy the source code
COPY src ./src
# Copy the pom.xml if Maven is used
COPY pom.xml .
# Execute the maven package process
RUN mvn package || exit

FROM alpine:3.15

# Create a user
RUN adduser --disabled-password --home /home/sherlock --gecos '' sherlock

RUN apk add --no-cache --upgrade bash
RUN apk add --update openjdk17

# Change into the home directory
WORKDIR /home/sherlock

# Copy the compiled JAR file from the first stage into the second stage
# Syntax: COPY --from=STAGE_ID SOURCE_PATH TARGET_PATH
WORKDIR /home/sherlock/holmes
COPY --from=0 /home/user/target/diffdetective-*-jar-with-dependencies.jar ./DiffDetective.jar
WORKDIR /home/sherlock
RUN mkdir results

# Copy the setup
COPY docs holmes/docs

# Copy the docker resources
COPY docker/* ./
COPY replication/unparse-views/docker/* ./
RUN mkdir DiffDetectiveMining

# Adjust permissions
RUN chown sherlock:sherlock /home/sherlock -R
RUN chmod +x execute.sh
RUN chmod +x entrypoint.sh
RUN chmod +x fix-perms.sh

# Set the entrypoint
ENTRYPOINT ["./entrypoint.sh", "./execute.sh"]

# Set the user
USER sherlock
162 changes: 162 additions & 0 deletions replication/unparse-views/INSTALL.md
@@ -0,0 +1,162 @@
# Installation
## Installation Instructions
In the following, we describe how to replicate the validation from our paper (Section 5) step-by-step.
The instructions explain how to build the Docker image and run the validation in a Docker container.

### 1. Install Docker (if required)
How to install Docker depends on your operating system:

- _Windows or Mac_: You can find download and installation instructions [here](https://www.docker.com/get-started).
- _Linux Distributions_: How to install Docker on your system depends on your distribution. Chances are high that Docker is part of your distribution's package database.
Docker's [documentation](https://docs.docker.com/engine/install/) contains instructions for common distributions.

Then, start the Docker daemon.
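
Before continuing, it can help to confirm that the daemon is actually reachable. The following is a minimal sketch using `docker info`; the helper name `check_docker` is our own and is not part of the replication package.

```shell
# Hypothetical helper (not part of the replication package):
# reports whether the Docker CLI is installed and the daemon is reachable.
check_docker() {
    if ! command -v docker >/dev/null 2>&1; then
        echo "docker CLI not found"
        return 1
    fi
    # 'docker info' fails if the client cannot reach the daemon.
    if ! docker info >/dev/null 2>&1; then
        echo "docker daemon not reachable"
        return 2
    fi
    echo "docker is ready"
}

check_docker || true
```

If the second message appears on Linux, the cause may also be missing permissions rather than a stopped daemon.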

### 2. Open a Suitable Terminal
```
# Windows Command Prompt:
- Press 'Windows Key + R' on your keyboard
- Type in 'cmd'
- Click 'OK' or press 'Enter' on your keyboard

# Windows PowerShell:
- Open the search bar (Default: 'Windows Key') and search for 'PowerShell'
- Start the PowerShell

# Linux:
- Press 'ctrl + alt + T' on your keyboard
```

Clone this repository to a directory of your choice using git:
```shell
git clone https://github.com/VariantSync/DiffDetective.git
```
Then, navigate to the `esecfse22` folder in your local clone of this repository:
```shell
cd DiffDetective/replication/esecfse22
```

### 3. Build the Docker Container
To build the Docker container, run the `build` script corresponding to your operating system:
```
# Windows:
.\build.bat
# Linux/Mac (bash):
./build.sh
```
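The contents of the `build` scripts are not shown here. As a rough sketch of an equivalent manual build, the snippet below assumes the Dockerfile at `replication/unparse-views/Dockerfile` and the repository root as build context (the Dockerfile copies `src`, `pom.xml`, `docs`, and `local-maven-repo` from there). The image name is an assumption borrowed from the error message in the troubleshooting section, and `build_image` is a hypothetical helper.

```shell
# Sketch only: prints (or, with RUN=1, executes) a manual build command.
# IMAGE and DOCKERFILE are assumptions, not values taken from build.sh.
IMAGE="${IMAGE:-replication-package}"
DOCKERFILE="${DOCKERFILE:-replication/unparse-views/Dockerfile}"

build_image() {
    # Run from the repository root so that COPY src, pom.xml, docs resolve.
    set -- docker build -t "$IMAGE" -f "$DOCKERFILE" .
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

build_image
```

By default the command is only printed, so it can be inspected before it is run with `RUN=1`.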

## 4. Verification & Replication

### Running the Replication or Verification
To execute the replication, run the `execute` script corresponding to your operating system with `replication` as the first argument. If you have not already done so, first navigate to the `esecfse22` directory:
```shell
cd DiffDetective/replication/esecfse22
```

#### Windows:
`.\execute.bat replication`
#### Linux/Mac (bash):
`./execute.sh replication`

> WARNING!
> The replication will require at least an hour and might require up to a day, depending on your system.
> Therefore, we offer a short verification (5-10 minutes) which runs DiffDetective on only four of the datasets.
> You can run it by providing "verification" as argument instead of "replication" (i.e., `.\execute.bat verification`, `./execute.sh verification`).
> If you want to stop the execution, you can call the provided script for stopping the container in a separate terminal.
> When restarted, the execution resumes at the last unfinished repository.
> #### Windows:
> `.\stop-execution.bat`
> #### Linux/Mac (bash):
> `./stop-execution.sh`

You might see warnings or errors reported by SLF4J, such as `Failed to load class "org.slf4j.impl.StaticLoggerBinder"`, which you can safely ignore.
Further troubleshooting advice can be found at the bottom of this file.

The results of the verification will be stored in the [results](results) directory.

### Expected Output of the Verification
The aggregated results of the verification/replication can be found in the following files.
The example file content shown below should match your results when running the _verification_.
(Note that the links below only have a target _after_ running the replication or verification.)

- The [speed statistics](results/validation/current/speedstatistics.txt) contain information about the total runtime, median runtime, mean runtime, and more:
```
#Commits: 24701
Total commit process time is: 14.065916666666668min
Fastest commit process time is: d86e352859e797f6792d6013054435ae0538ef6d___xfig___0ms
Slowest commit process time is: 9838b7032ea9792bec21af424c53c07078636d21___xorg-server___7996ms
Median commit process time is: f77ffeb9b26f49ef66f77929848f2ac9486f1081___tcl___13ms
Average commit process time is: 34.166835350795516ms
```
- The [classification results](results/validation/current/ultimateresult.metadata.txt) contain information about how often each pattern was matched, and more.
```
repository: <NONE>
total commits: 42323
filtered commits: 7425
failed commits: 0
empty commits: 10197
processed commits: 24701
tree diffs: 80751
fastestCommit: 518e205b06d0dc7a0cd35fbc2c6a4376f2959020___xorg-server___0ms
slowestCommit: 9838b7032ea9792bec21af424c53c07078636d21___xorg-server___7996ms
runtime in seconds: 853.9739999999999
runtime with multithreading in seconds: 144.549
treeformat: org.variantsync.diffdetective.variation.diff.serialize.treeformat.CommitDiffVariationDiffLabelFormat
nodeformat: org.variantsync.diffdetective.mining.formats.ReleaseMiningDiffNodeFormat
edgeformat: org.variantsync.diffdetective.mining.formats.DirectedEdgeLabelFormat with org.variantsync.diffdetective.mining.formats.ReleaseMiningDiffNodeFormat
analysis: org.variantsync.diffdetective.validation.PatternValidationTask
#NON nodes: 0
#ADD nodes: 0
#REM nodes: 0
filtered because not (is not empty): 212
AddToPC: { total = 443451; commits = 22470 }
AddWithMapping: { total = 51036; commits = 2971 }
RemFromPC: { total = 406809; commits = 21384 }
RemWithMapping: { total = 36622; commits = 2373 }
Specialization: { total = 7949; commits = 1251 }
Generalization: { total = 11057; commits = 955 }
Reconfiguration: { total = 3186; commits = 381 }
Refactoring: { total = 4862; commits = 504 }
Untouched: { total = 0; commits = 0 }
#Error[conditional macro without expression]: 2
#Error[#else after #else]: 2
#Error[#else or #elif without #if]: 11
#Error[#endif without #if]: 12
#Error[not all annotations closed]: 8
```
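One way to sanity-check the metadata file is to confirm that the commit counts add up: in the example above, the filtered, failed, empty, and processed commits sum to the total (7425 + 0 + 10197 + 24701 = 42323). The sketch below performs this check with `awk`. The helper name `check_commit_counts` is illustrative, and the assumption that this sum always holds is ours, inferred from the example output.

```shell
# Illustrative helper: checks that the per-category commit counts in a
# metadata file sum to 'total commits'. File layout as in the example above.
check_commit_counts() {
    awk -F': *' '
        $1 == "total commits"     { total = $2 }
        $1 == "filtered commits"  { sum += $2 }
        $1 == "failed commits"    { sum += $2 }
        $1 == "empty commits"     { sum += $2 }
        $1 == "processed commits" { sum += $2 }
        END {
            if (sum == total) { print "OK: " sum " commits accounted for" }
            else { print "MISMATCH: sum=" sum " total=" total; exit 1 }
        }
    ' "$1"
}

# Usage (after running the verification):
# check_commit_counts results/validation/current/ultimateresult.metadata.txt
```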

Moreover, the results comprise the (LaTeX) tables that are part of our paper and appendix.
The processing times might deviate because performance depends on your hardware.

### (Optional) Running DiffDetective on Custom Datasets
You can also run DiffDetective on other datasets by providing the path to the dataset file as the first argument to the execution script:

#### Windows:
`.\execute.bat path\to\custom\dataset.md`
#### Linux/Mac (bash):
`./execute.sh path/to/custom/dataset.md`

The input file must have the same format as the other dataset files (i.e., repositories are listed in a Markdown table). You can find [dataset files](../../docs/datasets/all.md) in the [docs/datasets](../../docs/datasets) folder.
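
To illustrate the expected format, the following sketch prints the project name and clone URL of each repository listed in a dataset file. It assumes the seven-column layout without leading pipes, with the project name in column 1 and the clone URL in column 6. `list_repos` is a hypothetical helper, not part of DiffDetective.

```shell
# Hypothetical helper: prints "<project name><TAB><clone URL>" per repository.
list_repos() {
    # Skip the header row and the '---|---' separator row, then trim
    # surrounding whitespace from the name and clone-URL columns.
    awk -F'|' '
        NR <= 2 { next }
        NF >= 6 {
            name = $1; url = $6
            gsub(/^[ \t]+|[ \t]+$/, "", name)
            gsub(/^[ \t]+|[ \t]+$/, "", url)
            print name "\t" url
        }
    ' "$1"
}

# Usage: list_repos path/to/custom/dataset.md
```

Running this on a custom dataset file before starting the container is a cheap way to spot a misaligned column.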

## Troubleshooting

### 'Got permission denied while trying to connect to the Docker daemon socket'
`Problem:` This is a common problem on Linux when the user trying to execute Docker commands does not have permission to do so.

`Fix:` You can fix this problem by either following the [post-installation instructions](https://docs.docker.com/engine/install/linux-postinstall/), or by executing the scripts in the replication package with elevated permissions (i.e., `sudo`).

### 'Unable to find image 'replication-package:latest' locally'
`Problem:` The Docker container could not be found. This either means that the name of the container that was built does not match the name of the container that is being executed (this only happens if you changed the provided scripts), or that the Docker container was not built yet.

`Fix:` Follow the instructions described above in the section `Build the Docker Container`.
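
To check whether the image is present before executing, `docker image inspect` can be used. The following is a small sketch; `image_exists` is a hypothetical helper, and the image name is taken from the error message above.

```shell
# Hypothetical helper: succeeds if an image with the given name exists locally.
image_exists() {
    docker image inspect "$1" >/dev/null 2>&1
}

# Usage: image_exists replication-package:latest || echo "image missing, build it first"
```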

### No results after verification, or 'cannot create directory '../results/validation/current': Permission denied'
`Problem:` This problem can occur due to how permissions are managed inside the Docker container. More specifically, it appears if Docker is executed with elevated permissions (i.e., `sudo`) and there is no [results](results) directory because it was deleted manually. In this case, Docker creates the directory with elevated permissions, and the Docker user has no permission to access it.

`Fix:` If there is a _results_ directory, delete it with elevated permissions (e.g., `sudo rm -r results`).
Then, create a new _results_ directory without elevated permissions, or execute `git restore .` to restore the deleted directory.
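
The check behind this fix can be sketched as follows. `ensure_writable_dir` is a hypothetical helper that creates the directory as the current user if it is missing and reports when an existing directory is not writable.

```shell
# Hypothetical helper: creates DIR if it is missing; if DIR exists but is
# not writable by the current user, it only reports how to remove it.
ensure_writable_dir() {
    dir="$1"
    if [ -d "$dir" ] && [ ! -w "$dir" ]; then
        echo "$dir is not writable; remove it with: sudo rm -r $dir"
        return 1
    fi
    mkdir -p "$dir" && echo "$dir is ready"
}

# Usage: ensure_writable_dir results
```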

### Failed to load class "org.slf4j.impl.StaticLoggerBinder"
`Problem:` An operation within the initialization phase of the logger library we use (tinylog) failed.

`Fix:` Please ignore this warning. Tinylog will fall back to a default implementation (`Defaulting to no-operation (NOP) logger implementation`) and logging will work as expected.