TransLocator

Project Structure

The project structure is as follows:

data: Contains the dataset in JSON format.
src: Contains the source code for the project.
requirements.txt: Lists the required Python packages for the project.
README.md: Contains the project documentation.

Dataset

The refined Bench4BL dataset used for this project is provided in the data directory in JSON format. Each record describes one bug report and contains the following fields:

bug_id: Unique identifier for the bug.
bug_title: Title of the bug.
bug_description: Description of the bug.
project: Project to which the bug belongs.
sub_project: Sub-project to which the bug belongs.
version: Version of the project.
fixed_version: Version in which the bug was fixed.
fixed_files: Files in which the bug was fixed as a json array.
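
As a quick sanity check on the fields listed above, the following minimal sketch reports any field missing from a record. The sample record and its values are invented for illustration, and the value types (strings, a list of paths) are assumptions; the actual files in the data directory may differ.

```python
import json

# Field names taken from the list above.
EXPECTED_FIELDS = {
    "bug_id", "bug_title", "bug_description", "project",
    "sub_project", "version", "fixed_version", "fixed_files",
}


def missing_fields(record: dict) -> set:
    """Return the expected field names that are absent from one record."""
    return EXPECTED_FIELDS - record.keys()


# Hypothetical record, invented purely for illustration.
sample = json.loads("""
{
  "bug_id": "EXAMPLE-1",
  "bug_title": "Example title",
  "bug_description": "Example description",
  "project": "ExampleProject",
  "sub_project": "EXAMPLE",
  "version": "1.0",
  "fixed_version": "1.1",
  "fixed_files": ["src/main/java/example/Foo.java"]
}
""")

print(missing_fields(sample))  # prints set() when every field is present
```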

Pre-requisites

The following are the pre-requisites for the project:

Python 3.10
Elasticsearch
NVIDIA CUDA-enabled GPU
Required Python Packages

Installing Required Packages

Python 3.10:

We recommend installing the packages and running the application inside a virtual environment. If you are new to virtual environments, see the Python documentation for the venv module.

Windows:

Download Python 3.10:
- Visit python.org/downloads
- Download the Windows installer (Windows Installer (64-bit) recommended).
- Run the installer.
- Check the box to add Python to PATH during installation.
Verify Installation:
- Open Command Prompt.
- Type python --version.
- You should see Python 3.10.x.

Linux (Ubuntu/Debian):

Install Python 3.10:
- Open Terminal.
- Run the following commands:
```
sudo apt update
sudo apt install python3.10
```
Verify Installation:
- Type python3.10 --version.
- You should see Python 3.10.x.
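
On either platform, the same check can be run from inside the interpreter. This is a small sketch, not part of the project; treating 3.10 as an exact requirement is an assumption based on the pre-requisites list.

```python
import sys

REQUIRED = (3, 10)  # major, minor from the pre-requisites (assumed exact)


def is_required_python(version_info=sys.version_info) -> bool:
    """True when the interpreter's major.minor version is 3.10."""
    return tuple(version_info[:2]) == REQUIRED


def in_virtualenv() -> bool:
    """True when running inside a venv (prefix differs from the base prefix)."""
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


print("Python 3.10:", is_required_python())
print("virtual environment:", in_virtualenv())
```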

Install PyTorch:

PyTorch with CUDA 11.3 support is required for the project.

Use the following command to install PyTorch with CUDA support:

pip install torch==1.10.0+cu113 torchvision==0.11.0+cu113 torchaudio==0.10.0+cu113 torchtext==0.11.0 -f https://download.pytorch.org/whl/cu113/torch_stable.html

Verify the installation by running the following command:

python -c "import torch; print(torch.cuda.is_available())"

You should see True if PyTorch is installed correctly with CUDA support.

If you do not have a CUDA-enabled GPU, install the CPU version of PyTorch instead. The PyTorch installation guide has more detail on CUDA support.
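
The CUDA check and the CPU fallback can be combined into one helper. This is a minimal sketch; `pick_device` is a hypothetical name, and whether the project's own scripts select the device this way is not stated here.

```python
def pick_device() -> str:
    """Return "cuda" when PyTorch can see a GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in this environment.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


print(pick_device())
```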

Elasticsearch:

Windows:

Download Elasticsearch:
- Visit elastic.co/downloads/elasticsearch.
- Download the ZIP package for Windows.
Extract and Start Elasticsearch:
- Extract the downloaded ZIP file.
- Navigate to the extracted directory.
- Run bin\elasticsearch.bat in Command Prompt.
Verify Installation:
- Open a web browser.
- Go to http://localhost:9200.
- Check for a JSON response indicating Elasticsearch is running.

Linux (Ubuntu/Debian):

Download and Install Elasticsearch:

Open Terminal.

Run the following commands:

wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-<version>-amd64.deb
sudo dpkg -i elasticsearch-<version>-amd64.deb

Start Elasticsearch Service:

Run:

sudo systemctl start elasticsearch
sudo systemctl enable elasticsearch

Verify Installation:
- Open a web browser.
- Go to http://localhost:9200.
- Ensure Elasticsearch is running by checking for a JSON response.
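
The browser check can also be scripted. The sketch below uses only the standard library and assumes the node answers plain HTTP on localhost:9200 without authentication; `describe_node` and `check_elasticsearch` are hypothetical helper names.

```python
import json
import urllib.error
import urllib.request


def describe_node(payload: dict) -> str:
    """Summarize the JSON document Elasticsearch returns at its root URL."""
    version = payload.get("version", {}).get("number", "unknown")
    cluster = payload.get("cluster_name", "unknown")
    return f"cluster={cluster} version={version}"


def check_elasticsearch(url: str = "http://localhost:9200", timeout: float = 3.0):
    """Return a summary string if the node responds, or None if it does not."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return describe_node(json.load(resp))
    except (urllib.error.URLError, OSError, ValueError):
        # Unreachable host, refused connection, or a non-JSON response.
        return None


if __name__ == "__main__":
    print(check_elasticsearch() or "Elasticsearch is not reachable")
```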

Install Required Python Packages:

Navigate to Project Directory:
- Open terminal/command prompt.
- Use cd to move to the directory containing requirements.txt.
Install Packages:
- Run pip install -r requirements.txt.

Replicate

Index Documents in Elasticsearch for Each Version of the Project:

Create Index:
- Run 'src/IR/Indexer/Index_Creator.py' to create an index in Elasticsearch. The configuration for the index is provided in 'IR_Config.yaml'.
- Extract the source files from the Git projects for each version and index them in Elasticsearch using 'Indexer.py'. The GitHub repositories are listed in the Bench4BL repository.
- The default port for Elasticsearch is 9200.
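
To illustrate what creating an index involves at the REST level, here is a sketch that builds (but does not send) a create-index request. The index name, settings, and field mapping are hypothetical; the project's actual configuration lives in 'IR_Config.yaml' and is applied by 'Index_Creator.py', neither of which is reproduced here.

```python
import json
import urllib.request


def build_create_index_request(index_name: str, host: str = "http://localhost:9200"):
    """Build a PUT request that would create an index with a simple mapping."""
    body = {
        # Hypothetical settings and fields, not taken from IR_Config.yaml.
        "settings": {"number_of_shards": 1, "number_of_replicas": 0},
        "mappings": {
            "properties": {
                "file_path": {"type": "keyword"},
                "content": {"type": "text"},
            }
        },
    }
    return urllib.request.Request(
        url=f"{host}/{index_name}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


req = build_create_index_request("example_project_v1")
print(req.get_method(), req.full_url)
# Sending it requires a running node: urllib.request.urlopen(req)
```

Using one index per project version, as the name above suggests, is one possible layout and only an assumption.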
Train or download the models from the following link:
- Resources

Localizing the bugs:

Run the command below to localize the bugs:

python src --br-path /path/to/input/data --kw-model-dir /path/to/keyword/model --ce-model-dir /path/to/cross-encoder/model --L 10 --topK_rerank 50 --topN 10

- `--br-path`: Path to the input data in JSON format. The file must follow the same format as the dataset provided in the `data` directory.
- `--kw-model-dir`: Path to the keyword model.
- `--ce-model-dir`: Path to the cross-encoder model.
- `--L`: Length of the keywords.
- `--topK_rerank`: Number of bugs to rerank.
- `--topN`: Number of top outputs to return.
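
For readers who want to see how these options fit together, the sketch below declares the same flags with argparse. It is not the project's actual parser; the defaults are assumptions copied from the example command above, and the help strings repeat the descriptions in the list.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the command-line flags described above (illustrative only)."""
    parser = argparse.ArgumentParser(prog="python src")
    parser.add_argument("--br-path", required=True,
                        help="Path to the input data in JSON format.")
    parser.add_argument("--kw-model-dir", required=True,
                        help="Path to the keyword model.")
    parser.add_argument("--ce-model-dir", required=True,
                        help="Path to the cross-encoder model.")
    # Defaults below mirror the example command; they are assumptions.
    parser.add_argument("--L", type=int, default=10,
                        help="Length of the keywords.")
    parser.add_argument("--topK_rerank", type=int, default=50,
                        help="Number of bugs to rerank.")
    parser.add_argument("--topN", type=int, default=10,
                        help="Number of top outputs to return.")
    return parser


args = build_parser().parse_args([
    "--br-path", "/path/to/input/data",
    "--kw-model-dir", "/path/to/keyword/model",
    "--ce-model-dir", "/path/to/cross-encoder/model",
])
print(args.br_path, args.L, args.topK_rerank, args.topN)
```

Note that argparse converts the hyphens in `--br-path` to underscores, so the value is read as `args.br_path`.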
