Introduction

This repository evaluates EquiRep alongside mTR, TRF, mreps, and TideHunter, and reproduces the results from the EquiRep paper. The official EquiRep repository can be found here.

Data

All simulated and real data used in the experiments are available in the data directory.

simulated_data: dat3 contains randomly simulated data with different unit lengths and copy numbers. dat_aax2 contains simulated sequences with 2 recurring k-mers. dat_aax3 contains simulated sequences with 3 recurring k-mers. Inside each of these folders, error_10 and error_20 hold the data simulated at error rates of 10% and 20%, respectively.

HOR_data: hor_repeats.fasta contains the 13 Higher Order Repeat sequences from human chromosome 5. hor_combined.fasta contains the concatenated sequences (x), the concatenated sequences with flanking regions (axa), and versions of both with error rates of 1%, 5%, and 10% applied (x_err1, x_err5, x_err10, axa_err1, axa_err5, axa_err10).

RCA_data: RCA_101.fasta contains 101 selected Nanopore long-read sequences from human prostate tissue (GEO, accession number: GSE141693).
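
To check what a FASTA file in the data directory contains before running a tool, its records can be listed with a few lines of Python. This is a minimal sketch and not part of the repository: parse_fasta and read_fasta are hypothetical helper names, and the sketch assumes each record name is the first whitespace-delimited token of its header line (for example x or axa_err5 in hor_combined.fasta), which has not been checked against the actual files.

```python
def parse_fasta(lines):
    """Parse FASTA-formatted lines into a dict of {record name: sequence}.

    The record name is taken to be the first whitespace-delimited token
    after '>' (an assumption about how the headers are written).
    """
    records = {}
    name = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            fields = line[1:].split()
            name = fields[0] if fields else ""
            records[name] = []
        elif name is not None:
            records[name].append(line)  # sequence may span several lines
    return {rec: "".join(parts) for rec, parts in records.items()}


def read_fasta(path):
    """Read a FASTA file from disk and parse it with parse_fasta."""
    with open(path) as handle:
        return parse_fasta(handle)


# In-memory demonstration with made-up sequences (not real data).
demo = parse_fasta([">x", "ACGT", "ACGT", ">axa_err1 example", "TTGA"])
for rec_name, seq in demo.items():
    print(rec_name, len(seq))
```

For a real file, something like read_fasta("data/HOR_data/hor_combined.fasta") would return the same kind of dictionary, assuming the command is run from the repository root.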

EquiRep

Installation

No external installation is required. The source code for EquiRep is provided in this repository at EquiRep_test/EquiRep.cpp.

Running experiments on simulated data

Navigate to the EquiRep Testing Folder:
```
cd EquiRep_test
```

Change exec.sh to select the dataset you'd like to use (options are dat3, dat_aax2, or dat_aax3):

input_dir_prefix=../data/simulated_data/dat3
# Replace dat3 with dat_aax2 or dat_aax3 for other simulated datasets
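
Editing the assignment by hand is enough for a single run. When cycling through all three datasets, the same change could be scripted. The following is a minimal sketch and not part of this repository: set_shell_variable is a hypothetical helper, and it assumes the script contains a plain input_dir_prefix=... assignment on its own line, as shown above.

```python
import re


def set_shell_variable(script_text, name, value):
    """Return script_text with the first `name=...` line set to `name=value`.

    Only a simple assignment at the start of a line (optionally indented)
    is handled; anything more elaborate is outside this sketch.
    """
    pattern = re.compile(
        rf"^([ \t]*){re.escape(name)}=.*$", flags=re.MULTILINE
    )
    # A function replacement avoids backslash-escape surprises in `value`.
    new_text, count = pattern.subn(
        lambda m: f"{m.group(1)}{name}={value}", script_text, count=1
    )
    if count == 0:
        raise ValueError(f"no assignment to {name!r} found")
    return new_text


# In-memory demonstration; the script body here is illustrative only.
example = "#!/bin/bash\ninput_dir_prefix=../data/simulated_data/dat3\n"
print(set_shell_variable(
    example, "input_dir_prefix", "../data/simulated_data/dat_aax2"
))
```

To apply it to the real file, read exec.sh into a string, pass it through the helper, and write the result back. The same helper would also work for the dataset= and tool_dir= assignments in the other tools' scripts.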

Run the following command to process all data files in the selected dataset. The results will be saved in the folders EquiRep_test/EquiRep_error10 and EquiRep_test/EquiRep_error20.
```
./exec.sh
```

Evaluation for simulated data results

Navigate to the Evaluation Folder:
```
cd ../eval
```

In r_eval.sh, specify the dataset to be evaluated:

true_dir=../data/simulated_data/dat3/$data_index
# Replace dat3 with dat_aax2 or dat_aax3 for other datasets

Run the evaluation script:
```
./r_eval.sh
```
Use the following command to compile all evaluation results into a summary sheet in the EquiRep_test folder:
```
python result_gather.py
```

Running experiments on HOR data

Running experiments on RCA data

mTR

Installation

To install mTR, visit the official mTR repository on GitHub.

Running experiments on simulated data

Navigate to the mTR Testing Folder:
```
cd mTR_test
```

Change run_mtr_simulated_data.sh to set the location of mTR and select the dataset you'd like to use (options are dat3, dat_aax2, or dat_aax3):

dataset=dat3
# Replace dat3 with dat_aax2 or dat_aax3 to test on other simulated data
tool_dir=
# Put the location of your installed mTR here
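
A mistyped tool_dir tends to surface only once the run script is already under way, so it can help to check the path first. The sketch below is an illustration and not part of the repository: find_executable is a hypothetical helper, it assumes tool_dir is a directory that holds the tool's binary, and the binary name passed in (such as "mTR") is a guess that should be adjusted to match the actual installation.

```python
import os
from pathlib import Path


def find_executable(tool_dir, names):
    """Return the first tool_dir/name that is an executable file, else None.

    `names` lists candidate binary names to try, in order.
    """
    for name in names:
        candidate = Path(tool_dir) / name
        if candidate.is_file() and os.access(candidate, os.X_OK):
            return candidate
    return None


# Example usage (the path and the binary name are placeholders):
#   found = find_executable("/path/to/mTR", ["mTR"])
#   if found is None:
#       raise SystemExit("tool_dir does not contain an executable binary")
```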

Change run_evaluate_edit.sh to select the dataset you'd like to use (options are dat3, dat_aax2, or dat_aax3).

Run the following command to process all data files in the selected dataset and run the evaluation as well. The results will be saved in the folders data/simulated_data/{dataset name}/error_10/MTR_results and data/simulated_data/{dataset name}/error_20/MTR_results:
```
./run_all.sh
```
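
After the run finishes, a quick way to confirm that output landed where expected is to build the two result paths for the chosen dataset and check that they exist. This is a minimal sketch and not part of the repository: result_dirs is a hypothetical helper, and the default data_root of "../data" assumes it is run from inside mTR_test.

```python
from pathlib import Path

ERROR_DIRS = ("error_10", "error_20")


def result_dirs(dataset, data_root="../data", results_name="MTR_results"):
    """Return the result folders for `dataset`, one per error rate."""
    base = Path(data_root) / "simulated_data" / dataset
    return [base / err / results_name for err in ERROR_DIRS]


# Report whether each expected folder is present for dat3.
for folder in result_dirs("dat3"):
    state = "exists" if folder.is_dir() else "not found"
    print(f"{folder}: {state}")
```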

Evaluation for simulated data results

Change result_gather_mTR.py to set the dataset you used:

base_dir = "../data/simulated_data/dat3"
# Replace dat3 with dat_aax2 or dat_aax3 to test on other simulated data

Use the following command to compile all evaluation results into a summary sheet in the mTR_test folder:
```
python result_gather_mTR.py
```

Running experiments on HOR data

Running experiments on RCA data

TRF

Installation

To install TRF, visit the official TRF repository on GitHub.

Running experiments on simulated data

Navigate to the TRF Testing Folder:
```
cd TRF_test
```

Change run_trf_simulated_data.sh to set the location of TRF and select the dataset you'd like to use (options are dat3, dat_aax2, or dat_aax3):

dataset=dat3
# Replace dat3 with dat_aax2 or dat_aax3 to test on other simulated data
tool_dir=
# Put the location of your installed TRF here

Change run_evaluate_edit.sh to select the dataset you'd like to use (options are dat3, dat_aax2, or dat_aax3).

Execute the following command to process all data files in the selected dataset and run the evaluation as well:
```
./run_all.sh
```

Evaluation for simulated data results

Change result_gather_TRF.py to set the dataset you used:

base_dir = "../data/simulated_data/dat3"
# Replace dat3 with dat_aax2 or dat_aax3 to test on other simulated data

Use the following command to compile all evaluation results into a summary sheet in the TRF_test folder:
```
python result_gather_TRF.py
```

Running experiments on HOR data

Running experiments on RCA data

mreps

Installation

To install mreps, visit the official mreps page.

Running experiments on simulated data

Navigate to the mreps Testing Folder:
```
cd mreps_test
```
Change run_mreps_simulated_data.sh to set the location of mreps and select the dataset you'd like to use (options are dat3, dat_aax2, or dat_aax3):
```
dataset=dat3
# Replace dat3 with dat_aax2 or dat_aax3 to test on other simulated data
tool_dir=
# Put the location of your installed mreps here
```
Change run_evaluate_edit.sh to select the dataset you'd like to use (options are dat3, dat_aax2, or dat_aax3).

Execute the following command to process all data files in the selected dataset and run the evaluation as well:
```
./run_all.sh
```

Evaluation for simulated data results

Change result_gather_mreps.py to set the dataset you used:

base_dir = "../data/simulated_data/dat3"
# Replace dat3 with dat_aax2 or dat_aax3 to test on other simulated data

Use the following command to compile all evaluation results into a summary sheet in the mreps_test folder:
```
python result_gather_mreps.py
```

Running experiments on HOR data

Running experiments on RCA data

TideHunter

Installation

To install TideHunter, visit the official TideHunter repository on GitHub.

Running experiments on simulated data

Navigate to the tidehunter Testing Folder:
```
cd tidehunter_test
```
Change run_mtr_simulated_data.sh to set the location of tidehunter and select the dataset you'd like to use (options are dat3, dat_aax2, or dat_aax3):
```
dataset=dat3
# Replace dat3 with dat_aax2 or dat_aax3 to test on other simulated data
tool_dir=
# Put the location of your installed TideHunter here
```
Change run_evaluate_edit.sh to select the dataset you'd like to use (options are dat3, dat_aax2, or dat_aax3).

Execute the following command to process all data files in the selected dataset and run the evaluation as well:
```
./run_all.sh
```

Evaluation for simulated data results

Change result_gather_tidehunter.py to set the dataset you used:

base_dir = "../data/simulated_data/dat3"
# Replace dat3 with dat_aax2 or dat_aax3 to test on other simulated data

Use the following command to compile all evaluation results into a summary sheet in the tidehunter_test folder:
```
python result_gather_tidehunter.py
```
