InterTADs

InterTADs is an open-source tool written in R for integrating multi-omics data (e.g. DNA methylation, expression, mutation) from the same physical source (e.g. a patient). It takes into account the chromatin configuration of the genome, i.e. the topologically associating domains (TADs).

Installation

Clone the repository with git:

git clone https://github.com/nikopech/InterTADs

Before running any scripts, make sure the following packages are installed on your machine:

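# devtools and BiocManager are needed by the install commands that follow
install.packages(c("data.table", "tidyverse", "gplots", "png", "gghalves", "devtools", "BiocManager"))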
devtools::install_github("stephenturner/annotables")

...and from Bioconductor:

BiocManager::install(c("TxDb.Hsapiens.UCSC.hg19.knownGene", "TxDb.Hsapiens.UCSC.hg38.knownGene", "GenomicRanges", "org.Hs.eg.db", "systemPipeR", "karyoploteR"))
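
To confirm that the packages listed above are available before running the scripts, a check along the following lines may help. It is a minimal sketch rather than part of InterTADs: the helper name `missing_packages` is hypothetical, and the package names are simply copied from the install commands above.

```r
# Packages named in the installation instructions above.
required <- c(
  "data.table", "tidyverse", "gplots", "png", "gghalves", "annotables",
  "TxDb.Hsapiens.UCSC.hg19.knownGene", "TxDb.Hsapiens.UCSC.hg38.knownGene",
  "GenomicRanges", "org.Hs.eg.db", "systemPipeR", "karyoploteR"
)

# Hypothetical helper (not part of InterTADs): return the packages
# whose namespace cannot be loaded on this machine.
missing_packages <- function(pkgs) {
  available <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!available]
}

not_installed <- missing_packages(required)
if (length(not_installed) > 0) {
  message("Missing packages: ", paste(not_installed, collapse = ", "))
} else {
  message("All required packages are installed.")
}
```

Note that `requireNamespace()` loads each namespace without attaching it, so the check also catches packages that are installed but fail to load.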

Usage

There are three main scripts for integrating your multi-omics data:

Data_Integration.R
TADiff.R
Visualization.R

Data Integration

For the Data Integration part, all datasets are split into two folders, freq and counts, according to the kind of values they carry (frequencies or score count values).

The two folders are placed in a single directory, together with a metadata file that provides the mapping between the columns of each dataset. For more details regarding the structure of this file, please see here.
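
The expected layout can be checked before running the script. The following is a sketch under stated assumptions, not code from InterTADs: the helper `check_input_layout` is hypothetical, and the metadata file name `meta-data.csv` is a placeholder, since the actual name is set inside the script.

```r
# Hypothetical helper (not part of InterTADs): verify that the input
# directory holds the two data folders and the metadata file.
# The default metadata file name is an assumption for illustration.
check_input_layout <- function(input_dir,
                               freq_dir = "freq",
                               counts_dir = "counts",
                               meta_file = "meta-data.csv") {
  meta_path <- file.path(input_dir, meta_file)
  c(
    freq   = dir.exists(file.path(input_dir, freq_dir)),
    counts = dir.exists(file.path(input_dir, counts_dir)),
    # The metadata entry must be a regular file, not a directory.
    meta   = file.exists(meta_path) && !dir.exists(meta_path)
  )
}

# Demonstration on a throwaway directory.
demo_dir <- file.path(tempdir(), "intertads_demo")
dir.create(file.path(demo_dir, "freq"), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(demo_dir, "counts"), showWarnings = FALSE)
file.create(file.path(demo_dir, "meta-data.csv"))

print(check_input_layout(demo_dir))
```

Any entry reported as FALSE points to a folder or file that is missing or misnamed.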

The script lets you define different folder (or file) names. You can also choose a folder name for the output table and set the human genome build in use (accepted values are hg19 and hg38).
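
The genome option could be validated along these lines. This is a minimal sketch, not the logic in Data_Integration.R: the function name `txdb_package_for` is hypothetical, and it assumes that the chosen build selects one of the two TxDb annotation packages listed under Installation.

```r
# Hypothetical helper (not part of InterTADs): validate the genome build
# and return the name of the matching TxDb annotation package.
txdb_package_for <- function(genome) {
  accepted <- c("hg19", "hg38")
  # Reject anything that is not exactly one of the accepted values.
  if (!is.character(genome) || length(genome) != 1 || !genome %in% accepted) {
    stop("genome must be one of: ", paste(accepted, collapse = ", "))
  }
  paste0("TxDb.Hsapiens.UCSC.", genome, ".knownGene")
}

print(txdb_package_for("hg38"))
```

Failing early on an unsupported build avoids annotating the data against the wrong coordinates.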

Once all inputs are provided, run the script with:

source("Data_Integration.R")

TADiff

For the TADiff part, the paths to the input and output folders must be provided. A BED file containing information about the TADs is also required. To run the script:

source("TADiff.R")
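
To illustrate what the TAD BED file contributes, the sketch below reads the first three BED columns and looks up the domain that contains a given position. It is an illustration under assumptions, not the implementation in TADiff.R: the helpers `read_tad_bed` and `find_tad` and the `tad_id` labels are hypothetical, and the exact columns the script expects may differ.

```r
# Hypothetical helper (not part of InterTADs): read chrom, start and end
# from a tab-separated BED file without a header line.
read_tad_bed <- function(path) {
  bed <- read.table(path, sep = "\t", header = FALSE,
                    stringsAsFactors = FALSE, comment.char = "#")
  if (ncol(bed) < 3) stop("A BED file needs chrom, start and end columns")
  tads <- bed[, 1:3]
  names(tads) <- c("chrom", "start", "end")
  if (any(tads$start >= tads$end)) stop("Each TAD must have start < end")
  # Label each domain so events can later be grouped by TAD.
  tads$tad_id <- paste0("TAD", seq_len(nrow(tads)))
  tads
}

# Return the id of the TAD containing a 0-based position, or NA if none.
# BED intervals are half-open: start <= pos0 < end.
find_tad <- function(tads, chrom, pos0) {
  hit <- tads$chrom == chrom & tads$start <= pos0 & pos0 < tads$end
  if (any(hit)) tads$tad_id[which(hit)[1]] else NA_character_
}

# Demonstration with a small made-up BED file.
bed_path <- tempfile(fileext = ".bed")
writeLines(c("chr1\t0\t1000", "chr1\t1000\t2500", "chr2\t0\t500"), bed_path)
tads <- read_tad_bed(bed_path)
print(find_tad(tads, "chr1", 1200))
```

Grouping events from different omics layers by the domain they fall in is what allows them to be compared within the same TAD.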

Visualization

To visualize the results, provide the paths to the input and output data, then run:

source("Visualization.R")

Data

The proposed method was evaluated on data from chronic lymphocytic leukemia (DNA methylation and expression values). The datasets have been deposited in the ArrayExpress database at EMBL-EBI under the accession numbers E-MTAB-6955 and E-MTAB-6962, respectively.

Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

License

This project is licensed under the MIT License; see the LICENSE file for details.
