Tuan Trieu
Department of Computer Science
University of Missouri, Columbia
Email: tuantrieu@mail.missouri.edu
Jianlin Cheng, PhD
Department of Computer Science
University of Missouri, Columbia
Email: chengji@missouri.edu
- src: source code in Java
- input: sample input data
- output: all HierarchicalModeller experimental output data
- executable: download the latest .jar executable from https://github.com/BDM-Lab/Hierarchical3DGenome/releases
- miniMDS: output results and scripts generated for this method
To run the tool, type: java -jar HierarchicalModeller.jar chr_id resolution observed_contact_data normalized_contact_data domain_file output_folder
Example: java -jar HierarchicalModeller.jar 10 5000 input/chr10_5kb.RAWobserved input/chr10_5kb_gm12878_list.txt input/GSE63525_GM12878_primary+replicate_Arrowhead_domainlist_whole.txt output
Parameters:
- chr_id: chromosome ID, e.g., 1, 2, ...
- resolution: resolution in base pairs, e.g., 5000
- observed_contact_data: observed Hi-C contact file. Each line describes one contact as 3 numbers separated by a space: position_1 position_2 interaction_frequency (example: input/chr10_5kb.RAWobserved; can be downloaded from https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE63525)
- normalized_contact_data: normalized Hi-C contact file in the same 3-number format: position_1 position_2 interaction_frequency (example: input/chr10_5kb_gm12878_list.txt; the raw data can be downloaded from https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE63525 and then normalized)
- domain_file: file containing the domains identified by Juicer (example: input/GSE63525_GM12878_primary+replicate_Arrowhead_domainlist_whole.txt; available from https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE63525)
- output_folder: output folder
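
The contact file format described above (position_1 position_2 interaction_frequency) can be read with a few lines of Java. The following is only a minimal sketch for checking an input file before a run, not the parser used by HierarchicalModeller. The class name ContactFileSketch, its methods, and the sample values are hypothetical, and the bin-index conversion assumes that positions are bin start coordinates that are multiples of the resolution.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of HierarchicalModeller.
public class ContactFileSketch {

    /** One contact: two genomic positions and an interaction frequency. */
    public static final class Contact {
        final long pos1;
        final long pos2;
        final double frequency;

        Contact(long pos1, long pos2, double frequency) {
            this.pos1 = pos1;
            this.pos2 = pos2;
            this.frequency = frequency;
        }
    }

    /** Parses one line of the form "position_1 position_2 interaction_frequency". */
    static Contact parseLine(String line) {
        String[] fields = line.trim().split("\\s+");
        if (fields.length != 3) {
            throw new IllegalArgumentException("Expected 3 fields, got " + fields.length + ": " + line);
        }
        return new Contact(Long.parseLong(fields[0]),
                           Long.parseLong(fields[1]),
                           Double.parseDouble(fields[2]));
    }

    /** Reads all contacts from a reader, skipping blank lines. */
    static List<Contact> read(BufferedReader reader) throws IOException {
        List<Contact> contacts = new ArrayList<>();
        String line;
        while ((line = reader.readLine()) != null) {
            if (!line.trim().isEmpty()) {
                contacts.add(parseLine(line));
            }
        }
        return contacts;
    }

    /** Assumption: positions are bin start coordinates, so integer division gives the bin index. */
    static long toBin(long position, long resolution) {
        return position / resolution;
    }

    public static void main(String[] args) throws IOException {
        // Made-up sample lines in the 3-column format; not real Hi-C data.
        String sample = "0 0 120.0\n0 5000 35.0\n5000 10000 12.5\n";
        long resolution = 5000;
        for (Contact c : read(new BufferedReader(new StringReader(sample)))) {
            System.out.println(toBin(c.pos1, resolution) + " "
                    + toBin(c.pos2, resolution) + " " + c.frequency);
        }
    }
}
```

To check a real file, the StringReader in main can be replaced with a java.io.FileReader opened on the file path.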
Typically, the input is several GB in size, so the program requires a large amount of RAM. We ran our experiments on a server with 120 GB of RAM and 80 cores.
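
Because the JVM caps its heap at a maximum size, a large input can fail with an out-of-memory error even on a machine with plenty of RAM. The standard JVM option -Xmx (for example, java -Xmx100g -jar ..., where 100g is only an illustrative value) raises that cap. The sketch below shows one way to check the heap limit the JVM is actually running with. The class name HeapCheck and the 8 GiB threshold are hypothetical and are not taken from HierarchicalModeller.

```java
// Hypothetical helper for illustration; not part of HierarchicalModeller.
public class HeapCheck {

    /** Converts a byte count to GiB. */
    static double toGiB(long bytes) {
        return bytes / (1024.0 * 1024.0 * 1024.0);
    }

    /** Returns true if the required amount fits within the given heap limit. */
    static boolean fits(long requiredBytes, long maxHeapBytes) {
        return requiredBytes <= maxHeapBytes;
    }

    public static void main(String[] args) {
        // Maximum heap the JVM will attempt to use (controlled by -Xmx).
        long maxHeap = Runtime.getRuntime().maxMemory();

        // Assumed requirement, chosen only for this example.
        long required = 8L * 1024 * 1024 * 1024;

        System.out.printf("JVM max heap: %.1f GiB%n", toGiB(maxHeap));
        if (!fits(required, maxHeap)) {
            System.out.println("Heap limit is below the assumed requirement; consider raising -Xmx.");
        }
    }
}
```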
The executable software and the source code are distributed free of charge, as is, to any non-commercial user. The authors assume no liability for the performance of the program.