Pre Processing Data
dcHiC takes a sparse matrix paired with a bed index file (HiC-Pro format) as its default input. Other formats must be converted first; see below for .cool and .hic support.
Input Option | Meaning |
---|---|
-input | [required] Specify 'cool' for .cool/.mcool files, and 'hic' for .hic files |
-file | [required] Specify file path for .cool/.mcool/.hic file |
-res | [required] Specify resolution for analysis (e.g. '100000') |
-prefix | [required] Specify prefix of results. |
-genomeFile | [.cool only] Location of the chromosome size file. Edit this file to remove chromosomes from the analysis. |
-removeChr | [.hic only; optional] Remove chromosomes by specifying in "A,B,C" format. Commonly used for chromosome Y. |
To process .cool files, dcHiC uses the cooler dump feature to obtain the sparse matrix and uses the provided -genomeFile to produce a corresponding bed index file. It accepts .mcool and .cool files. The -genomeFile should be a tab-separated list of chromosomes with their associated sizes.
python preprocess.py -input cool -file coolfile.mcool -genomeFile mm10sizes.txt -res 100000 -prefix coolfile
NOTE: If your .cool/.mcool file only covers certain chromosome(s), edit the -genomeFile so that it lists only those chromosomes.
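To make the bed index step more concrete, here is a minimal sketch of how a tab-separated chromosome size file could be turned into fixed-size bins at a chosen resolution. It is not the code in preprocess.py: the function names `read_chrom_sizes` and `build_bed_index` are hypothetical, the example sizes are made up, and the 1-based bin index is an assumption based on the usual HiC-Pro convention.

```python
import io


def read_chrom_sizes(handle):
    """Parse a tab-separated chromosome size file into (name, size) pairs."""
    sizes = []
    for line in handle:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, size = line.split("\t")[:2]
        sizes.append((name, int(size)))
    return sizes


def build_bed_index(sizes, res):
    """Return (chrom, start, end, index) rows for fixed-size bins.

    The last bin of each chromosome is truncated at the chromosome end.
    """
    rows = []
    index = 1  # assumption: bin indices start at 1
    for name, size in sizes:
        for start in range(0, size, res):
            end = min(start + res, size)
            rows.append((name, start, end, index))
            index += 1
    return rows


# Made-up sizes for illustration; a real -genomeFile would list real lengths.
example = io.StringIO("chr1\t250000\nchr2\t100000\n")
sizes = read_chrom_sizes(example)
bed_rows = build_bed_index(sizes, 100000)
for row in bed_rows:
    print("\t".join(str(value) for value in row))
```

Removing a line from the size file removes that chromosome's bins from the index, which is why editing the -genomeFile restricts the analysis.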
To process .hic files, dcHiC uses the straw library. The pre-processing script outputs the sparse matrix input necessary for dcHiC. First, make sure you have hic-straw installed in your environment. If you wish to use all chromosomes, omit the -removeChr argument.
python preprocess.py -input hic -file HiCFile.hic -res 1000000 -prefix hicfile -removeChr 2,3,4
Note: At the moment, this only accepts cis (intra-chromosomal) interactions.
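The effect of -removeChr together with the cis-only restriction can be illustrated with a short sketch. This is an assumption about the behavior, not the actual preprocess.py logic: `parse_remove_chr` and `keep_cis_contacts` are hypothetical names, the contact tuples are invented, and chromosome names are matched exactly (a real .hic file may name chromosomes "chr2" rather than "2").

```python
def parse_remove_chr(arg):
    """Split an "A,B,C" string into a set of chromosome names."""
    if not arg:
        return set()
    return {name.strip() for name in arg.split(",") if name.strip()}


def keep_cis_contacts(contacts, removed):
    """Keep contacts whose two ends lie on the same, non-removed chromosome.

    Each contact is (chrom1, pos1, chrom2, pos2, count).
    """
    return [
        contact
        for contact in contacts
        if contact[0] == contact[2] and contact[0] not in removed
    ]


removed = parse_remove_chr("2,3,4")

# Invented contacts for illustration only.
contacts = [
    ("1", 0, "1", 1000000, 12.0),  # cis, kept
    ("1", 0, "2", 0, 3.0),         # trans, dropped
    ("2", 0, "2", 1000000, 7.0),   # cis, but chromosome 2 is removed
    ("5", 0, "5", 2000000, 4.0),   # cis, kept
]
kept = keep_cis_contacts(contacts, removed)
print(kept)
```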