Input JSON

An input JSON file contains all input parameters and metadata needed to run the pipeline. Items 1) through 5) are mandatory. Item 6) is optional; the pipeline uses default values if it is not defined.

Mandatory

Input FASTQ file pairs.
Reference genome bwa index.
Reference genome chromosome sizes.
Restriction site locations in the reference genome sequence.
Name of the restriction enzyme.

Optional

Pipeline parameters.
Boolean flag no_call_loops to skip loop calling with HiCCUPS. To set this flag, add the following parameter to the input JSON: "hic.no_call_loops": true
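
If you prefer to set the flag programmatically rather than by hand, a minimal Python sketch follows. The helper name set_no_call_loops is hypothetical and not part of the pipeline; only the key "hic.no_call_loops" comes from the text above. It assumes the input JSON is already valid JSON, with no comments.

```python
import json


def set_no_call_loops(input_json_text: str, skip: bool = True) -> str:
    """Return the input JSON text with "hic.no_call_loops" set to `skip`.

    Hypothetical helper for illustration; it is not shipped with the pipeline.
    """
    params = json.loads(input_json_text)
    params["hic.no_call_loops"] = skip
    return json.dumps(params, indent=4)


# Minimal example input; a real input JSON also needs the mandatory items.
example = '{"hic.restriction_enzyme": "MboI"}'
updated = set_no_call_loops(example)
print(updated)
```

All other parameters in the file are left unchanged; only the one key is added or overwritten.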

Templates

We provide three template JSON files, covering a single library with one or more sequencing runs and multiple libraries:

template for a single sequencing run from a single library
template for two sequencing runs from a single library
template for two libraries, each having a single sequencing run

Let us take a closer look at the following template JSON. Comments are not allowed in a JSON file, but we have added some here to explain each parameter.

{
    ////////// 1) Input FASTQ files //////////
    "hic.fastq": [[[
        "test/test_data/merged_read1.fastq.gz",
        "test/test_data/merged_read2.fastq.gz"
    ]]],

    ////////// 2) Reference genome chromosome sizes //////////
    "hic.chrsz": "test/test_data/ce10_selected.chrom.sizes.tsv",
    
    ////////// 3) Restriction sites locations in the reference genome sequence //////////
    "hic.restriction_sites": "test/test_data/ce10_selected_MboI.txt",

    ////////// 4) Reference genome index //////////
    "hic.reference_index": "test/test_data/ce10_selected.tar.gz",
    
    ////////// 5) Restriction enzyme name //////////
    "hic.restriction_enzyme": "MboI"

}
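
Because the annotated template above is not valid JSON, it can help to strip the comment lines and confirm that the five mandatory keys are present before launching a run. The following is a minimal sketch, assuming comments always occupy whole lines beginning with //, as in the template. The function names and the shortened file names are hypothetical; the key names are taken from the template.

```python
import json

# Mandatory keys as they appear in the template above.
MANDATORY_KEYS = [
    "hic.fastq",
    "hic.chrsz",
    "hic.restriction_sites",
    "hic.reference_index",
    "hic.restriction_enzyme",
]


def load_commented_template(text: str) -> dict:
    """Parse template text after dropping whole-line // comments."""
    kept = [
        line for line in text.splitlines()
        if not line.strip().startswith("//")
    ]
    return json.loads("\n".join(kept))


def missing_mandatory_keys(params: dict) -> list:
    """Return the mandatory keys that are absent from the parsed input."""
    return [key for key in MANDATORY_KEYS if key not in params]


# Deliberately incomplete example with placeholder file names.
template = """
{
    ////////// 1) Input FASTQ files //////////
    "hic.fastq": [[["read1.fastq.gz", "read2.fastq.gz"]]],
    ////////// 5) Restriction enzyme //////////
    "hic.restriction_enzyme": "MboI"
}
"""
params = load_commented_template(template)
print(missing_mandatory_keys(params))
```

The check only looks at key names. It does not verify that the referenced files exist or that their contents are correct.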

Reference genome

To run the HiC pipeline, you need to specify a bwa index file prepared from a reference genome sequence. We recommend using reference files from the ENCODE portal to ensure comparability of the analysis results.

reference file description	assembly	ENCODE portal link
bwa index	hg19	link
genome fasta	hg19	link
chromosome sizes	hg19	link
bwa index	GRCh38	link
genome fasta	GRCh38	link
chromosome sizes	GRCh38	link

You will also need a restriction map file appropriate for the restriction enzyme and assembly. MboI and DpnII share the same restriction map because they have the same recognition site.

restriction enzymes	assembly	ENCODE portal link
DpnII, MboI	GRCh38	link
HindIII	GRCh38	link
DpnII, MboI	hg19	link
HindIII	hg19	link

Alternatively, you can create your own restriction map file using the generate_site_positions.py script from the Juicer pipeline. Make sure that your restriction map has the following format:

1 11160 12411 12461 ... 249250621
2 11514 11874 12160 ... 243199373
3 60138 60662 60788 ... 198022430

Other formats can lead to problems with the HiCCUPS step of the pipeline.
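
A quick sanity check of a restriction map line could look like the sketch below. It assumes, based only on the example above, that each line is a chromosome name followed by whitespace-separated integer positions in non-decreasing order. The function name is hypothetical, and the check is not the pipeline's own validation. The "..." in the example above stands for omitted positions and is not part of the file format.

```python
def check_restriction_map_line(line: str) -> bool:
    """Return True if `line` looks like: <chromosome> <pos> <pos> ...

    Assumed format (inferred from the example, not an official spec):
    whitespace-separated fields, first field is the chromosome name,
    remaining fields are non-negative integers in non-decreasing order.
    """
    fields = line.split()
    if len(fields) < 2:
        return False  # need a chromosome name and at least one position
    positions = fields[1:]
    if not all(p.isascii() and p.isdigit() for p in positions):
        return False  # rejects commas, signs, decimals, and "..."
    values = [int(p) for p in positions]
    return all(a <= b for a, b in zip(values, values[1:]))


# First positions of chromosome 1 from the example above (truncated).
print(check_restriction_map_line("1 11160 12411 12461"))
# A comma-separated line does not match the assumed format.
print(check_restriction_map_line("1 11160,12411,12461"))
```

To check a whole file, apply the function to every non-empty line and report the first line that fails.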
