Parameters of Config File

The typical command for running the pipeline is as follows:

nextflow path/to/m6APipe/main.nf -c path/to/m6APipe/nextflow.config \
    --readPaths path/to/datapath \
    --designfile path/to/designfile \
    --comparefile path/to/comparefile \
    -profile standard[,docker] \
    -resume

Any parameter listed below can be passed on the command line in the same way, prefixed with two dashes, for example "--fasta path/to/fasta_file".
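The command line above can also be assembled from a dictionary of parameters, which is convenient when launching several runs. The following is a minimal sketch, not part of m6APipe: the function name build_command and its arguments are hypothetical, and it assumes every pipeline parameter is passed as "--name value" exactly as in the command above.

```python
import shlex


def build_command(pipeline_dir, params, profile="standard", resume=True):
    """Assemble the nextflow command shown above as a list of arguments.

    pipeline_dir, params, profile and resume are illustrative names,
    not options defined by m6APipe itself.
    """
    cmd = ["nextflow", f"{pipeline_dir}/main.nf",
           "-c", f"{pipeline_dir}/nextflow.config"]
    for name, value in params.items():
        # Booleans are written in lower case, as in "--gzip true".
        if isinstance(value, bool):
            value = "true" if value else "false"
        cmd += [f"--{name}", str(value)]
    cmd += ["-profile", profile]
    if resume:
        cmd.append("-resume")
    return cmd


if __name__ == "__main__":
    command = build_command(
        "path/to/m6APipe",
        {
            "readPaths": "path/to/datapath",
            "designfile": "path/to/designfile",
            "comparefile": "path/to/comparefile",
            "gzip": True,
        },
        profile="standard,docker",
    )
    # shlex.join quotes arguments so the line can be pasted into a shell.
    print(shlex.join(command))
```

The list form can be handed to subprocess.run directly, which avoids quoting problems with paths that contain spaces.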

Detail of Parameters

Mandatory arguments
readPaths	Path to input data (must be surrounded with quotes)
fasta	Path to genome sequence file ( .fa )
gtf	Path to genome annotation file ( .gtf )
designfile	Path to file of designfile ( format: Sample_ID,input_FileName,ip_FileName,Group )
comparefile	Path to file of comparefile ( format: A_vs_B )
aligners	star/bwa/tophat2/hisat2/none (must be surrounded with quotes); 'none' means the input files are BAM files and alignment is skipped
peakCalling_mode	"group" OR "independence" ( "group" means the samples include biological replicates for peak calling, while "independence" means there are no biological replicates )
peakMerged_mode	"mspc" OR "rank" OR "macs2" OR "MATK" OR "metpeak"
methylation_analysis_mode	"MATK" OR "QNB" OR "Wilcox-test" OR "MeTDiff" OR "edgeR" OR "DESeq2"
expression_analysis_mode	"DESeq2" OR "edgeR" OR "none"
gzip	Boolean value ( true/false ): true means your data is gzip compressed ( .gz ), false means it is not
singleEnd	Specifies that the input is single-end reads
stranded	Specifies whether the input is strand specific ( yes/no/reverse ), default is 'no'
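Several of the arguments above accept only a fixed set of values, so a typo can be caught before the pipeline starts. The following is a minimal sketch of such a check; the allowed values are copied from the table above, while the names ALLOWED and check_params are hypothetical and not part of m6APipe.

```python
# Allowed values copied from the parameter table above.
ALLOWED = {
    "aligners": {"star", "bwa", "tophat2", "hisat2", "none"},
    "peakCalling_mode": {"group", "independence"},
    "peakMerged_mode": {"mspc", "rank", "macs2", "MATK", "metpeak"},
    "methylation_analysis_mode": {
        "MATK", "QNB", "Wilcox-test", "MeTDiff", "edgeR", "DESeq2",
    },
    "expression_analysis_mode": {"DESeq2", "edgeR", "none"},
    "stranded": {"yes", "no", "reverse"},
}


def check_params(params):
    """Return a list of problems; an empty list means none were found.

    Only parameters present in `params` are checked, and the comparison
    is case sensitive (an assumption; the pipeline may be more lenient).
    """
    problems = []
    for name, allowed in ALLOWED.items():
        if name in params and params[name] not in allowed:
            problems.append(
                f"{name}={params[name]!r} is not one of {sorted(allowed)}"
            )
    return problems


if __name__ == "__main__":
    example = {"aligners": "star", "peakCalling_mode": "groups"}
    for problem in check_params(example):
        print(problem)
```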

Optional parameters
tophat2_index	Path to tophat2 index, e.g. "path/to/Tophat2Index/*"
hisat2_index	Path to hisat2 index, e.g. "path/to/Hisat2Index/*"
bwa_index	Path to bwa index, e.g. "path/to/BwaIndex/*"
star_index	Path to star index, e.g. "path/to/StarIndex/"
matk_jar	Path to the jar file of MATK, e.g. "path/to/MATK-1.0.jar"

Other parameters
outdir	The output directory where the results will be saved, default = $baseDir/results
email	Set this parameter to your e-mail address to get a summary e-mail with details of the run sent to you when the workflow exits
skip_qc	Skip all QC steps
skip_expression	Skip all differential expression analysis steps
skip_peakCalling	Skip all Peak Calling steps
skip_diffpeakCalling	Skip all differential methylation analysis steps
skip_fastqc	Skip FastQC
skip_fastp	Skip Fastp
skip_rseqc	Skip RSeQC
skip_createbedgraph	Skip generating bigwig file for genebody coverage
skip_metpeak	Skip MeTPeak's process of Peak Calling steps
skip_macs2	Skip MACS2's process of Peak Calling steps
skip_matk	Skip MATK's process of Peak Calling steps
skip_meyer	Skip Meyer's process of Peak Calling steps
skip_cufflinks	Skip the cufflinks process of differential expression analysis steps
skip_edger	Skip the edgeR process of differential expression analysis steps
skip_deseq2	Skip the DESeq2 process of differential expression analysis steps

Designfile

We recommend editing nextflow.config to define "readPaths", "designfile", "comparefile", "aligners" and the corresponding alignment index. The designfile is a comma-separated (.csv) file laid out like the following table. See designfile_test.csv for an example.

Sample_ID	input_FileName	ip_FileName	Group
H1A_Endo	A	B	group_Endo
H1A_ES	C	D	group_ES
H1B_Endo	E	F	group_Endo
H1B_ES	G	H	group_ES

Tips
1. A, B, C, ... stand for the file names of the data, e.g. A for A.fastq.gz.
2. If your data files have the .fastq.gz suffix, add the gzip parameter: "--gzip true".
3. Write file names without the suffix: if your data file is "Hela_cell_input.fastq.gz", write its file name as "Hela_cell_input".
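A designfile can be checked before a run with a few lines of Python. The following is a minimal sketch, assuming the first line of the file is the header shown in the table and that every following line has exactly four comma-separated fields. The function names and the list of suffixes to strip are hypothetical and not part of m6APipe.

```python
import csv
import io

EXPECTED_HEADER = ["Sample_ID", "input_FileName", "ip_FileName", "Group"]
# Suffixes removed from file names, following tip 3 (an illustrative list).
SUFFIXES = (".fastq.gz", ".fastq", ".bam")


def strip_suffix(name):
    """Return the file name without a known data suffix."""
    for suffix in SUFFIXES:
        if name.endswith(suffix):
            return name[: -len(suffix)]
    return name


def read_designfile(handle):
    """Parse a designfile from an open text handle into a list of dicts."""
    reader = csv.reader(handle)
    header = next(reader, None)
    if header is None or [cell.strip() for cell in header] != EXPECTED_HEADER:
        raise ValueError(f"expected header {','.join(EXPECTED_HEADER)}")
    rows = []
    for line_number, cells in enumerate(reader, start=2):
        if not cells:
            continue  # ignore blank lines
        if len(cells) != len(EXPECTED_HEADER):
            raise ValueError(f"line {line_number}: expected 4 fields")
        row = dict(zip(EXPECTED_HEADER, (cell.strip() for cell in cells)))
        row["input_FileName"] = strip_suffix(row["input_FileName"])
        row["ip_FileName"] = strip_suffix(row["ip_FileName"])
        rows.append(row)
    return rows


if __name__ == "__main__":
    text = (
        "Sample_ID,input_FileName,ip_FileName,Group\n"
        "H1A_Endo,A,B,group_Endo\n"
        "H1A_ES,C,D,group_ES\n"
    )
    for row in read_designfile(io.StringIO(text)):
        print(row)
```

For a file on disk, open it with open(path, newline="") and pass the handle to read_designfile.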

Sequence Data(.fastq)

If your data is paired-end, the files must be named like "A_R1.fastq[.gz]" and "A_R2.fastq[.gz]", or "A_1.fastq[.gz]" and "A_2.fastq[.gz]" ( A is the file name given as input_FileName or ip_FileName in the table above ). For example, with the designfile above, the paired-end data files for A and B are named as follows.

# paired-end data file names
A_R1.fastq 
A_R2.fastq 
B_R1.fastq
B_R2.fastq
------  OR  -------
A_1.fastq 
A_2.fastq 
B_1.fastq
B_2.fastq

Single-end data files are named as follows.

# single-end data file names
A.fastq 
B.fastq
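The naming rules above can be written down as a small helper that lists the file names expected for one designfile entry. This is a minimal sketch: the names expected_fastq_names and find_reads are hypothetical, and the code only illustrates the naming convention described here, not how the pipeline itself locates files.

```python
def expected_fastq_names(base, single_end, gzip=False):
    """Return the accepted file-name sets for one designfile entry.

    Each inner list is one complete alternative: a single file for
    single-end data, or an R1/R2 (or 1/2) pair for paired-end data.
    """
    ext = ".fastq.gz" if gzip else ".fastq"
    if single_end:
        return [[base + ext]]
    return [
        [f"{base}_R1{ext}", f"{base}_R2{ext}"],
        [f"{base}_1{ext}", f"{base}_2{ext}"],
    ]


def find_reads(available, base, single_end, gzip=False):
    """Return the first alternative fully present in `available`, else None."""
    present = set(available)
    for names in expected_fastq_names(base, single_end, gzip):
        if all(name in present for name in names):
            return names
    return None


if __name__ == "__main__":
    files = ["A_1.fastq", "A_2.fastq", "B_R1.fastq"]
    print(find_reads(files, "A", single_end=False))
    print(find_reads(files, "B", single_end=False))
```

In the usage example, A resolves to the "_1"/"_2" pair, while B yields None because its second mate is missing.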

Aligned Data(.bam)

BAM data files are named as follows.

# bam data file names
A.bam 
B.bam

Comparefile

The comparefile names two groups joined by "_vs_", as in the following text. See the file comparefile.txt for an example.

group_Endo_vs_group_ES
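Because group names may themselves contain underscores, a comparison has to be split on the full "_vs_" separator rather than on "_". The following is a minimal sketch of that split plus a check against the Group column of the designfile. The function names are hypothetical, and the sketch assumes a group name never contains "_vs_" itself.

```python
def parse_comparison(text):
    """Split 'A_vs_B' into the pair ('A', 'B')."""
    parts = text.strip().split("_vs_")
    if len(parts) != 2 or not all(parts):
        raise ValueError(f"expected the form A_vs_B, got {text!r}")
    return parts[0], parts[1]


def unknown_groups(comparison, design_groups):
    """Return the groups in `comparison` that are absent from the designfile."""
    known = set(design_groups)
    return [group for group in parse_comparison(comparison) if group not in known]


if __name__ == "__main__":
    groups = ["group_Endo", "group_ES", "group_Endo", "group_ES"]
    print(parse_comparison("group_Endo_vs_group_ES"))
    print(unknown_groups("group_Endo_vs_group_Meso", groups))
```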

2018-2019 Center for Bioinformatics, Sun Yat-sen University Cancer Center
