Both '--read_length' and '--macs_gsize' not specified! Please specify either to infer MACS2 genome size for peak calling.
As you mention in your pipeline's documentation (https://nf-co.re/chipseq/2.0.0/parameters), --read_length is necessary for peak calling if --macs_gsize is not provided. However, the --macs_gsize explanation in the same documentation says that it is not necessary to provide --macs_gsize as long as --genome has been provided.
Having said that, I was wondering whether the documentation has not been updated yet because the --read_length parameter was only added in the latest release. If that is the case, I'd really appreciate it if you could indicate where we can find the --read_length and --macs_gsize values for several genomes so that we are able to run the pipeline. If it is not the case, I'd really appreciate it if you could check the code to find out what is going on.
Thank you,
Ariadna
Command used and terminal output
`./nextflow run nf-core/chipseq --single_end --input ./design.csv --outdir ./Results_chipseq --genome GRCh38 -profile singularity --narrow_peak`
Both '--read_length' and '--macs_gsize' not specified! Please specify either to infer MACS2 genome size for peak calling.
Relevant files
No response
System information
N E X T F L O W ~ version 22.04.3
nf-core/chipseq v2.0.0
Hi @ariadnaaterrades
This is the intended behavior. You need to either provide the macs gsize explicitly using the --macs_gsize parameter or provide the length of your reads using the --read_length parameter. When the latter is set together with a genome available in the igenomes config, the macs gsize is retrieved from the corresponding map. The reason is that the effective genome size is different for different read lengths. If the genome is not available in the igenomes config, the pipeline calculates the macs gsize using the unique-kmers.py script of khmer, but for this we again need to know the size of the reads, which is set by --read_length. We discussed setting a default read length, but we were afraid that some users would then just use the default value and not be aware of the behavior discussed above. Does it make sense to you now?
In any case, we should probably improve the documentation regarding this behavior.
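To make the order of the checks easier to follow, here is a rough Python sketch of the decision described above. It is not the pipeline's actual code (which is written in Nextflow): the function name `resolve_macs_gsize`, the table `HYPOTHETICAL_GSIZE_MAP`, the genome key `GENOME_A`, and the `estimate` callback standing in for the khmer unique-kmers.py step are all made up for illustration, and the sizes in the table are placeholders rather than real values. Falling back to the estimator when a genome is listed but the read length is not is also an assumption.

```python
from typing import Callable, Mapping, Optional

# Placeholder table: genome -> {read length -> genome size for MACS2}.
# The numbers are invented for illustration; they are NOT the pipeline's values.
HYPOTHETICAL_GSIZE_MAP: Mapping[str, Mapping[int, int]] = {
    "GENOME_A": {50: 1_000, 100: 1_100},
}


def resolve_macs_gsize(
    macs_gsize: Optional[int],
    read_length: Optional[int],
    genome: Optional[str],
    gsize_map: Mapping[str, Mapping[int, int]] = HYPOTHETICAL_GSIZE_MAP,
    estimate: Optional[Callable[[int], int]] = None,
) -> int:
    """Pick the genome size in the order described in the reply above."""
    # 1. An explicit --macs_gsize always wins.
    if macs_gsize is not None:
        return macs_gsize

    # 2. Without it, --read_length is mandatory: both remaining routes need it.
    if read_length is None:
        raise ValueError(
            "Both '--read_length' and '--macs_gsize' not specified! "
            "Please specify either to infer MACS2 genome size for peak calling."
        )

    # 3. Known genome: look the size up by read length.
    per_genome = gsize_map.get(genome) if genome is not None else None
    if per_genome is not None and read_length in per_genome:
        return per_genome[read_length]

    # 4. Otherwise estimate it from the reference (stand-in for unique-kmers.py).
    if estimate is None:
        raise ValueError("No lookup entry and no estimator available.")
    return estimate(read_length)
```

Applied to the command in the report, this means --genome GRCh38 alone stops at step 2; adding either --read_length or --macs_gsize to the command lets the run proceed.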
Description of the bug
Hi,
I've tried to run the pipeline and I've ended up having an error which didn't show up the last time I ran it: