Cannot add .gz; skipping #97

u2058152 · 2021-10-14T14:43:38Z

I am working with data from Wheat, where the chromosomes are very large, and am using ASCIIGenome to look at peak output from MACS3. I want to add two files, the narrowPeak and summits files. When I attempt this using the following command:
ASCIIGenome -fa Wheat.v21.fa Control.bam Wheat.v21.bam narrowPeak narrowPeak.gz
I get the following message:
Warning: 37457. Skipping:

Cannot add narrowPeak.gz; skipping

I think this error has to do with the size of the files. I compressed the files using tabix, as in the ASCIIGenome instructions, but I still get the same error.

dariober · 2021-10-15T16:15:36Z

Hi, thanks for reporting the issue. I'm pretty sure this is due to the tabix index failing on chromosomes longer than 512 Mbp (2^29 bp).
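To see which sequences are affected, one could scan the FASTA index for chromosomes longer than 2^29 (536,870,912) bp, the largest coordinate a .tbi index can address. This is a minimal sketch, not part of ASCIIGenome: the function name `too_long_for_tbi` is made up for illustration, and it assumes a standard `samtools faidx` index with the sequence name in column 1 and its length in column 2.

```shell
# Hypothetical helper: list sequences longer than the 2^29 bp that a
# .tbi (or .bai) index can address. Input is a FASTA index (.fai).
too_long_for_tbi() {
    awk -F'\t' -v max=536870912 '$2 > max { print $1 "\t" $2 }' "$1"
}

# Example (file name as used elsewhere in this thread):
#   too_long_for_tbi genome.fasta.fai
```

Any sequence printed by this check needs a CSI index (or the BAM workaround) rather than a regular tabix index.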

Instead, one should use CSI indexes for both BAM and interval files (I guess this is what you have for your BAM files?). Unfortunately, I'm not sure htsjdk supports CSI indexes for BED files, although it does for BAM files (and I think IGV started supporting CSI for interval files only recently, if that makes me feel any better...).

As a hack, you could convert your narrowPeak files to BAM and load those instead, e.g. using bedtools:

bedToBam -i test.narrowPeak -g genome.fasta.fai | samtools sort > test.narrowPeak.bam
samtools index -c test.narrowPeak.bam

If you want to carry additional information over as SAM tags, you could use the following (check it's ok!):

# Prepare header
bedToBam -i test.narrowPeak -g genome.fasta.fai | samtools view -H > test.narrowPeak.hdr

# Output reads/peaks
bedToBam -i test.narrowPeak -g genome.fasta.fai | samtools view > test.narrowPeak.txt

# Prepare tags (ep: end position; sc: score; fc: fold-change; pv: pValue; qv: qValue; sm: peak summit offset from start)
awk -v OFS='\t' '{print "ep:i:"$3, "sc:f:"$5, "fc:f:"$7, "pv:f:"$8, "qv:f:"$9, "sm:i:"$10}' test.narrowPeak > test.narrowPeak.tags

# Combine header, reads, and tags, then sort and index (the <(...) syntax requires bash)
cat test.narrowPeak.hdr <(paste test.narrowPeak.txt test.narrowPeak.tags) | samtools sort > test.narrowPeak.bam
samtools index -c test.narrowPeak.bam
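Because `paste` pairs the two files line by line, the tags are only correct if `test.narrowPeak.txt` and `test.narrowPeak` list the peaks in the same order. A rough sanity check could look like the sketch below. It assumes bedToBam writes one record per input line with POS equal to the BED start plus one, and that the narrowPeak file has no header or track line; the function name `check_peak_order` is hypothetical.

```shell
# Hypothetical helper: confirm that the SAM records (no header) and the
# narrowPeak lines describe the same intervals in the same order.
#   $1: SAM records, e.g. test.narrowPeak.txt
#   $2: narrowPeak file, e.g. test.narrowPeak
check_peak_order() {
    awk -F'\t' '
        # First file given to awk (narrowPeak): remember chrom and 1-based start
        FILENAME == ARGV[1] { chrom[FNR] = $1; pos[FNR] = $2 + 1; n = FNR; next }
        # Second file (SAM): RNAME is column 3, POS is column 4
        { m = FNR }
        ($3 "") != (chrom[FNR] "") || $4 != pos[FNR] {
            print "Mismatch at record " FNR
            bad = 1
        }
        END {
            if (n + 0 != m + 0) {
                print "Record counts differ: " (n + 0) " vs " (m + 0)
                bad = 1
            }
            exit (bad + 0)
        }
    ' "$2" "$1"
}

# Example:
#   check_peak_order test.narrowPeak.txt test.narrowPeak && echo "order ok"
```

The function returns a non-zero exit status on the first kind of disagreement it finds, so it can guard the `paste` step in a script.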

You would have to do the same for the annotation file. If you go down this route, I would recommend using the latest versions of samtools and bedtools.
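The basic two-step conversion shown earlier could be wrapped in a small function so it can be reused for the annotation file. This is only a sketch: the function name `bed_to_csi_bam` and the output naming are assumptions, and how well bedToBam handles a given annotation format should be checked on a small example first.

```shell
# Hypothetical wrapper around the bedToBam + samtools steps.
#   $1: interval file (e.g. a narrowPeak or BED annotation file)
#   $2: FASTA index used as the bedtools genome file (e.g. genome.fasta.fai)
# Writes <input>.bam and, via "samtools index -c", a CSI index for it.
bed_to_csi_bam() {
    bedToBam -i "$1" -g "$2" | samtools sort -o "$1.bam" - &&
        samtools index -c "$1.bam"
}

# Example:
#   bed_to_csi_bam test.narrowPeak genome.fasta.fai
```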

Hope this helps - I'm sorry it's only a temporary solution!
