Splitrim returns 0 reads #18

Open
TakacsBertalan opened this issue Jul 21, 2023 · 1 comment

Hi!
I am trying to run GOTTCHA on a CAMI dataset (toy human microbiome), and every run ends with split-trimming returning 0 reads, as shown below.
$gottcha_new/bin/gottcha.pl --threads 11 --outdir gottcha_new/sajat_teszt --input /media/deltagene/Microbiome/CAMI_data/gastrooral_dir/sunbeam_output/qc/00_samples/sample2_anonymous_reads.fq --database gottcha/database/GOTTCHA_BACTERIA_c4937_k24_u30_xHUMAN3x.species
[00:00:00] Starting GOTTCHA v1.0c
[00:00:00] Auto set database level to SPECIES.
[00:00:00] Number of threads: 11
[00:00:00] Checking running environment...
[00:00:00] Done. All required scripts and tools found.
[00:00:00] Split-trimming with parameters fixL=30, minQ=20, ascii=33.
[00:00:00] Split-trimming: /media/deltagene/Microbiome/CAMI_data/gastrooral_dir/sunbeam_output/qc/00_samples/sample2_anonymous_reads.fq...
[00:03:26] Done splitrimming /media/deltagene/Microbiome/CAMI_data/gastrooral_dir/sunbeam_output/qc/00_samples/sample2_anonymous_reads.fq.
[00:03:26] Done merging splitrim stats.

                                RAW         SPLIT-TRIMMED  
                                ===         =============  
  # of Reads:            33,332,582                     0  (0.00 %)
  # of Bases:         4,999,887,300                     0  (0.00 %)
  Mean Read Length:             150                     0  (0.00 %)

[00:03:26] Mapping split-trimmed reads to GOTTCHA database and profiling...

                                   RAW         SPLIT-TRIMMED
                         =============         =============
  # of Processed Reads:     33,332,582                     0
  # of Mapped Reads:                 0                     0  (genome)
  # of Mapped Reads:                 0                     0  (plasmid only)
  # of Unmapped Reads:      33,332,582                     0

[00:05:22] Done profiling mapping results.

0 taxanomy(ies) found.

[00:05:22] No read mapped to species-level signatures. Please try again with upper-level databases.

Running the same command on different samples (not from CAMI) always results in at least some split trimmed reads. What could be the problem here?
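One way to check whether the quality strings alone could explain an empty result is sketched below in Python. It assumes that split-trimming keeps only runs of consecutive bases whose quality is at least minQ and whose length is at least fixL. That is a reading of the parameter names in the log above, not something confirmed against splitrim's source, and the function names `passing_runs` and `survives` are hypothetical.

```python
def passing_runs(qual, min_q=20, ascii_offset=33):
    """Return the lengths of maximal runs of bases with quality >= min_q.

    ASSUMPTION: this models split-trimming as cutting a read at every
    base below min_q; splitrim's real behaviour may differ.
    """
    runs, current = [], 0
    for ch in qual:
        if ord(ch) - ascii_offset >= min_q:
            current += 1
        else:
            if current:
                runs.append(current)
            current = 0
    if current:
        runs.append(current)
    return runs


def survives(qual, fix_l=30, min_q=20, ascii_offset=33):
    """True if at least one passing run is fix_l bases or longer."""
    return any(run >= fix_l for run in passing_runs(qual, min_q, ascii_offset))
```

Calling `survives(quality_line)` on the fourth line of each record gives a rough count of reads that would yield at least one fragment under that assumption. If most reads pass here but splitrim still reports 0, the quality values themselves are probably not the cause.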

Thanks,
Bertalan Takács


TakacsBertalan commented Aug 2, 2023

UPDATE:

This seems to be an issue with multiple parts. First, the CAMI samples were interleaved (read 1 and read 2 in the same file), and GOTTCHA doesn't seem to handle that. After deinterleaving the samples, I could run GOTTCHA on 15 of my 20 samples. For the remaining 5 samples, splitrim again returned 0 reads, and each time I received the same error message as in issue #5.
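For anyone who runs into the interleaving problem, here is a minimal Python sketch of a deinterleaving step. It is not necessarily how the samples above were processed. It assumes strict 4-line FASTQ records with mates alternating read 1, read 2, and the function name `deinterleave` is hypothetical (not part of GOTTCHA).

```python
from itertools import islice


def deinterleave(lines):
    """Split interleaved FASTQ lines into (read1_lines, read2_lines).

    ASSUMPTION: every record is exactly 4 lines and records alternate
    read 1, read 2, read 1, read 2, ...
    """
    it = iter(lines)
    read1, read2 = [], []
    targets = (read1, read2)
    count = 0
    while True:
        record = list(islice(it, 4))
        if not record:
            break
        if len(record) != 4:
            raise ValueError("truncated FASTQ record number %d" % (count + 1))
        targets[count % 2].extend(record)
        count += 1
    if count % 2:
        raise ValueError("odd number of records; input is not cleanly interleaved")
    return read1, read2
```

The function works on any iterable of lines, so an open file handle can be passed directly and the two returned lists written out with `writelines`. For files of this size (tens of millions of reads), a streaming version that writes each record as it is read would use far less memory.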

I tried running splitrim separately, and each run produced the same message, except when I changed the --minQ parameter. That resulted in the following error:
----> ENTRY HEADER:@dummy:1:DUMMY_FC:1:1:1:1 1:Y:0:A
Threads: 1 (effective) 1 (requested)
IDX = 0: reading from 0 to 5760994150
core.exception.ArrayIndexError@splitrim.d(758): index [150] is out of bounds for array of length 150

??:? onArrayIndexError [0x55f076006b1e]
??:? _d_arraybounds_indexp [0x55f075fe4dab]
??:? void splitrim.trimEntry(splitrim.inputOptions, in immutable(char)[], in immutable(char)[], in immutable(char)[], in immutable(char)[], ref ulong, ref ulong, ref ulong[ushort], ref ulong, ref immutable(char)[]) [0x55f075f669a7]
??:? void splitrim.parseFASTQ(in ulong, in ulong, ref splitrim.inputOptions, in ulong, std.stdio.File) [0x55f075f662a3]
??:? _Dmain [0x55f075f68919]

As a sanity check, I looked at the first few reads of my samples, and they seem to be in order:
"@dummy:1:DUMMY_FC:1:1:1:1 2:Y:0:A
CGGCCTGATCGGTGATGGTGTGCCAGATGAACACCGGCGGCATGGTCGCGTCGACATGCTTCTCGATCGACAGCAGCTCGCGCAGCGACGCAACGGCTTTGCCGTCACCCAGCAGGTTGTCGAAACTGCCGCTGTACGCGCACCGCCCTG
+
DDDGEGGGEIIGIKH9JJJBHKKKIKHIKKKJKKEKJKHKKF@HKGHEKKE>GEJK6*DHJCAE)DEFCIE?@EE;:*E$$F=ECFDCFE)E$CD)EEE3ECA$'E4=FE=FDBDE?E9=E?$EDDEACDC=$E=$DA?D$CDCE$=$$A
@dummy:1:DUMMY_FC:1:1:1:2 2:Y:0:A
ATTTCAAATATATATTCTGAACTTGCCAGTTCCACTAATAAAGATGCTCAGATAATAGTAATTACAGGTAAAAACAGAAAATTATATGCAAAACTTATGTCTCTCAGTGAATTTTCCAATCTAGATACCAAAAGCCATGTTTTTATTAAA
+
DADE@GEEECIIEJKKGHH:KKJJJKHECK?K$KB=FGHHHEHJJAEKEH$ABFHIEKJGGKEFHEG$I:ECDII<$A,EE)EIEAE5=EEED?DAEABECEFEEEEE,F?E$EEE:DEA$D:DEEECF$,AE$$D$$B$$E$$D$EEC@
@dummy:1:DUMMY_FC:1:1:1:3 2:Y:0:A
CTTTAATAAACACAAATGTATTTACTCTTTTAATGTTATCATGTTGTGCAAGTGCTGCTAAATCTCCAGTCATAACTCCATCATCAATTGTCTTTAATGAAGATTTTTCTAACTTATTTGCAAACTCAACCAATCCTTTATTATTATCTA
+
CCDGGGGGI3I$IJKK@K,DIKHHHKJKK4EEJKK>IHJ$H@$BJK$JK$J?J<E$$KGGKJJEGIE?@?AECBFI<FEGFE$EEEEECACB1?EE$B$?DAEEDEEEDCDBFEEE,DADEEEEEDEEEE$@$$$DEE:):EC4BBED$$
@dummy:1:DUMMY_FC:1:1:1:4 2:Y:0:A
TTCTCCATCCTTAAAAGTAATTGAAAATTTTTCTATTAGATGTCCTCTGTAATTTAGTGGTTTGGAACTTACTACAATACCATTTACAGCTGTCATTTTACGTAATGTAAATACCTCATCTGTAGGTATTTTTGCAATCAACTCTACTTA
+
@DD2EEG9IIIDIJKJBK@HKKK=JJ<KJI2KH$KJEDJ?K$EEG8JHKJJJGGK4KKKEDKCFK)?;BGBKFECBEKEEEEBC=;D9)BED?F':$AAB$DAE$E3EEE$BDAEED$EDAD$DC$EDA$EBC,?DEB$@@EECAD$B?
@dummy:1:DUMMY_FC:1:1:1:5 2:Y:0:A
TGTAAATTTCATTGGTTATATTTGTGGGAGTTAACAGAGTTTTTTGACGGCTGTAATATTCAAAACTGTAATCCCTTATTCCAACGGATACTACGCTGATTTTCACATCATCTGTTCGCAGTCTTGCGGTTGCGTTTTCTGCCAAAGACA
+
CCDEGGECAI2IIKIEKJKIKKIDCAJJFJHK$HKIKEJ4KJIJGKKEEIICH
EI?KADEE9EJ$KICEJ:?EFEDEIECE$FEDEECEBEDCCCDC?EE;E$EEE$CDEE?EE$=?EE19CDDEE$6$E$?D$G?E$E;EACD$CE??"
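Since the first few reads say little about a file with millions of records, a whole-file check may be more informative. The Python sketch below reports records whose sequence and quality lengths differ or whose header or separator line looks wrong. It assumes strict 4-line records, and the function name `check_fastq` is hypothetical. Note that a quality line may legitimately begin with `@` (the fourth read above does), so the sketch steps through records four lines at a time instead of searching for `@`.

```python
def check_fastq(lines, ascii_offset=33):
    """Return a list of (record_number, problem) for malformed records.

    ASSUMPTION: strict 4-line FASTQ (no wrapped sequence or quality lines).
    """
    lines = [line.rstrip("\r\n") for line in lines]
    problems = []
    for start in range(0, len(lines), 4):
        record = lines[start:start + 4]
        number = start // 4 + 1
        if len(record) < 4:
            problems.append((number, "truncated record"))
            break
        header, seq, plus, qual = record
        if not header.startswith("@"):
            problems.append((number, "header does not start with '@'"))
        if not plus.startswith("+"):
            problems.append((number, "separator does not start with '+'"))
        if len(seq) != len(qual):
            problems.append(
                (number, "sequence length %d != quality length %d" % (len(seq), len(qual)))
            )
        if any(ord(ch) < ascii_offset for ch in qual):
            problems.append((number, "quality character below the ASCII offset"))
    return problems
```

An empty list means every record passed these basic checks. It does not prove that splitrim will accept the file, only that the record structure is consistent. For very large files, the list built on the first line of the function would be better replaced by reading four lines at a time.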

I suspect that the developers don't really care about this project anymore. Still, I hope somebody will see this and can offer some help!

Thanks
