Empty NUC_FASTA_OUT #115

Closed
GomathiNayagam opened this issue Mar 21, 2023 · 6 comments
Labels
bug Something isn't working

Comments

@GomathiNayagam

Hi,
I am trying to find AMR genes in NUC_FASTA. Although AMRFinderPlus identifies AMR genes in the input, it writes an empty NUC_FASTA_OUT. I am probably making a silly error here. Could you help me out?

amrfinder --nucleotide $NUC_FASTA --ident_min 0.9 --coverage_min 0.89 --nucleotide_output out.fasta --threads 6 --output amrout.tsv

@evolarjun
Contributor

evolarjun commented Mar 21, 2023

Hi @GomathiNayagam ,

I don't see a silly error, and I'm having trouble reproducing your issue. With the test data distributed with AMRFinderPlus I get FASTA output:

$ NUC_FASTA=test_dna.fa
$ amrfinder --nucleotide $NUC_FASTA  --ident_min 0.9 --coverage_min 0.89 --nucleotide_output out.fasta --threads 6 --output amrout.tsv
Running: amrfinder --nucleotide test_dna.fa --ident_min 0.9 --coverage_min 0.89 --nucleotide_output out.fasta --threads 6 --output amrout.tsv
Software directory: '/panfs/pan1.be-md.ncbi.nlm.nih.gov/bacterial_pathogens/backup/packages/AMRFinderPlus_v3.11.4/'
Software version: 3.11.4
Database directory: '/panfs/pan1.be-md.ncbi.nlm.nih.gov/bacterial_pathogens/backup/packages/AMRFinderPlusData/2023-02-23.1'
Database version: 2023-02-23.1
AMRFinder translated nucleotide search
  - include -O ORGANISM, --organism ORGANISM option to add mutation searches and suppress common proteins
Running blastx ...
Making report ...
AMRFinder took 9 seconds to complete

The results look as I would expect:

$ wc -l amrout.tsv
6 amrout.tsv
$ fgrep -c '>' out.fasta
5
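
The same sanity check can be scripted. The sketch below is a minimal Python version, assuming the report is a tab-separated file with a single header line, that each FASTA record starts with `>`, and that one FASTA record is written per report row (as in the run above: 6 report lines, 5 sequences). The function names are illustrative and not part of AMRFinderPlus.

```python
from pathlib import Path


def count_report_rows(tsv_path):
    """Count data rows in a TSV report, assuming the first line is a header."""
    lines = [ln for ln in Path(tsv_path).read_text().splitlines() if ln.strip()]
    return max(len(lines) - 1, 0)


def count_fasta_records(fasta_path):
    """Count FASTA records by counting lines that start with '>'."""
    path = Path(fasta_path)
    if not path.exists():
        return 0
    return sum(1 for ln in path.read_text().splitlines() if ln.startswith(">"))


def outputs_consistent(tsv_path, fasta_path):
    """Return True when the report and the FASTA output hold the same number of hits.

    Assumption: one FASTA record is written per report row.
    """
    return count_report_rows(tsv_path) == count_fasta_records(fasta_path)
```

Calling `outputs_consistent("amrout.tsv", "out.fasta")` after a run would flag the situation reported here, where the report has hits but the FASTA output is empty.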

Could you paste in what AMRFinderPlus prints to the screen and attach your $NUC_FASTA file? That might help me reproduce your issue.

Thanks,
Arjun

@GomathiNayagam
Author

Hi,
I also tried the test data, and it produces the nucleotide output, but it doesn't work for my file. So I think the problem could be in my file. Please find my FASTA file attached.

This is what AMRFinderPlus prints:

Software directory: '/home/gomathinayagam/miniconda3/envs/amrfinder/bin/'
Software version: 3.11.4
Database directory: '/home/gomathinayagam/miniconda3/envs/amrfinder/share/amrfinderplus/data/2023-02-23.1'
Database version: 2023-02-23.1
AMRFinder translated nucleotide search
  - include -O ORGANISM, --organism ORGANISM option to add mutation searches and suppress common proteins
Running tblastn ...
Making report ...
AMRFinder took 16 seconds to complete

test.gz

@vbrover
Contributor

vbrover commented Mar 21, 2023

Thank you for reporting this!
It is a bug in amrfinder: a leading underscore is trimmed from the contig name in the report.
This will be fixed in version 3.11.5.

@GomathiNayagam
Author

Yes, it is the 'unusual' FASTA header. It works after editing the headers.
Thank you very much!
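
For anyone on a release without the fix, the header edit can be scripted. This is a minimal sketch, assuming the only problematic identifiers are those that begin with `_`; the `ctg` prefix and the function name are hypothetical choices, and any prefix that keeps the identifiers unique should serve equally well.

```python
def fix_leading_underscores(fasta_text, prefix="ctg"):
    """Prepend `prefix` to FASTA identifiers that start with '_'.

    `prefix` is an arbitrary, hypothetical choice. Sequence lines and
    headers without a leading underscore are returned unchanged.
    """
    fixed = []
    for line in fasta_text.splitlines():
        if line.startswith(">_"):
            # Keep the rest of the header (including the underscore) intact.
            line = ">" + prefix + line[1:]
        fixed.append(line)
    return "".join(line + "\n" for line in fixed)
```

The function works on text, so it can be applied to the contents of the input file and the result written to a new file that is then passed to `--nucleotide`.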

@evolarjun
Contributor

@GomathiNayagam I'm glad to hear it's working for you.

I'm going to reopen just for our tracking because we consider this a bug and plan to release a fix with the next AMRFinderPlus software release.

Thanks for reporting!

evolarjun reopened this Mar 22, 2023
evolarjun added a commit that referenced this issue Apr 10, 2023
Release 3.11.8
- Performance improvements by optimizing blast parameters
    - Faster by 70% on single-threaded nucleotide-only run
    - Faster by 64% on single-threaded protein-only run
    - Faster by 58% on single-threaded combined run
- Fixed handling for FASTA identifiers with leading underscore "_" (Empty NUC_FASTA_OUT #115)
- Improved handling of special characters in GFF files
- Added --annotation_format standard
evolarjun added the bug label May 9, 2023
evolarjun added a commit that referenced this issue May 10, 2023
AMRFinderPlus release 3.11.14

This release addresses a few issues brought up on GitHub. We weren't able to solve all of them when we couldn't reproduce them, but we are trying.

Changes:
- On failure no `-o` output file is created - #115
- AMRFinderPlus will now automatically decompress files ending in .gz with gunzip (relies on gunzip being in PATH) - #61
- AMRFinderPlus does not support unicode, but it no longer checks GFF files to prohibit unicode characters specifically - #119
- Add reporting of curl error messages - #120
@evolarjun
Contributor

So the bug itself was fixed in release 3.11.8, but additionally we changed the behavior so if there is an error no empty -o file will be created (release 3.11.14).
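
For scripted runs, this means a missing `-o` file can now be treated as a sign of failure. A minimal sketch of such a check follows, assuming the tool also exits with a non-zero status on error (the usual command-line convention, not something stated in this thread); `run_and_check` is a hypothetical helper, not part of AMRFinderPlus.

```python
import subprocess
from pathlib import Path


def run_and_check(cmd, output_path):
    """Run `cmd` and report whether it succeeded and left an output file.

    Assumption: the tool signals failure with a non-zero exit status.
    Returns a (succeeded, output_exists) tuple.
    """
    result = subprocess.run(cmd, capture_output=True, text=True)
    return result.returncode == 0, Path(output_path).exists()
```

It could be called as `run_and_check(["amrfinder", "--nucleotide", "test_dna.fa", "--output", "amrout.tsv"], "amrout.tsv")`, with downstream steps running only when both values are true.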

evolarjun pushed a commit to bioconda/bioconda-recipes that referenced this issue May 10, 2023
This release addresses a few issues brought up on GitHub.

Changes:
- On failure no `-o` output file is created - ncbi/amr#115
- AMRFinderPlus will now automatically decompress files ending in .gz with gunzip (this relies on gunzip being in PATH) - ncbi/amr#61
- AMRFinderPlus does not support unicode, but it will not check GFF files to prohibit extended ASCII or UTF-8 characters specifically (still prohibits GFF files with ASCII control characters 0x00 and 0x1F) - ncbi/amr#119
- Add reporting of curl error messages - ncbi/amr#120

3 participants