error occurring for some FASTA files #62

Open
karubiotools opened this issue Jun 20, 2022 · 2 comments

karubiotools commented Jun 20, 2022

Dear Developers,
I would like to know why the following error occurs when I launch Kleborate with the '--all' option on some FASTA files:
"
strain species ST virulence_score resistance_score Yersiniabactin YbST Colibactin CbST Aerobactin AbST Salmochelin SmST RmpADC RmST rmpA2 wzi K_locus K_locus_confidence O_locus O_locus_confidence AGly_acquired Col_acquired Fcyn_acquired Flq_acquired Gly_acquired MLS_acquired Phe_acquired Rif_acquired Sul_acquired Tet_acquired Tgc_acquired Tmt_acquired Bla_acquired Bla_inhR_acquired Bla_ESBL_acquired Bla_ESBL_inhR_acquired Bla_Carb_acquired Bla_chr SHV_mutations Omp_mutations Col_mutations Flq_mutations truncated_resistance_hits spurious_resistance_hits
Traceback (most recent call last):
  File "/soft/miniconda3/bin/kleborate", line 33, in <module>
    sys.exit(load_entry_point('Kleborate==2.2.0', 'console_scripts', 'kleborate')())
  File "/soft/miniconda3/lib/python3.9/site-packages/kleborate/main.py", line 64, in main
    results.update(get_resistance_results(data_folder, contigs, args, res_headers,
  File "/soft/miniconda3/lib/python3.9/site-packages/kleborate/main.py", line 570, in get_resistance_results
    res_hits = resblast_one_assembly(contigs, gene_info, qrdr, trunc, omp, seqs,
  File "/soft/miniconda3/lib/python3.9/site-packages/kleborate/resBLAST.py", line 32, in resblast_one_assembly
    hits_dict = blast_against_all(seqs, min_cov, min_ident, contigs, gene_info,
  File "/soft/miniconda3/lib/python3.9/site-packages/kleborate/resBLAST.py", line 125, in blast_against_all
    hit_allele, hit_class, hit_bla_class = gene_info[hit.gene_id]
KeyError: '403__TetX_Tet__tet(X6)__2434'
"
Thank you in advance for your help.
Best regards,
David

@nquynh8991

That's because the "gene_id" of the hit has no matching entry in "gene_info", which comes from your *.csv file. Take another look at the gene IDs in your database files; I think you need to fix that ID before running it again. Hope this helps.

learithe commented Jan 17, 2023

This exact error just occurred for me on one sequence out of a set of >800. Thanks to @nquynh8991's comment, I tracked it down to a typo affecting two sequences in the CARD database that comes with the latest version of Kleborate here.

CARD_v3.0.8.fasta contains the headers:

402__TetX_Tet__tet(X5)__2433
403__TetX_Tet__tet(X6)__2434

CARD_AMR_clustered.csv contains the entries:

402,tet(X5),Tgc,TetX,tet(X5),2433,ARO_3005057,-,-,no,no,NA,NA
403,tet(X6),Tgc,TetX,tet(X6),2434,ARO_3005056,-,-,no,no,NA,NA

The difference is the antibiotic class in the ID (Tet vs Tgc). I believe these should be Tgc in the FASTA file headers, consistent with the CSV file, since these variants of TetX are associated with tigecycline resistance.
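A mismatch like this can be looked for before running Kleborate. The following is a minimal sketch, not Kleborate's own code. It assumes the FASTA ID is built from the CSV columns as `<col 1>__<col 4>_<col 3>__<col 5>__<col 6>`, a layout inferred only from the two rows quoted above. The function names are hypothetical and the `ACGT` sequences are placeholders.

```python
import csv
import io

# Headers and rows copied from this thread; sequences are placeholders.
FASTA_TEXT = """>402__TetX_Tet__tet(X5)__2433
ACGT
>403__TetX_Tet__tet(X6)__2434
ACGT
"""

CSV_TEXT = """402,tet(X5),Tgc,TetX,tet(X5),2433,ARO_3005057,-,-,no,no,NA,NA
403,tet(X6),Tgc,TetX,tet(X6),2434,ARO_3005056,-,-,no,no,NA,NA
"""


def expected_id(row):
    """Build the ID a CSV row should have in the FASTA file.

    ASSUMPTION: layout inferred from the two example rows in this thread.
    """
    return f"{row[0]}__{row[3]}_{row[2]}__{row[4]}__{row[5]}"


def fasta_ids(fasta_text):
    """Return the first whitespace-delimited token of each FASTA header."""
    return [
        line[1:].split()[0]
        for line in fasta_text.splitlines()
        if line.startswith(">") and len(line) > 1
    ]


def unmatched_ids(fasta_text, csv_text):
    """List FASTA IDs that no CSV row would produce."""
    rows = csv.reader(io.StringIO(csv_text))
    known = {expected_id(r) for r in rows if len(r) >= 6}
    return [seq_id for seq_id in fasta_ids(fasta_text) if seq_id not in known]


missing = unmatched_ids(FASTA_TEXT, CSV_TEXT)
for seq_id in missing:
    print("no CSV entry for:", seq_id)
```

With the real files, the two strings would be replaced by the contents of `CARD_v3.0.8.fasta` and `CARD_AMR_clustered.csv`. Any ID reported here would be expected to raise the same `KeyError` if it were hit during a run.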

I solved this by editing the CARD FASTA file and recreating the BLAST database from it. It would be good to fix this typo in a future Kleborate or database release!
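The header edit can be scripted rather than done by hand. This is a rough sketch under stated assumptions: the two replacements are the ones discussed in this thread, `fix_headers` is a hypothetical helper, and the sample sequences and the third header are made up for illustration.

```python
# Headers reported in this thread, mapped to the spelling used in the CSV.
HEADER_FIXES = {
    "402__TetX_Tet__tet(X5)__2433": "402__TetX_Tgc__tet(X5)__2433",
    "403__TetX_Tet__tet(X6)__2434": "403__TetX_Tgc__tet(X6)__2434",
}


def fix_headers(fasta_text, fixes=HEADER_FIXES):
    """Rewrite only the listed FASTA IDs; leave every other line unchanged."""
    out = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            # Split the ID from any description that follows a space.
            seq_id, sep, rest = line[1:].partition(" ")
            line = ">" + fixes.get(seq_id, seq_id) + sep + rest
        out.append(line)
    return "\n".join(out) + "\n"


# Placeholder input: real sequences are omitted, and the last header is
# hypothetical, included only to show that other entries are untouched.
sample = (
    ">402__TetX_Tet__tet(X5)__2433\nACGT\n"
    ">403__TetX_Tet__tet(X6)__2434\nACGT\n"
    ">example_unrelated_header\nACGT\n"
)
fixed = fix_headers(sample)
print(fixed, end="")
```

After writing the corrected text back to the FASTA file, the BLAST database needs to be rebuilt. With BLAST+ that would typically be something like `makeblastdb -in CARD_v3.0.8.fasta -dbtype nucl`, assuming a nucleotide database. The exact command used above is not stated in this thread.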
