Handling of missing Sabio-RK entries #6

Closed
yaccos opened this issue Oct 9, 2020 · 2 comments

Contributor

yaccos commented Oct 9, 2020

I got KeyError: '3.6.1.40.5' at line 69 of create_combined_kcat_database.py. Tracing the error back, I found that searching Sabio-RK for EC number 3.6.1.40.5 with autopacmen returns no results at any wildcard level. The last lines of output from create_combined_kcat_database may shed more light on the problem:

Wildcard level 3...
['3.6.*.*.*']
Performing query [{'ECNumber': '3.6.*.*.*', 'Parametertype': 'kcat', 'EnzymeType': 'wildtype'}]...
SABIO-RK API error with query: ((ECNumber:3.6.*.*.* AND Parametertype:kcat AND EnzymeType:wildtype))
Wildcard level 4...
['3.*.*.*.*']
Performing query [{'ECNumber': '3.*.*.*.*', 'Parametertype': 'kcat', 'EnzymeType': 'wildtype'}]...
SABIO-RK API error with query: ((ECNumber:3.*.*.*.* AND Parametertype:kcat AND EnzymeType:wildtype))

Sabio-RK does of course have entries at these high wildcard levels, but there may simply be too many of them for the API to return any results. This means that even with the wildcard search, you should expect some EC numbers for which no entry can be obtained. Consequently, we must handle the case where no Sabio-RK entry is available for the combined database.
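For readers unfamiliar with the wildcard levels in the log above, here is a minimal sketch of the pattern they suggest: level n replaces the last n dot-separated fields with `*`. This is not autopacmen's actual code. The names `wildcard_ec_number` and `search_with_wildcards` and the `query` callable are hypothetical, and the sketch assumes a failed or empty query simply yields an empty list.

```python
from typing import Callable, List, Optional


def wildcard_ec_number(ec_number: str, level: int) -> str:
    """Replace the last `level` dot-separated fields of an EC number with '*'."""
    fields = ec_number.split(".")
    if not 0 <= level < len(fields):
        raise ValueError(f"Wildcard level {level} is out of range for {ec_number!r}")
    kept = fields[: len(fields) - level]
    return ".".join(kept + ["*"] * level)


def search_with_wildcards(
    ec_number: str, query: Callable[[str], List[dict]]
) -> Optional[List[dict]]:
    """Try increasingly broad wildcard levels; return None if every level is empty.

    `query` is a hypothetical stand-in for whatever performs the database search.
    """
    n_fields = len(ec_number.split("."))
    for level in range(n_fields):  # level 0 is the exact EC number
        results = query(wildcard_ec_number(ec_number, level))
        if results:
            return results
    # No level returned anything, so the caller has to handle a missing entry.
    return None
```

The `None` return value is the case this issue is about: the caller cannot assume that every EC number ends up with an entry.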

Paulocracy added a commit that referenced this issue Oct 13, 2020
@Paulocracy
Member

As far as I can tell, this error was caused by "3.6.1.40.5" being an invalid EC number: it contains 5 numbers, whereas the standard format is *.*.*.*, where each * stands for an integer or provisional characters. The SABIO-RK wildcard search failed because no EC number has 5 numbers, and the invalid number got that far because I did not include a strict EC number validity test beforehand, as there are sometimes strange-looking provisional or custom numbers.
To address the error, I added a check to create_combined_kcat_database.py that tests whether the current EC number has an entry in either the BRENDA or the SABIO-RK database. If it does not, a warning is printed and the EC number is ignored. There may be other ways to signal the problem, such as halting the program; this is just a "non-intrusive" way to handle these invalid numbers.
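A warn-and-skip check of this kind could look roughly like the sketch below. It is an illustration only, not the code from the referenced commit: the function name `combine_kcat_entries`, the dictionary layout, and the assumption that both databases are plain dicts keyed by EC number are all hypothetical.

```python
import warnings
from typing import Dict, Iterable


def combine_kcat_entries(
    ec_numbers: Iterable[str], brenda: Dict[str, dict], sabio_rk: Dict[str, dict]
) -> Dict[str, dict]:
    """Merge per-EC-number entries, skipping EC numbers found in neither database."""
    combined = {}
    for ec_number in ec_numbers:
        if ec_number not in brenda and ec_number not in sabio_rk:
            # Non-intrusive handling: warn and move on instead of raising KeyError.
            warnings.warn(
                f"EC number {ec_number!r} has no entry in BRENDA or SABIO-RK; ignoring it."
            )
            continue
        combined[ec_number] = {
            # .get() returns None when only one of the two databases has the entry.
            "BRENDA": brenda.get(ec_number),
            "SABIO-RK": sabio_rk.get(ec_number),
        }
    return combined
```

Using `warnings.warn` rather than `print` is a design choice of this sketch; it lets callers turn the warning into an error with the standard `warnings` filters if they prefer the program to halt.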

Contributor Author

yaccos commented Oct 13, 2020

I later realized that the 5-number EC number was an error in the model, so it is no wonder that it did not get any hits. I agree that it is a good idea to issue a warning when the EC number is not found in BRENDA or SABIO-RK. Ideally, I would prefer that any error in the user input gave a custom error message instead of an internal error such as KeyError, but it is not important to me and probably not worth the effort.
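For illustration, a custom error for malformed input could be as small as the sketch below. It is deliberately permissive, checking only that there are four non-empty dot-separated fields, so that provisional or custom numbers still pass. The names `InvalidECNumberError` and `check_ec_number` are hypothetical and not part of autopacmen.

```python
class InvalidECNumberError(ValueError):
    """Raised when an EC number from the user's model is malformed."""


def check_ec_number(ec_number: str) -> str:
    """Return the EC number unchanged if it has four non-empty fields, else raise."""
    fields = ec_number.split(".")
    if len(fields) != 4 or any(field == "" for field in fields):
        raise InvalidECNumberError(
            f"Invalid EC number {ec_number!r}: expected 4 dot-separated fields, "
            f"got {len(fields)}. Please check the model's annotations."
        )
    return ec_number
```

Calling `check_ec_number("3.6.1.40.5")` would then fail early with a message that points at the model rather than with a KeyError from deep inside the script.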

yaccos closed this as completed Oct 13, 2020