centrifuge-kreport values do not ad #276

Open
pablorr24 opened this issue May 22, 2024 · 3 comments

Comments

@pablorr24

Hi,
I've been working with Centrifuge and using the centrifuge-kreport tool to convert my results to the Kraken 2 report format. I'm running:

centrifuge-kreport -x /.../database/db_prefix centrif_report.tsv > kreport.txt

I do get a Kraken-style report, but its values differ from the Centrifuge report. In the screenshot, the values for Bacillus in the Centrifuge report (above) differ from those in the Kraken report (below). I'm not sure whether I'm interpreting something incorrectly or whether there is an issue with the code.

image

Best regards,
Pablo R

@mourisl
Collaborator

mourisl commented May 22, 2024

Indeed, the 5 reads listed under numUniqueReads should be reflected in the report. I'll check whether I can reproduce this issue on our data or on ERR3077553. Thank you for reporting it!

@pablorr24
Author

Hi, do you have any updates on this?

@mourisl
Collaborator

mourisl commented Jul 17, 2024

I think one reason could be the taxonomy tree structure. When a read has too many multi-mappings, Centrifuge tries to reduce the number of reported taxonomy IDs by promoting them to higher taxonomy levels. In the main program, promotion goes to the standard taxonomy levels, such as species, genus, and family. In centrifuge-kreport, promotion may instead stop at non-standard taxonomy levels, such as subgenus. I guess that in your case the other 4 reads were each promoted to a rank like superspecies or subgenus, so the number is a bit off.
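
To make the hypothesis above concrete, here is a minimal Python sketch of how the promotion target can change where multi-mapped reads are counted. It is a toy illustration, not Centrifuge's or centrifuge-kreport's actual code: the taxonomy, the taxon names, the `promote` and `count_reads` helpers, and the read set are all hypothetical, and it assumes that all hits of a multi-mapped read share the same ancestors.

```python
from collections import Counter

# Hypothetical toy taxonomy: taxid -> (parent taxid, rank, name).
# The root is its own parent.
TAXONOMY = {
    1: (1, "no rank", "root"),
    10: (1, "genus", "GenusA"),
    20: (10, "subgenus", "SubgenusA1"),
    30: (20, "species", "SpeciesX"),
    31: (20, "species", "SpeciesY"),
}

# Assumed set of "standard" ranks for this illustration.
STANDARD_RANKS = {"species", "genus", "family", "order",
                  "class", "phylum", "superkingdom"}


def promote(taxid, allowed_ranks=None):
    """Climb from taxid to the nearest ancestor whose rank is allowed.

    With allowed_ranks=None, any rank is accepted, so the direct
    parent is returned. Climbing stops at the root in either case.
    """
    parent = TAXONOMY[taxid][0]
    while parent != TAXONOMY[parent][0]:  # stop once we reach the root
        if allowed_ranks is None or TAXONOMY[parent][1] in allowed_ranks:
            break
        parent = TAXONOMY[parent][0]
    return parent


def count_reads(reads, allowed_ranks):
    """Count reads per taxon; multi-mapped reads are promoted first."""
    counts = Counter()
    for hits in reads:
        if len(hits) == 1:
            counts[hits[0]] += 1
        else:
            # Simplification: promote from the first hit only.
            counts[promote(hits[0], allowed_ranks)] += 1
    return counts


# One read unique to SpeciesX, four reads hitting both species.
reads = [[30]] + [[30, 31]] * 4

standard = count_reads(reads, STANDARD_RANKS)  # promote to standard ranks
any_rank = count_reads(reads, None)            # promote to any rank

for taxid, (_, rank, name) in TAXONOMY.items():
    print(f"{name:12s} {rank:10s} standard={standard[taxid]} any_rank={any_rank[taxid]}")
```

In this toy case the four multi-mapped reads are counted at the genus when promotion is restricted to standard ranks, but at the subgenus when any rank is allowed, so the per-taxon numbers in the two outputs differ even though the total read count is the same.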
