Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Kraken2] Update Krona to be compatible with viral data #709

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

sage-wright
Copy link
Member

@sage-wright sage-wright commented Jan 2, 2025

This PR closes #278 and closes #518

🗑️ This dev branch should be deleted after merging to main.

🧠 Summary

Krona for a while now has been incompatible with viral data due to a weird quirk where viral data was appearing in the krona plots as "Other root" which wasn't really what we wanted. This PR preprocesses the Kraken2 reports with KrakenTools to make the viral data appear appropriately in the HTML plots.

⚡ Impacted Workflows/Tasks

  • Kraken2s that use Krona (PE and SE)
  • TheiaMeta

This PR may lead to different results in pre-existing outputs: Yes

  • viral data will now appear properly

This PR uses an element that could cause duplicate runs to have different results: No

🛠️ Changes

⚙️ Algorithm

  • the krona task now uses the following docker image: us-docker.pkg.dev/general-theiagen/staphb/krona:2.8.1
  • krona task now runs kreport2krona before making the krona visualization
  • no variables have been changed.

➡️ Inputs

None

⬅️ Outputs

None

🧪 Testing

Suggested Scenarios for Reviewer to Test

  • viral and bacterial stuff to confirm nothing is broken unintentionally

🔬 Final Developer Checklist

  • The workflow/task has been tested and results, including file contents, are as anticipated
  • The CI/CD has been adjusted and tests are passing (Theiagen developers)
  • Code changes follow the style guide
  • Documentation and/or workflow diagrams have been updated if applicable
    • You have updated the latest version for any affected worklows in the respective workflow documentation page and for every entry in the three workflows_overview tables.

🎯 Reviewer Checklist

  • All changed results have been confirmed
  • You have tested the PR appropriately (see the testing guide for more information)
  • All code adheres to the style guide
  • MD5 sums have been updated
  • The PR author has addressed all comments
  • The documentation has been updated

@sage-wright
Copy link
Member Author

@cimendes i will want your eyes on this because i'm not sure how good this approach is and if it is what we want

@cimendes cimendes self-assigned this Jan 6, 2025

# Get taxonomy file
ktUpdateTaxonomy.sh taxonomy
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sage-wright any reason why this step is no longer needed? This was used to update the NCBI's taxonomy file.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that I looked a little better, krImportText doesn't require a taxonomy file (source: https://github.com/marbl/Krona/wiki/Installing) so good call!


# Run krona with taxonomy on krakren report
ktImportTaxonomy -o ~{samplename}_krona.html ~{kraken2_report} -tax taxonomy
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cimendes
Copy link
Member

cimendes commented Jan 6, 2025

Overall I think this is a valid approach to getting those Krona reports. It relies on two dependencies instead of just one but I think it's a fair trade-off to get the viral components of a community represented in the final plots.
The affected workflows are as follows:

  • Kraken2 Standalone:
    • ONT
    • PE
    • SE
  • TheiaMeta PE

Let me know once it's ready for review and I'll start testing!

@sage-wright
Copy link
Member Author

This doesn't affect Kraken ONT -- we're not running krona in that workflow

@sage-wright sage-wright marked this pull request as ready for review January 13, 2025 15:01
@sage-wright sage-wright requested a review from a team as a code owner January 13, 2025 15:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants