Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix metabat #53

Closed
wants to merge 2 commits into from
Closed

Fix metabat #53

wants to merge 2 commits into from

Conversation

d4straub
Copy link
Collaborator

This is to fix #32 by setting metabat process to only 1 core.

PR checklist

  • This comment contains a description of changes (with reason)
  • If you've fixed a bug or added code that should be tested, add tests!
  • If necessary, also make a PR on the nf-core/mag branch on the nf-core/test-datasets repo
  • Ensure the test suite passes (nextflow run . -profile test,docker).
  • Make sure your code lints (nf-core lint .).
  • Documentation in docs is updated
  • CHANGELOG.md is updated
  • README.md is updated

p.s. sorry for creating this branch here, my mistake.

@d4straub d4straub requested a review from skrakau June 19, 2020 13:36
@skrakau
Copy link
Member

skrakau commented Jun 22, 2020

Do you know what exactly caused the problem? Maybe it is worth testing it with metabat 2.15 in case you can reproduce the error (or at least open an issue, seems there were some fixes since 2.13)?

In general regarding metabat, wouldn't it make sense to run it using the parameter --seed 1 to avoid the usage of random seeds and ensure reproducibility?

@d4straub
Copy link
Collaborator Author

Fair enough, I didn't explain very well. The problem doesn't appear to be reproducibility but rather that the output of jgi_summarize_bam_contig_depths in the process metabat can be shuffled when using multiple cores. This is a hypothesis and the problem only occurs sometimes, but it never occurred when I set cpu = 1, as detailed in #32 . I do not think that --seed 1 would help in this case, but I would be glad if I were wrong.

As to avoid running metabat on only one core, it is possible to split process metabat into two processes so that only jgi_summarize_bam_contig_depths runs with one core, that should solve the problem as well when my theory is right. Or we could use a conditional setup i.e. only when error an error exit status of 1 occurs then a retry with 1 core would be triggered. I would not know how to implement the latter and an error exit status of 1 is not specific at all to the problem.

What do you suggest?

@skrakau
Copy link
Member

skrakau commented Jun 25, 2020

Likely this doesn't solve the problem. There is a channel problem, which is not always reproducible. I am working on it.

@skrakau skrakau closed this Jun 28, 2020
@skrakau skrakau deleted the metabat-1-core branch June 28, 2020 12:21
@skrakau skrakau mentioned this pull request Jun 28, 2020
8 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants