Potential related issues
All issues related to long runtime: #21, #24, #27
Description
After spending a day figuring out why the software takes hours to finish on some sequences, I found several issues: the multiprocessing module sometimes does not work, an issue which might be related to how CPUs are allocated for NumPy inference.
Pull request description
Complete rewriting of the main dvf.py script. It should produce exactly the same results as before, although the order of the results might change. With one core and the CRC test file, the old version took 102 seconds while the new one finished in 29 seconds, a roughly 3.5x improvement. I suspect the real improvement is even larger, since about half of that time (~15 seconds) is spent loading the models. The new version of the software no longer includes the multiprocessing step.
I implemented a random check, run roughly once every 1,000 predictions, that the new p-value equals the old one; it has never been triggered in any of my tests.
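For reference, a minimal sketch of how such a spot check could look. The names here (CHECK_EVERY, legacy_predict, maybe_check) are illustrative assumptions, not the actual code in dvf.py:

```python
import random

# Illustrative sketch only: spot-check that the rewritten prediction path
# agrees with the legacy one on roughly 1 out of every 1,000 sequences.
CHECK_EVERY = 1000  # assumed interval, matching the "every 1,000 predictions" above

def maybe_check(seq_id, new_pvalue, legacy_predict, tolerance=1e-9):
    """Randomly re-run the legacy predictor and compare p-values."""
    if random.randrange(CHECK_EVERY) == 0:
        old_pvalue = legacy_predict(seq_id)  # hypothetical hook into the old code path
        if abs(new_pvalue - old_pvalue) > tolerance:
            raise AssertionError(
                f"p-value mismatch for {seq_id}: new={new_pvalue}, old={old_pvalue}"
            )
```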
I also added a compare_results.py script to compare results regardless of their order.
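For context, an order-independent comparison along these lines could be sketched as below. The tab-separated layout, column positions, and absence of a header line are assumptions, not necessarily what compare_results.py actually does:

```python
#!/usr/bin/env python3
"""Sketch of an order-independent comparison of two dvf.py output tables.

Assumes tab-separated files with no header, a sequence name in the first
column and a p-value in the last column; adjust to the real output format.
"""
import csv
import sys

def load(path):
    """Map sequence name -> p-value so row order no longer matters."""
    with open(path) as handle:
        reader = csv.reader(handle, delimiter="\t")
        return {row[0]: float(row[-1]) for row in reader if row}

def main(old_path, new_path, tolerance=1e-9):
    old, new = load(old_path), load(new_path)
    if old.keys() != new.keys():
        sys.exit("Sequence sets differ between the two files.")
    mismatches = [name for name in old if abs(old[name] - new[name]) > tolerance]
    if mismatches:
        sys.exit(f"{len(mismatches)} p-value mismatches, e.g. {mismatches[:5]}")
    print("All p-values match (order-independent).")

if __name__ == "__main__":
    main(sys.argv[1], sys.argv[2])
```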