Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Infinite while loop prevents maracluster from progressing #17

Closed
TimothyOlsson opened this issue Mar 13, 2019 · 4 comments
Closed

Infinite while loop prevents maracluster from progressing #17

TimothyOlsson opened this issue Mar 13, 2019 · 4 comments
Labels

Comments

@TimothyOlsson
Copy link

Hi Matthew,

I noticed a bug where a large clusterSize in BatchSpectrumClusters.cpp prevents maracluster from progressing.

I ran a data set from cyanobacteria, 10 samples with each file about 300 mb in size. After running quandenser on the data set, I noticed that it froze when it ran maracluster. I debugged the code and found out that it froze on this specific line:

Line 209: BatchSpectrumClusters.cpp
while (clusterSize >>= 1 && clusterSizeBin < 9) ++clusterSizeBin;

When the variable "clusterSize" reaches above and equal to 512, the while loop gets stuck forever.

Copy the following code into this website and run it to verify the problem: https://repl.it/languages/cpp

#include <iostream>
#include <chrono>
#include <thread>

int main() {
  std::cout << "Started with clusterSize 511. This will work" << std::endl;  // Prints out start
  size_t clusterSizeBin = 0u;  // Exactly as in BatchSpectrumClusters.cpp
  size_t clusterSize = 511;  // Any value over 512 will make the while loop get stuck. This will be fine
  while (clusterSize >>= 1 && clusterSizeBin < 9) {
    ++clusterSizeBin;
    // The part above should be exactly the same as:
    // while (clusterSize >>= 1 && clusterSizeBin < 9) ++clusterSizeBin;
    std::this_thread::sleep_for(std::chrono::seconds(1));  // wait 1 sec
    std::cout << "clusterSizeBin = " << clusterSizeBin << std::endl;
    std::cout << "clusterSize = " << clusterSize << std::endl;
    // Shift bits right is almost as x/2
  }
  std::cout << "Done" << std::endl;
  
  std::cout << "Started with clusterSize 512. This will get stuck forever" << std::endl; 
  clusterSizeBin = 0u;  // Restart
  clusterSize = 512;  // This will get stuck forever
  while (clusterSize >>= 1 && clusterSizeBin < 9) {
    ++clusterSizeBin;
    std::this_thread::sleep_for(std::chrono::seconds(1));  // wait 1 sec to not crash
    std::cout << "clusterSizeBin = " << clusterSizeBin << std::endl;
    std::cout << "clusterSize = " << clusterSize << std::endl;
  }
  std::cout << "Done" << std::endl;  // We will never reach this
}

If I have understood what the problematic line is trying to accomplish aka shift clusterSize bits right by one until clusterSize == 1 or until clusterSizeBin is above or equal to 9.

In that case, I suggest rewriting the line to something similar to this:

while (clusterSize >>= 1) {
  ++clusterSizeBin;
  if (clusterSizeBin >= 9) {
    break;
  }
}

This will prevent the bug from happening. I tested it briefly with my data set and it passed through maracluster cleanly.

Best regards,

Timothy Bergström

@TimothyOlsson
Copy link
Author

TimothyOlsson commented Mar 13, 2019

Note: This affects this tree used for quandenser:
https://github.com/statisticalbiotechnology/maracluster/tree/f4d48a96b80b45cb511af7c854415f049057ed51

In maracluster 1.0, the file has been renamed to "SpectrumClusters.cpp". The bug should be valid there as well.

@TimothyOlsson TimothyOlsson changed the title BatchSpectrumClusters.cpp: Infinite while loop prevents maracluster from progressing Infinite while loop prevents maracluster from progressing Mar 13, 2019
@MatthewThe
Copy link
Contributor

Thanks, you're completely right. Could you submit a pull request to the master branch with your fix?

@MatthewThe
Copy link
Contributor

Actually, the problem was simply some missing parentheses, so I fixed it myself.

@TimothyOlsson
Copy link
Author

TimothyOlsson commented Mar 20, 2019

Thanks, you're completely right. Could you submit a pull request to the master branch with your fix?

Since Quandenser uses a detached head, I added a branch of maracluster which adds the fix.
https://github.com/statisticalbiotechnology/maracluster/tree/quandenser_branch

Perhaps it is possible to use the branch as a submodule for quandenser instead of the detached head to incorporate the fix. However, I'm not sure how to do it in github.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants