3D multi-body hangs on Expectation iteration 1 #543
Comments
Your dataset is large. Did RELION 3.0 run with reasonable speed?
In RELION 3.0 they were processed as three individual datasets. I took advantage of RELION 3.1 to merge them. The previous 3D refinement in RELION 3.1 took almost a week to complete (using a ramdisk as scratch, so about half the dataset was in RAM and half on a spinning disk). In the last successful 3D refinement, the first expectation took 4.63 hours. I started the 3D multibody last night around 9, and at 7:30 this morning the time had not advanced past 000/???. I stopped that job, pulled the latest git commits, and restarted with "copy to scratch" turned off (I have had issues with this in the past). It has been running for about 1.5 hours now with no updates, but I will leave it running, maybe until tomorrow, to see if it progresses.
I don't think this is a RELION 3.1 problem. Either your dataset is simply too large or your hardware is not powerful enough. If you believe this is a RELION 3.1 problem, please try this:
Thanks, I will try out this suggestion. Note that on a different dataset (448/448 px; 406695 particles) but the same hardware, I was able to complete a 3D refinement with RELION 3.0.7 (first expectation was 4.22 hrs, with time updates at 0.08 hrs) and also a multibody refinement (first expectation was 7.55 hrs, with time updates after 0.13 hrs). I guess the optics groups increase the requirements, though. I will try the suggestions and get back to you.
Hi. I quickly tried starting the multibody with commit 0841d0 (the version that was able to complete the 3D auto-refine for me, and from before the MTF fix), and it is at least updating the expected completion time (i.e. it updated 000/??? to 0.12/7.06 hrs). I will let you know if it completes. Are there any issues with using this version for the multi-body refinement?
If you are not merging datasets, it is completely fine. Otherwise, Class2D/3D might give worse results without this MTF fix. Refine3D/MultiBody should be less affected, but we cannot guarantee that.
Ok, so the safest option might be to remove anything related to the MTF correction from the optics group table and repeat the Refine3D/MultiBody without the MTF correction?
If your three datasets came from the same detector and have the same pixel size, you can remove the MTF files from the optics group table.
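As a rough illustration of that edit, here is a minimal Python sketch that drops one column from a STAR `loop_` block. It assumes the MTF entry is stored under the `_rlnMtfFileName` label, that fields are whitespace-separated with no quoted spaces, and that the block contains a single loop. The function name `drop_star_column`, the MTF file name, and the pixel sizes are made up for the example. Work on a copy of the STAR file, not the original.

```python
def drop_star_column(block_lines, label):
    """Remove one column (by label) from the lines of a single STAR loop_ block."""
    labels = [ln.split()[0] for ln in block_lines if ln.strip().startswith("_")]
    if label not in labels:
        return list(block_lines)  # label absent: return an unchanged copy
    idx = labels.index(label)  # zero-based position of the column to drop
    out = []
    col = 0
    for ln in block_lines:
        s = ln.strip()
        if s.startswith("_"):
            name = s.split()[0]
            if name == label:
                continue  # skip the label line of the dropped column
            col += 1
            out.append(f"{name} #{col}")  # renumber the remaining labels
        elif s and s != "loop_" and not s.startswith(("data_", "#")):
            fields = s.split()
            del fields[idx]  # drop the matching field from each data row
            out.append(" ".join(fields))
        else:
            out.append(ln)  # keep block header, loop_ marker, comments, blanks
    return out


# Hypothetical optics table; file name and values are illustrative only.
optics = [
    "data_optics",
    "",
    "loop_",
    "_rlnOpticsGroupName #1",
    "_rlnOpticsGroup #2",
    "_rlnMtfFileName #3",
    "_rlnImagePixelSize #4",
    "opticsGroup1 1 mtf_example.star 1.06",
    "opticsGroup2 2 mtf_example.star 1.06",
]
cleaned = drop_star_column(optics, "_rlnMtfFileName")
print("\n".join(cleaned))
```

The remaining labels are renumbered so that the `#k` indices still match the column positions.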
Good morning. I have tried your recommended test (i.e. using a subset of particles), but so far I have only done it with different commits of version 3.1:
Refine 201
Refine 198
The two commands are the same but I just use different builds (3.1-beta-commit-f511ad vs. version 3.1-beta-commit-0841d0). Basically, the 3.1-beta-commit-0841d0 will complete the Refine3D without error such that Expectation iteration 1 takes 59 secs to complete, but with version 3.1-beta-commit-f511ad the first Expectation iteration remains like this:
even if I leave it running for an hour. In general, I am also having problems completing a Multibody refinement even when I use
Thanks so much for any advice you can provide.
In summary
Commands and output below:
Refine 205:
Just in case it is helpful, this is the output of
Could you please privately share a small subset of your dataset (~100 particles from each optics group) with us? We need to investigate this issue further locally. Please write to Sjors and Takanori (you can find their email addresses in CCPEM).
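One way to draw such a subset is sketched below in Python. It is only an illustration, not a RELION tool: it assumes the particle rows have already been read from the `data_particles` block as whitespace-separated lines, and that `group_col` is the zero-based index of the `_rlnOpticsGroup` column (the `#k` number in the label minus one). The function name `subsample_per_group` and the demo rows are made up.

```python
import random
from collections import defaultdict


def subsample_per_group(rows, group_col, n_per_group=100, seed=0):
    """Pick up to n_per_group random rows from each optics group."""
    by_group = defaultdict(list)
    for row in rows:
        by_group[row.split()[group_col]].append(row)
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    picked = []
    for group in sorted(by_group):
        members = by_group[group]
        # Groups smaller than n_per_group are kept whole.
        picked.extend(rng.sample(members, min(n_per_group, len(members))))
    return picked


# Hypothetical particle rows: "<image name> <optics group>".
rows = [
    f"particle{i:04d}.mrcs {g}"
    for g, n in (("1", 250), ("2", 50), ("3", 300))
    for i in range(n)
]
subset = subsample_per_group(rows, group_col=1)
print(len(subset))
```

The selected rows would then be written back under the original `data_particles` header, with the `data_optics` block left unchanged.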
We fixed this issue in commit 3bfef2b.
Refine3D was fixed by the above commit, but the MultiBody issue remains. @scheres is working on it now.
Fixed now.
When starting a 3D-Multibody refinement in RELION 3.1 (f2c3d8), it runs to the first expectation round and appears to hang there. The time to complete never updates past 000/???, and in htop it appears that only one thread is running at 100%, but nvidia-top shows both GPUs at 95%. There is no error output in run.err. I have tried starting it multiple times, sometimes leaving it in this state overnight. I have also tried with and without copying particles to scratch. The prior 3D refinement was run with the MTF files provided, and the dataset was previously subjected to CtfRefinement with anisotropic magnification selected.
Environment:
Dataset:
Job options:
Error message:
run.err:
run.out:
I will leave it running like this to see if it starts to respond.