I'm having an issue running OrthoFinder on 470 genomes in protein space. It occurs at the end of the "initial processing" steps. The full error message is below, but (I think) it boils down to this line:
numpy.core._exceptions._ArrayMemoryError: Unable to allocate 1.63 MiB for an array with shape (428224,) and data type int32
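As a quick sanity check on that message, the arithmetic below reproduces the reported size from the shape and dtype. The variable names are only illustrative. An int32 element takes 4 bytes, so the array that failed to allocate really is under 2 MiB, which may suggest the process had already run out of memory (or hit some limit) before this request, rather than the array itself being large.

```python
# Reproduce the size reported in the _ArrayMemoryError message.
n_elements = 428224        # shape (428224,) from the error
bytes_per_int32 = 4        # an int32 element is 4 bytes

size_bytes = n_elements * bytes_per_int32
size_mib = size_bytes / 2**20  # 1 MiB = 2**20 bytes

print(f"{size_bytes} bytes = {size_mib:.2f} MiB")
# prints "1712896 bytes = 1.63 MiB"
```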
Full error message:
2024-12-01 17:49:48 : Initial processing of species 466 complete
2024-12-01 18:09:00 : Initial processing of species 468 complete
2024-12-01 18:16:52 : Initial processing of species 469 complete
2024-12-01 18:21:37 : Initial processing of species 470 complete
Process Process-95:
Traceback (most recent call last):
File "/mnt/lustre/software/anaconda/colsa/envs/orthofinder-2.5.5/lib/python3.12/multiprocessing/process.py", line 314, in _bootstrap
self.run()
File "/mnt/lustre/software/anaconda/colsa/envs/orthofinder-2.5.5/lib/python3.12/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "/mnt/gpfs01/software/anaconda/colsa/envs/orthofinder-2.5.5/bin/scripts_of/__main__.py", line 560, in Worker_ConnectCognates
WaterfallMethod.ConnectCognates(*args, d_pickle=d_pickle)
File "/mnt/gpfs01/software/anaconda/colsa/envs/orthofinder-2.5.5/bin/scripts_of/__main__.py", line 549, in ConnectCognates
B = matrices.LoadMatrixArray("B", seqsInfo, iSpecies, d_pickle)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/gpfs01/software/anaconda/colsa/envs/orthofinder-2.5.5/bin/scripts_of/matrices.py", line 54, in LoadMatrixArray
matrixArray.append(LoadMatrix(name, iSpecies, jSpecies, d_pickle))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/gpfs01/software/anaconda/colsa/envs/orthofinder-2.5.5/bin/scripts_of/matrices.py", line 47, in LoadMatrix
M = pic.load(picFile)
^^^^^^^^^^^^^^^^^
...
MemoryError
Process Process-111:
Traceback (most recent call last):
File "/mnt/lustre/software/anaconda/colsa/envs/orthofinder-2.5.5/lib/python3.12/multiprocessing/process.py", line 314, in _bootstrap
File "/mnt/lustre/software/anaconda/colsa/envs/orthofinder-2.5.5/lib/python3.12/multiprocessing/process.py", line 108, in run
File "/mnt/gpfs01/software/anaconda/colsa/envs/orthofinder-2.5.5/bin/scripts_of/__main__.py", line 560, in Worker_ConnectCognates
File "/mnt/gpfs01/software/anaconda/colsa/envs/orthofinder-2.5.5/bin/scripts_of/__main__.py", line 550, in ConnectCognates
File "/mnt/gpfs01/software/anaconda/colsa/envs/orthofinder-2.5.5/bin/scripts_of/__main__.py", line 620, in ConnectAllBetterThanAnOrtholog_s
File "/mnt/gpfs01/software/anaconda/colsa/envs/orthofinder-2.5.5/bin/scripts_of/__main__.py", line 589, in GetMostDistant_s
File "/mnt/lustre/software/anaconda/colsa/envs/orthofinder-2.5.5/lib/python3.12/site-packages/scipy/sparse/_lil.py", line 412, in tocsr
numpy.core._exceptions._ArrayMemoryError: Unable to allocate 1.63 MiB for an array with shape (428224,) and data type int32
Process Process-108:
Traceback (most recent call last):
File "/mnt/lustre/software/anaconda/colsa/envs/orthofinder-2.5.5/lib/python3.12/multiprocessing/process.py", line 314, in _bootstrap
self.run()
File "/mnt/lustre/software/anaconda/colsa/envs/orthofinder-2.5.5/lib/python3.12/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "/mnt/gpfs01/software/anaconda/colsa/envs/orthofinder-2.5.5/bin/scripts_of/__main__.py", line 560, in Worker_ConnectCognates
WaterfallMethod.ConnectCognates(*args, d_pickle=d_pickle)
File "/mnt/gpfs01/software/anaconda/colsa/envs/orthofinder-2.5.5/bin/scripts_of/__main__.py", line 549, in ConnectCognates
B = matrices.LoadMatrixArray("B", seqsInfo, iSpecies, d_pickle)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
I do have 700 GB of RAM available and am using 64-bit Python. Slurm gives no indication that this is a RAM or disk issue.
Thoughts about this? Any help appreciated.
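One thing that might be worth ruling out is a per-process limit that is lower than the node's physical RAM. The sketch below is only a diagnostic idea, not something taken from OrthoFinder: it uses Python's standard `resource` module (Unix only) to print the soft and hard limits seen by the Python process. The helper names `format_limit` and `report_limits` are made up for this example, and scheduler-enforced cgroup limits would not necessarily show up here.

```python
import resource

def format_limit(value):
    """Render a limit in GiB, or 'unlimited' for RLIM_INFINITY."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{value / 2**30:.1f} GiB"

def report_limits():
    """Return {name: (soft, hard)} for the memory-related limits available here."""
    limits = {}
    for name in ("RLIMIT_AS", "RLIMIT_DATA"):
        key = getattr(resource, name, None)  # not every platform defines all of them
        if key is not None:
            soft, hard = resource.getrlimit(key)
            limits[name] = (format_limit(soft), format_limit(hard))
    return limits

if __name__ == "__main__":
    for name, (soft, hard) in report_limits().items():
        print(f"{name}: soft={soft}, hard={hard}")
```

Running this inside the same Slurm job (same environment, same node) as the failing run would show whether the address-space or data-segment limit is capped for that job.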
Running 470 species on orthofinder-2.5.5 is quite a challenge (16+ days of runtime). You will also be making a very large matrix file, which might max out your RAM. Are you running this with MAFFT or DendroBLAST?
I would recommend switching to the new --core --assign workflow. To do this, sample a subset of your proteomes to build a core, and then assign the remaining proteomes using --assign. You can find details on the main GitHub page.
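The sampling step could be sketched roughly as follows. This is only an illustration: the function name `split_core_and_rest`, the file names, and the core size are hypothetical, and uniform random sampling is an assumption (a subset chosen to cover the taxonomic breadth of the dataset may be a better core). It only decides which proteomes go where and does not call OrthoFinder itself.

```python
import random

def split_core_and_rest(proteome_files, core_size, seed=0):
    """Split proteome file names into a core subset and the remainder.

    The core is drawn uniformly at random (an assumption for this sketch);
    a fixed seed makes the split reproducible.
    """
    files = sorted(proteome_files)  # sort so the seed gives a stable result
    if not 0 < core_size <= len(files):
        raise ValueError("core_size must be between 1 and the number of proteomes")
    rng = random.Random(seed)
    core = sorted(rng.sample(files, core_size))
    core_set = set(core)
    rest = [f for f in files if f not in core_set]
    return core, rest

if __name__ == "__main__":
    # Hypothetical file names standing in for the 470 proteomes.
    proteomes = [f"species_{i:03d}.fa" for i in range(1, 471)]
    core, rest = split_core_and_rest(proteomes, core_size=50)
    print(len(core), "core proteomes;", len(rest), "to add with --assign")
```

The two lists can then be used to copy or symlink the files into separate directories, one for the core run and one for the proteomes to be assigned afterwards.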