mcl added garbage entries #953

Open
jarrTecn opened this issue Dec 17, 2024 · 1 comment

Comments

@jarrTecn

Hello! I'm getting an error when running OrthoFinder. The problematic file seems to be OrthoFinder_graph.txt, which contains a lot of 'inf' entries.

I've attached the output files: example_error.zip

The error message is:

WARNING: program called by OrthoFinder produced output to stderr

Command: mcl /scr/k80san/jantonio/revolutionh-tl-project/datasets/sagephy/5/Fastas/noD/OrthoFinder/Results_Dec17_1/WorkingDirectory/OrthoFinder_graph.txt -I 1.2 -o /scr/k80san/jantonio/revolutionh-tl-project/datasets/sagephy/5/Fastas/noD/OrthoFinder/Results_Dec17_1/WorkingDirectory/clusters_OrthoFinder_I1.2.txt -te 10 -V all

stdout:
b''
stderr:
b'[mcl] added <3144> garbage entries\n'
2024-12-17 11:07:42 : Ran MCL

Writing orthogroups to file
---------------------------

2024-12-17 11:07:42 : Done orthogroups

Analysing Orthogroups
=====================
2024-12-17 11:07:42 : Starting MSA/Trees
Species tree: Using 0 orthogroups with minimum of 40.0% of species having single-copy genes in any orthogroup

Inferring multiple sequence alignments for species tree
-------------------------------------------------------
All MSAs for the concatenated multiple sequence alignment were empty.
Please correct the error and re-run.
ERROR: An error occurred, ***please review the error messages*** they may contain useful information about the problem.

How can I fix this problem? I ran the same command on other datasets and everything worked fine.

Thanks for reading!

@Jonathan-Holmes-Bioinformatics

Hi jarrTecn,

I have reviewed your message and the output zip. The issue seems to stem from the set of BLAST hits between your species: the mcl output shows only 2 clusters, so only 2 orthogroups containing 2 or more gene sequences were identified.

The genome files also seem extremely small. Are you using simulated data? The MSA step might be failing because there are too few distinct mcl clusters to produce usable orthogroups.
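
To quantify how many 'inf' entries the graph file contains, a small check like the one below may help. It is only a sketch: it assumes the graph stores edge weights as whitespace-separated `index:value` tokens, and the function name `count_nonfinite_weights` is hypothetical, not part of OrthoFinder or mcl.

```python
import math
from typing import Iterable, Tuple


def count_nonfinite_weights(lines: Iterable[str]) -> Tuple[int, int]:
    """Return (total_entries, nonfinite_entries).

    Assumption: each edge weight appears as an 'index:value' token.
    Tokens without a colon, or whose value is not a number, are skipped.
    """
    total = 0
    nonfinite = 0
    for line in lines:
        for token in line.split():
            if ":" not in token:
                continue  # header lines, row indices, '$' terminators
            _, _, value = token.partition(":")
            try:
                weight = float(value)  # accepts 'inf', '-inf' and 'nan'
            except ValueError:
                continue
            total += 1
            if not math.isfinite(weight):
                nonfinite += 1
    return total, nonfinite


# Made-up sample rows for illustration, not taken from the attached files.
sample = [
    "0    1:0.5 2:inf $",
    "1    0:0.5 $",
    "2    0:inf 1:nan $",
]
print(count_nonfinite_weights(sample))  # (5, 3)
```

To run it on the real file, pass an open file handle instead of the sample list, for example `count_nonfinite_weights(open("OrthoFinder_graph.txt"))`. A high share of non-finite weights would point at the input scores rather than at mcl itself.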
