I sometimes get this error for species with few genomes. I know that the recommended minimum is 15 genomes, but I'm trying to get a distribution of the gamma tendency to decide at what point (number of genomes) I can trust the pangenome:
ValueError: The gene family has not been associated to a partition.
Does this happen precisely because there are too few genomes and the partitions cannot be built?
Also, which output file contains this gamma value? I haven't looked extensively, but I couldn't find it yet.
I'm using ppanggolin like this:
ppanggolin workflow --anno list_genomes.tsv -c 64 -o output --clusters clusters.tsv --infer_singletons --rarefaction
Thanks in advance!
Eric
Hi,
Yes, this definitely happens because there are too few genomes. I mean, 2 is not a lot to apply a statistical model to. The place where ppanggolin crashes is a bit unexpected to me, though: it should probably crash at the partitioning step rather than the HDF5 writing step, if it must crash for that reason.
As for the gamma tendency values, I believe they should be written in the "rarefaction_parameters.csv" file, which is written when you call --rarefaction with the workflow.
It looks like that information is indeed missing from the documentation; we'll look to add it for the next release.
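For reading the gamma values back out of that file, here is a minimal Python sketch. It assumes the file is a comma-separated table with a header row and that the relevant column name contains the word "gamma"; neither is confirmed in this thread, so check the header of your own file first. The function name `gamma_columns` is only illustrative and not part of ppanggolin.

```python
import csv
from pathlib import Path


def gamma_columns(csv_path):
    """Return {column_name: [values]} for every column whose header mentions 'gamma'.

    Assumption: the file is comma-separated with a header row.
    Values are returned as strings, exactly as they appear in the file.
    """
    with Path(csv_path).open(newline="") as handle:
        reader = csv.DictReader(handle)
        # Keep only the headers that contain "gamma" (case-insensitive).
        names = [n for n in (reader.fieldnames or []) if "gamma" in n.lower()]
        columns = {name: [] for name in names}
        for row in reader:
            for name in names:
                columns[name].append(row[name])
    return columns
```

With the command from the original post, the call would probably look like `gamma_columns("output/rarefaction_parameters.csv")`, assuming the file sits directly in the `-o` directory. An empty result means no header matched, in which case the column is likely named differently.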
Thanks, I think I'll just proceed with only the species containing 15 genomes or more.
Because it works with other species that also have only 2 genomes, I thought there was another problem causing this error.
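Restricting the analysis to species with at least 15 genomes can be done before running ppanggolin at all. The sketch below assumes one genome list file per species (like the `list_genomes.tsv` passed to `--anno`), with one genome per non-empty line and no header line. The function names and the file layout are hypothetical.

```python
from pathlib import Path

MIN_GENOMES = 15  # recommended minimum mentioned in this thread


def count_genomes(list_path):
    """Count non-empty lines in a genome list file (assumed: one genome per line)."""
    with Path(list_path).open() as handle:
        return sum(1 for line in handle if line.strip())


def species_to_keep(list_paths, min_genomes=MIN_GENOMES):
    """Return the list files that contain at least `min_genomes` genomes."""
    return [p for p in list_paths if count_genomes(p) >= min_genomes]
```

For example, `species_to_keep(sorted(Path("lists").glob("*.tsv")))` would return only the list files worth passing to `ppanggolin workflow`, where `lists` is a hypothetical directory holding one list per species.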
Ah I see, you may be right then, maybe there is something particular with those 2.
In any case, if you want to rely on the gamma tendency for your analysis, I would definitely not go with species that have 2 genomes. I'm not sure it can be computed with just 2, but even if it can, I don't think it would be very reliable anyway.