Extract representative genome from a motu #4

lkalmar · 2023-03-14T16:56:32Z

Hi,

How would you suggest extracting a representative genome for meta_ and ext_ mOTUs?

For example, downloading meta_mOTU_v3_12240 fetches 4361 genomes (all of them are downloaded even if we select only one, but I saw this is already on your to-do list), and these genomes range from ~800 KB to ~4.8 MB.

Our plan is to annotate the genomes we found in our metagenomics samples and use the list of genes for further analysis. We have a list of about 2000 mOTUs (1/3 are ref, 2/3 are meta and ext). Ideally, we would like to end up with the same number of genomes, one per mOTU, to annotate with Prokka.

Should we use the genome that is the closest to the median / mean of the genome sizes in the mOTU?

Thanks in advance for your help

AlessioMilanese · 2023-03-19T07:27:53Z

Hi,

I would filter genomes based on completeness and contamination (as estimated by CheckM). You can find this information here:
https://zenodo.org/record/7146984#.ZBa4qbTMIbk
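
As a rough illustration of the filtering step, here is a minimal Python sketch. The column names (`genome`, `completeness`, `contamination`) and the default thresholds (at least 90% complete, at most 5% contaminated) are assumptions made for this example; they are not taken from the Zenodo record, so adjust them to the actual table layout and to the quality level you need.

```python
def filter_by_quality(rows, min_completeness=90.0, max_contamination=5.0):
    """Keep genomes that pass both quality thresholds.

    rows: iterable of dicts, e.g. as produced by csv.DictReader.
    The keys "completeness" and "contamination" are hypothetical
    column names; values are percentages (strings or numbers).
    """
    kept = []
    for row in rows:
        completeness = float(row["completeness"])
        contamination = float(row["contamination"])
        if completeness >= min_completeness and contamination <= max_contamination:
            kept.append(row)
    return kept


# Toy records for illustration only (not real mOTU data).
example_rows = [
    {"genome": "g1", "completeness": "95.2", "contamination": "1.1"},
    {"genome": "g2", "completeness": "72.0", "contamination": "0.5"},
    {"genome": "g3", "completeness": "98.0", "contamination": "7.3"},
]
passing = filter_by_quality(example_rows)
```

With real data, the rows could be read with `csv.DictReader` from the downloaded table and restricted to the genomes of one mOTU before filtering.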

Then you could either choose the genome with the best parameters (highest completeness and lowest contamination), or you could choose the genome at the centroid of the cluster, i.e., the genome that has the lowest total distance to all other genomes in the cluster. You could calculate the distances with fastANI or Mash.
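
Both selection strategies can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record keys are hypothetical, ranking by completeness first and contamination second is just one possible way to combine the two criteria, and the distance table is assumed to list every unordered genome pair exactly once. Mash reports a distance directly; fastANI reports percent identity, which could be converted with something like `100 - ANI`.

```python
def pick_best_quality(rows):
    """Return the record with the highest completeness,
    breaking ties by the lowest contamination (an assumed ranking)."""
    return max(rows, key=lambda r: (float(r["completeness"]),
                                    -float(r["contamination"])))


def pick_medoid(distances):
    """Return the genome with the lowest summed distance to all others.

    distances: dict mapping (genome_a, genome_b) -> distance, with each
    unordered pair present exactly once (assumption).
    """
    totals = {}
    for (a, b), dist in distances.items():
        if a == b:
            continue  # ignore self-comparisons
        totals[a] = totals.get(a, 0.0) + dist
        totals[b] = totals.get(b, 0.0) + dist
    # sorted() makes the result deterministic when totals are tied
    return min(sorted(totals), key=totals.get)


# Toy values for illustration only (not real mOTU data).
toy_rows = [
    {"genome": "g1", "completeness": 95.0, "contamination": 2.0},
    {"genome": "g2", "completeness": 98.5, "contamination": 1.0},
    {"genome": "g3", "completeness": 98.5, "contamination": 0.4},
]
toy_distances = {
    ("g1", "g2"): 0.02,
    ("g1", "g3"): 0.03,
    ("g2", "g3"): 0.06,
}
best = pick_best_quality(toy_rows)
medoid = pick_medoid(toy_distances)
```

Note that the two strategies can disagree, as in the toy data here, and that an all-versus-all comparison grows quadratically with cluster size, so applying the quality filter first keeps the number of pairs manageable for large mOTUs.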

lkalmar · 2023-03-20T13:25:05Z

Thanks. I was thinking of a solution that doesn't require that much re-processing. One would think that something like this was already done when these clusters / mOTUs were originally formed. It would be nice to have access to that data.

AlessioMilanese added the question Further information is requested label Mar 19, 2023
