Generating the Sequence File from Multiple Consensus Genomes #51

Open
FatihSarigol opened this issue Nov 18, 2018 · 7 comments

@FatihSarigol

Hi!
This is not really an issue about the program, but instead of asking via email, I thought perhaps having it here might help someone else, too.
I have whole-genome consensus sequences from a few different individuals. They all share the reference's coordinates, because the variants were applied directly to the reference (which consists of over 1000 contigs), and I want to run G-PhoCS on them. Could you suggest an easy way to generate a properly formatted input sequence file from them? I don't have much time to write code for this right now, but if nobody else has similar code, I would be happy to share mine here once I write it.
Any help is much appreciated!
Thanks

@gphocs-dev
Owner

I heard that GLACtools has an option for generating G-PhoCS sequence input format. Can you please check and post here?

@FatihSarigol
Author

Thank you for your message!
GLACtools has a script to export ACF files (which contain allele counts for either a single individual or a group of individuals, i.e. a population) in G-PhoCS format.
I haven't tried it yet, but it can also convert a single-sample VCF to ACF, so that could be a route, although seemingly a long one.
Any alternative ideas for starting from FASTA files with the variants already applied?
Thanks!

@gphocs-dev
Owner

You'll probably have to write up a custom script for that. I typically end up using different custom scripts for different data sets.

@grenaud

grenaud commented Jun 17, 2019

@FatihSarigol I am the author of glactools. I just saw this. Converting VCF to ACF should be straightforward. The only problem is getting an outgroup or ancestral sequence, if need be. There is a Perl script to convert contiguous chunks to G-PhoCS format. Let me know if you run into any issues.

@FatihSarigol
Author

Hello @grenaud
Thank you for your comment! I recently wrote my own script to merge the FASTA files of different samples and convert them into the format that G-PhoCS requires.

If anyone else needs to take that road, too, I'd be happy to share my code. I'll eventually put it on my GitHub, though it could still be improved for other users.
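
For readers taking the same FASTA route, here is a minimal Python sketch of such a conversion. It is not the script mentioned above, which was not posted in this thread. It assumes every sample's FASTA has the same contigs with identical lengths (the "same coordinates" setup described in the question), and it assumes the G-PhoCS sequence file layout as I understand it from the manual: the number of loci on the first line, then for each locus a header `name num_samples length` followed by one `sample sequence` line per sample. The function names (`read_fasta`, `gphocs_loci`, `format_gphocs`), the locus naming scheme, and the fixed-window tiling with `locus_len=1000` are all hypothetical choices for illustration.

```python
def read_fasta(path):
    """Return {contig_name: sequence} for one FASTA file (upper-cased)."""
    contigs, name, chunks = {}, None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if name is not None:
                    contigs[name] = "".join(chunks)
                # Keep only the first word of the header as the contig name.
                name, chunks = line[1:].split()[0], []
            elif line:
                chunks.append(line.upper())
    if name is not None:
        contigs[name] = "".join(chunks)
    return contigs


def gphocs_loci(samples, locus_len=1000):
    """Yield (locus_name, {sample: subsequence}) in fixed-size windows.

    `samples` maps sample name -> {contig: sequence}. All samples must
    share contig names and lengths (assumption: same coordinates).
    """
    first = next(iter(samples.values()))
    for contig, ref_seq in first.items():
        for sample, contigs in samples.items():
            if len(contigs.get(contig, "")) != len(ref_seq):
                raise ValueError(
                    f"{sample}: contig {contig} is missing or differs in length"
                )
        for start in range(0, len(ref_seq), locus_len):
            end = min(start + locus_len, len(ref_seq))
            # Hypothetical naming scheme: contig_start_end, 1-based inclusive.
            name = f"{contig}_{start + 1}_{end}"
            yield name, {s: c[contig][start:end] for s, c in samples.items()}


def format_gphocs(samples, locus_len=1000):
    """Build the sequence-file text (layout assumed from the G-PhoCS manual)."""
    loci = list(gphocs_loci(samples, locus_len))
    lines = [str(len(loci)), ""]
    for locus_name, seqs in loci:
        length = len(next(iter(seqs.values())))
        lines.append(f"{locus_name} {len(seqs)} {length}")
        lines.extend(f"{sample} {seq}" for sample, seq in seqs.items())
        lines.append("")  # blank line between loci
    return "\n".join(lines)
```

To use it, build `samples` by calling `read_fasta` on each sample's file, keyed by sample name, then write the result of `format_gphocs(samples)` to a file. Tiling whole contigs is only a placeholder: in practice you would likely select and filter loci for your data set, which is probably why custom scripts tend to differ per project.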

@grenaud

grenaud commented Jun 17, 2019

Yes, glactools is not really designed for FASTA files. It mostly targets genotyping or single bases from BAM files.

@gphocs-dev
Owner

gphocs-dev commented Jun 17, 2019 via email
