gfa file to visualize the final consensus fasta by Bandage #15

Open
avera1988 opened this issue Aug 16, 2021 · 2 comments
Labels
enhancement New feature or request

Comments

@avera1988

avera1988 commented Aug 16, 2021

Hi Ryan,
I am using Trycycler to assemble 90 bacterial genomes from fish guts. It is an amazing and very helpful tool. After following the pipeline and polishing the assemblies with Medaka, I obtained circular genomes with > 98% completeness and very good coverage in every case. However, I'm wondering whether there is an easy and proper way to obtain a GFA-like file to visualize the genomes in Bandage. I would like to plot the circular chromosomes and plasmids, as well as the linear ones, in a Bandage-like figure. I've looked for a script in the Trycycler pipeline that does this, but so far I haven't found one.

Hope you can help me.

Thank you in advance.

Arturo.

@rrwick
Owner

rrwick commented Aug 19, 2021

Hi Arturo,

There is not currently a Trycycler command to make a GFA from an assembly, but it wouldn't be difficult to do. Each contig sequence would form a segment, and each circular contig would get a zero-overlap link connecting the end to the start.
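The segment-plus-self-link idea described above can be sketched in a few lines of Python. This is only an illustration, not a Trycycler feature: the function names `parse_fasta` and `fasta_to_gfa` are made up here, and the sketch assumes that circularity is marked in each FASTA header with a `circular=true` field, which is not something Trycycler is confirmed to write.

```python
def parse_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def fasta_to_gfa(lines):
    """Return GFA1 text: one S line per contig, plus a 0M self-link if circular."""
    segments = ["H\tVN:Z:1.0"]
    links = []
    for header, seq in parse_fasta(lines):
        fields = header.split()
        name = fields[0]
        segments.append(f"S\t{name}\t{seq}\tLN:i:{len(seq)}")
        # Assumption: a circular contig carries "circular=true" in its header.
        if "circular=true" in fields[1:]:
            # Zero-overlap link from the end of the contig back to its start.
            links.append(f"L\t{name}\t+\t{name}\t+\t0M")
    return "\n".join(segments + links) + "\n"
```

Something like `with open("assembly.fasta") as f: print(fasta_to_gfa(f), end="")` would write the graph to standard output. Contigs without the `circular=true` field get no link, so Bandage should draw them as linear.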

I'll tag this as an enhancement and keep it in mind for the next release of Trycycler. Thanks!

And some notes to my future self about implementation:

  • I'll need to somehow keep track of whether a contig was reconciled with --linear or not, as that will change the links. Maybe save circular=true or circular=false in the headers of the 2_all_seqs.fasta file. This would also remove the need for the --linear option in the consensus command, simplifying things.
  • Would also be nice to calculate per-replicon depth to include in the GFA. Can probably do this using the partitioned reads, dividing total read bases by replicon size.
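The depth estimate in the second note (total read bases divided by replicon size) might look roughly like this. It is a sketch under stated assumptions: the helper names are hypothetical, the reads are assumed to be in plain four-line FASTQ records with unwrapped sequences, and nothing here reflects how Trycycler itself would implement it.

```python
def fastq_read_lengths(lines):
    """Yield the sequence length of each record in four-line FASTQ input."""
    for i, line in enumerate(lines):
        # Assumption: every record is exactly four lines; line 2 is the sequence.
        if i % 4 == 1:
            yield len(line.strip())


def replicon_depth(read_lengths, replicon_length):
    """Estimate mean depth as total read bases over replicon length."""
    if replicon_length <= 0:
        raise ValueError("replicon_length must be positive")
    return sum(read_lengths) / replicon_length
```

For a given replicon, the partitioned reads for its cluster would be passed to `fastq_read_lengths` and the result, together with the consensus length, to `replicon_depth`. The value could then be attached to the matching segment line as an optional tag, though which tag Bandage reads for depth should be checked against its documentation.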

Ryan

@rrwick rrwick added the enhancement New feature or request label Aug 19, 2021
@avera1988
Author

Hi Ryan,

Thank you for the reply. I will work on producing the GFA file following your comments. It would be super nice to have these enhancements in the next release.

Cheers!

Arturo.
