Create BLAST tabular adapter #4627

Merged
merged 7 commits into from
Nov 8, 2024

garrettjstevens
Collaborator

This adds a synteny adapter that works with files generated by BLAST using the -outfmt 6 option (called "tabular" output in the docs).

You can see a demo of this branch here: https://s3.amazonaws.com/jbrowse.org/code/jb2/create_blast_tabular_adapter/index.html?config=test_data%2Fgrape_peach_synteny%2Fconfig.json&session=share-4SDMC98yDL&password=JI48K

The demo file contains only the matches between the first 1,000,000 bases of Pp05 for peach and chr18 for grape. The file was generated like this:

# The files we have are Ppersica_Pp05_subset.fa and Vvinifera_chr18_subset.fa,
# both of which are the first 1,000,000 bases of the indicated chromosomes.
makeblastdb -in Vvinifera_chr18_subset.fa -dbtype nucl
tblastx -query Ppersica_Pp05_subset.fa -db Vvinifera_chr18_subset.fa -outfmt 6 -evalue 0.1 > peach_vs_grape.tsv

This took ~1m15s to generate on my computer.
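
For readers unfamiliar with the format, here is a minimal Python sketch of how one row of default -outfmt 6 output maps onto named fields. The twelve column names are BLAST's defaults for tabular output; the function name `parse_blast_row` and the sample row are made up for illustration and are not taken from the adapter's code.

```python
# BLAST's default column order for -outfmt 6 (tabular) output.
DEFAULT_COLUMNS = (
    "qseqid sseqid pident length mismatch gapopen "
    "qstart qend sstart send evalue bitscore"
).split()

INT_FIELDS = {"length", "mismatch", "gapopen", "qstart", "qend", "sstart", "send"}
FLOAT_FIELDS = {"pident", "evalue", "bitscore"}


def parse_blast_row(line, columns=DEFAULT_COLUMNS):
    """Turn one tab-separated row into a dict keyed by column name."""
    values = line.rstrip("\n").split("\t")
    if len(values) != len(columns):
        raise ValueError(f"expected {len(columns)} fields, got {len(values)}")
    row = {}
    for name, value in zip(columns, values):
        if name in INT_FIELDS:
            row[name] = int(value)
        elif name in FLOAT_FIELDS:
            row[name] = float(value)
        else:
            row[name] = value
    return row


# Illustrative row only; not real output from the demo file.
sample = "Pp05\tchr18\t85.0\t40\t6\t0\t1001\t1120\t5001\t5120\t1e-10\t55.1"
hit = parse_blast_row(sample)
print(hit["qseqid"], hit["qstart"], hit["qend"], hit["sseqid"], hit["sstart"], hit["send"])
```

The query and subject IDs plus their start and end coordinates are the fields a synteny view needs to draw a match between two assemblies; the remaining columns are per-hit statistics.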

@garrettjstevens garrettjstevens self-assigned this Oct 29, 2024
@cmdcolin cmdcolin force-pushed the create_blast_tabular_adapter branch from 5a6c786 to ad9fd29 Compare October 31, 2024 16:08
@cmdcolin
Collaborator

I rebased this off main and gzipped the sample data file in case you get merge conflicts when pulling!

@garrettjstevens
Collaborator Author

Here's a new share link with the gene tracks: https://s3.amazonaws.com/jbrowse.org/code/jb2/create_blast_tabular_adapter/index.html?config=test_data%2Fgrape_peach_synteny%2Fconfig.json&session=share-csnO59YS7_&password=wSNlc

image

I also added a config option so users can specify their columns in case they used the custom column option in BLAST's outfmt. If you ran the command with -outfmt "6 qseqid sseqid qstart qend sstart send", then you would specify the columns in the config as 'qseqid sseqid qstart qend sstart send'. Those six columns are required; all others are optional.
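
As a rough illustration of the rule above, the following Python sketch splits a user-supplied column string, checks that the six required names are present, and then uses that order to label each row. The helper names `parse_columns` and `read_row` and the sample row are hypothetical; this is not the adapter's actual implementation, only one way the described behavior could work.

```python
# The six columns the comment above describes as required.
REQUIRED = ("qseqid", "sseqid", "qstart", "qend", "sstart", "send")


def parse_columns(spec):
    """Split a column string and check that the required names are present."""
    columns = spec.split()
    missing = [name for name in REQUIRED if name not in columns]
    if missing:
        raise ValueError("missing required columns: " + ", ".join(missing))
    return columns


def read_row(line, columns):
    """Map one tab-separated row onto the user-supplied column names."""
    values = line.rstrip("\n").split("\t")
    if len(values) != len(columns):
        raise ValueError(f"expected {len(columns)} fields, got {len(values)}")
    return dict(zip(columns, values))


# Column string matching -outfmt "6 qseqid sseqid qstart qend sstart send".
columns = parse_columns("qseqid sseqid qstart qend sstart send")

# Illustrative row only; not real output from the demo file.
row = read_row("Pp05\tchr18\t1001\t1120\t5001\t5120", columns)
print(row)
```

Validating the column string up front means a file produced with a custom outfmt fails early with a clear message instead of yielding rows with missing coordinates.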

@cmdcolin
Collaborator

cmdcolin commented Nov 8, 2024

if you get a chance, i would be curious how outfmt 17 (SAM, then convert to BAM) works in jbrowse too...

@cmdcolin cmdcolin force-pushed the create_blast_tabular_adapter branch from 3843c2d to c33489a Compare November 8, 2024 14:49
@cmdcolin cmdcolin force-pushed the create_blast_tabular_adapter branch from 912f5e4 to 050a993 Compare November 8, 2024 15:11
@cmdcolin cmdcolin merged commit c5554a8 into main Nov 8, 2024
4 checks passed
@cmdcolin cmdcolin deleted the create_blast_tabular_adapter branch November 8, 2024 15:27
@cmdcolin cmdcolin added the enhancement New feature or request label Nov 8, 2024
@garrettjstevens
Collaborator Author

outfmt 17 only works with blastn, so I've attached both the outfmt 6 file and outfmt 17 file using blastn. For some reason the raw SAM output has Query_1 instead of PP05, so I replaced it and converted to BAM, and those files are attached, too.
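
The comment above does not say how the replacement was done, so the following is only one possible approach, sketched in Python. It assumes the placeholder shows up either as an `SN:` value in an `@SQ` header line or as a whole name field (QNAME, RNAME, or RNEXT) in an alignment line. The function name `rename_placeholder` and the sample SAM text are hypothetical, and the new name is passed in as a parameter rather than assumed.

```python
# Zero-based positions of the SAM name fields: QNAME, RNAME, RNEXT.
NAME_FIELDS = (0, 2, 6)


def rename_placeholder(sam_text, old, new):
    """Replace a placeholder sequence name in SAM text with the real name."""
    out = []
    for line in sam_text.splitlines():
        fields = line.split("\t")
        if line.startswith("@"):
            # Header line: only touch an exact SN:<old> tag.
            fields = [f"SN:{new}" if f == f"SN:{old}" else f for f in fields]
        else:
            # Alignment line: only touch the name fields, never SEQ, QUAL, or tags.
            for i in NAME_FIELDS:
                if i < len(fields) and fields[i] == old:
                    fields[i] = new
        out.append("\t".join(fields))
    return "\n".join(out) + "\n"


# Illustrative SAM text only; not taken from the attached files.
sam = (
    "@HD\tVN:1.2\n"
    "@SQ\tSN:Query_1\tLN:1000000\n"
    "chr18\t0\tQuery_1\t1001\t255\t120M\t*\t0\t0\t*\t*\n"
)
fixed = rename_placeholder(sam, old="Query_1", new="Pp05")
print(fixed)
```

Restricting the replacement to the header tag and the name fields avoids accidentally rewriting sequence or quality strings, which a blanket text substitution could do.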

Here's what it looks like with outfmt 6 loaded as a synteny track and outfmt 17 loaded as an alignments track:

image

peach_vs_grape.blastn.tar.gz
