Gmap-based method to identify allelic contig table #16
Hi starsyi,
Note: 1) target.genome is the contig-level assembly of the polyploid genome.
Note: 1) reference.gff3 is the GFF3 annotation of the diploid genome. |
Hi, tangerzhang! |
Could you please share a couple of lines of your reference.gff3 file and gmap.gff3 file? |
I also got an empty table. Maybe it's better to give an option in |
Thanks for the suggestion. Since some GFF3 files are not compatible with the script, I suggest that users make a BED file for the reference annotation and then use the revised script (attached). The BED file should contain at least four columns: ChrID, Start_Position, End_Position, and GeneID.
|
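One possible way to build such a four-column BED file from a GFF3 annotation is sketched below in Python. It is only an illustration and not the attached script: the function name `gff3_to_bed`, the choice of the `Name` attribute (falling back to `ID`) as GeneID, and the tab-separated input are assumptions. The GeneID column presumably has to match the sequence names in the CDS FASTA that was aligned with GMAP, so pick a different attribute if your annotation requires it.

```python
def gff3_to_bed(gff3_lines, feature="gene", id_keys=("Name", "ID")):
    """Yield (chrom, start, end, gene_id) for each GFF3 record of type `feature`.

    Hypothetical helper: the attribute names in `id_keys` are an assumption
    and may need changing for a given annotation.
    """
    for line in gff3_lines:
        # Skip blank lines and GFF3 comment/directive lines.
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] != feature:
            continue
        # Column 9 holds semicolon-separated key=value attributes.
        attrs = {}
        for field in cols[8].split(";"):
            if "=" in field:
                key, value = field.strip().split("=", 1)
                attrs[key] = value
        gene_id = next((attrs[k] for k in id_keys if k in attrs), None)
        if gene_id is None:
            continue
        # Coordinates are copied unchanged from the GFF3 (1-based start).
        # Strict BED uses 0-based starts; subtract 1 if a downstream tool needs that.
        yield cols[0], cols[3], cols[4], gene_id


example = [
    "##gff-version 3\n",
    "Chr1\tsrc\tgene\t100\t200\t.\t+\t.\tID=g1;Name=G1\n",
    "Chr1\tsrc\tmRNA\t100\t200\t.\t+\t.\tID=g1.t1;Parent=g1\n",
    "Chr2\tsrc\tgene\t5\t50\t.\t-\t.\tID=g2\n",
]
for record in gff3_to_bed(example):
    print("\t".join(record))
```

To convert a real file, pass an open file handle for the GFF3 instead of `example` and write each joined record to the BED file.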
Hi tangerzhang, I generated my BED file with a Python script, but I got an empty table when I ran the script {perl gmap2AlleleTableBED.pl reference.gff3}. So I modified the gmap2AlleleTableBED.pl code by adding {$gene =~ s/;.*//g;} right after {my $gene = $1 if(/Name=(\S+)/); } and changing {print OUT "$data[0]$data[3];} to {print OUT "$data[0]$data[2];}. The output now looks like the allelic.ctg.table shown in the figure. Is it OK that there is a considerable discrepancy between the BLAST-based allelic.ctg.table and the GMAP-based allelic.ctg.table? |
Hi @Zachary-Wu |
gmap.gff3:
Generated by GMAP version 2020-10-14 using call: gmap.sse42 -D . -d DB -t 12 -f 2 -n 2 /home02/mqyin/reference/reference.CDS.fasta
tig00001992_np12 DB gene 14871 16501 . - . ID=Cg8g024190.2.1.path1;Name=Cg8g024190.2.1;Dir=sense
reference.gff3: |
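The gene line above carries a `Dir=sense` attribute after `Name=`. The small Python check below (illustrative only; the function names are made up here, and tab separators are assumed) suggests why the pattern `Name=(\S+)` quoted earlier in this thread may return an ID that never matches the reference gene ID, and how stopping at the first `;` avoids that.

```python
import re

# The gene line from the gmap.gff3 sample above, assuming tab-separated columns.
GMAP_LINE = (
    "tig00001992_np12\tDB\tgene\t14871\t16501\t.\t-\t.\t"
    "ID=Cg8g024190.2.1.path1;Name=Cg8g024190.2.1;Dir=sense"
)


def name_greedy(line):
    """Capture everything after Name= up to the next whitespace."""
    match = re.search(r"Name=(\S+)", line)
    return match.group(1) if match else None


def name_bounded(line):
    """Capture the Name value only, stopping at ';' or whitespace."""
    match = re.search(r"Name=([^;\s]+)", line)
    return match.group(1) if match else None


print(name_greedy(GMAP_LINE))   # Cg8g024190.2.1;Dir=sense
print(name_bounded(GMAP_LINE))  # Cg8g024190.2.1
```

With the greedy pattern, the captured key includes `;Dir=sense`, so a lookup by the plain gene ID would find nothing, which is one plausible cause of an empty table.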
Hi, I was not able to find the gmap_build script. Could you please point me to where it might be? |
This requires installing the GMAP program, which includes gmap_build and can be found here (http://research-pub.gene.com/gmap/). |
Dear @tangerzhang, I got the Allele.ctg.table file by running 'perl gmap2AlleleTableBED.pl ref.bed', but the file looks like this: |
@Ahahaha3, would you mind pasting some lines of your ref.bed file here? We will check why it happened. |
@sc-zhang, thanks for your reply. Here are some lines of my ref.bed. |
@Ahahaha3, sorry, I also need some lines of gmap.gff3 to check why it happened. |
gmap.gff3 is:
reference.cds.fasta is:
I got the Allele.ctg.table file by running 'perl gmap2AlleleTableBED.pl ref.bed', but the file looks like this: |
@lxingze, I cannot reproduce the result you pasted here, so I would like to know which version of Perl you are using so that I can test further. |
This is perl 5, version 26, subversion 2 (v5.26.2) |
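@lxingze, please check that you are using the latest version of gmap2AlleleTableBED.pl. If it still does not work, you can try the code below and check whether it works.
#!/usr/bin/perl
use strict;
use warnings;

die "Usage: perl $0 ref.bed\n" if(!defined($ARGV[0]));
my $refBED = $ARGV[0];

# For each reference gene, collect the contigs that GMAP aligned it to.
my %infordb;
open(my $gmap, '<', 'gmap.gff3') or die "Cannot open gmap.gff3: $!\n";
while(<$gmap>){
    chomp;
    next if(/^#/);
    my @data = split(/\s+/, $_);
    # Use only records whose feature type (column 3) is "gene".
    next unless(defined($data[2]) and $data[2] eq 'gene');
    # Stop at ';' so trailing attributes such as Dir=sense are not captured.
    next unless(/Name=([^;\s]+)/);
    $infordb{$1} .= $data[0]."\t";
}
close $gmap;

# Write one row per reference gene that has at least one aligned contig.
open(my $out, '>', 'Allele.ctg.table') or die "Cannot write Allele.ctg.table: $!\n";
open(my $bed, '<', $refBED) or die "Cannot open $refBED: $!\n";
while(<$bed>){
    chomp;
    my @data = split(/\s+/, $_);
    next if(@data < 4);
    my $gene = $data[3];
    $gene =~ s/;.*//g;
    next if(!exists($infordb{$gene}));
    # Remove duplicate contig names; sorting keeps the column order stable.
    my %tmpdb = map { $_ => 1 } split(/\s+/, $infordb{$gene});
    print $out join("\t", $data[0], $data[3], sort keys %tmpdb), "\t\n";
}
close $bed;
close $out;
|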
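Several reports in this thread end with an empty or unexpected table. Because the script joins the two files on the gene ID, one quick check is to count how many IDs they share. The Python sketch below is a hedged diagnostic, not part of ALLHiC: the function names are hypothetical, and the BED line in the example is invented to illustrate a transcript-suffix mismatch.

```python
import re


def gmap_gene_names(gff3_lines):
    """Return the set of Name= values on 'gene' records of a GMAP GFF3."""
    names = set()
    for line in gff3_lines:
        if line.startswith("#"):
            continue
        cols = line.split()
        if len(cols) >= 9 and cols[2] == "gene":
            match = re.search(r"Name=([^;\s]+)", line)
            if match:
                names.add(match.group(1))
    return names


def bed_gene_ids(bed_lines):
    """Return the set of gene IDs in column 4 of a BED file (text after ';' dropped)."""
    ids = set()
    for line in bed_lines:
        cols = line.split()
        if len(cols) >= 4:
            ids.add(cols[3].split(";", 1)[0])
    return ids


def id_overlap(gff3_lines, bed_lines, n_examples=5):
    """Summarise how many gene IDs the two files have in common."""
    gmap_ids = gmap_gene_names(gff3_lines)
    bed_ids = bed_gene_ids(bed_lines)
    return {
        "gmap": len(gmap_ids),
        "bed": len(bed_ids),
        "shared": len(gmap_ids & bed_ids),
        "gmap_only_examples": sorted(gmap_ids - bed_ids)[:n_examples],
        "bed_only_examples": sorted(bed_ids - gmap_ids)[:n_examples],
    }


# Gene line taken from the gmap.gff3 sample in this thread; the BED line is invented.
gff3_example = [
    "tig00001992_np12\tDB\tgene\t14871\t16501\t.\t-\t.\t"
    "ID=Cg8g024190.2.1.path1;Name=Cg8g024190.2.1;Dir=sense\n"
]
bed_example = ["Chr8\t1000\t2000\tCg8g024190\n"]
print(id_overlap(gff3_example, bed_example))
```

If `shared` is 0 while both files are non-empty, the examples in the two `*_only_examples` lists usually show what differs (for instance a transcript suffix such as `.2.1`), and the BED GeneID column can be regenerated to match.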
Hi @tangerzhang, when I tried to use gmap2AlleleTable.pl to generate the Allelic.contig.table, I encountered the error below: My gmap.gff3 contents look like this: And my reference.gff3 looks like this: I think my reference.gff3 is in the standard format. I also tried gmap2AlleleTableBED.pl and encountered the same error. Could you please tell me how to modify the Perl script or the file contents to get the Allelic.contig.table I want? |
@tangerzhang |
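Hello!
To identify alleles among contigs, your ALLHiC software needs an annotation file and a CDS file for the contigs. In other words, the contigs have to be annotated de novo to obtain these two files before alleles can be identified and the noise between chromosomes removed. Is that right?
From my point of view, annotating a large genome is very time-consuming and computationally expensive, so I am wondering whether I could identify alleles among contigs simply by aligning published CDS sequences to the contigs with blastn. Would that work?
Best wishes!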