Allelic analysis HiC-Pro G2 genome not mapping #652
Comments
Hi,
Thank you for your response. Here are the chromosome names in the masked genome and in the VCF file.
VCF file (https://drive.google.com/file/d/1-18ej_EoZpTL4KEvkiukEv-exqz6Mz3G/view?usp=sharing): ##fileformat=VCFv4.2
Thank you
Looking at your VCF file header, it seems that the VCF was built against the mm39 genome, while you used mm10 to build your N-masked genome.
and it makes sense with the stats, because all your reads are
It means that for a given SNP, HiC-Pro extracts the nucleotide at that position from the BAM file, but that nucleotide does not correspond to the expected allele, which matches the fact that you are not using the same genome version.
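One way to test for this kind of build mismatch is to compare the REF allele of each VCF record with the base found at the same position in the unmasked reference FASTA. The following is only a minimal sketch, not part of HiC-Pro: the function names `read_fasta` and `ref_allele_concordance` are made up for illustration, and the parser loads whole sequences into memory, so for a full mouse genome it is better run on a single chromosome or a subset of records.

```python
import io

def read_fasta(handle):
    """Parse a FASTA stream into {chromosome_name: sequence}."""
    genome, name, chunks = {}, None, []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                genome[name] = "".join(chunks)
            name = line[1:].split()[0]  # keep the name up to the first space
            chunks = []
        elif line:
            chunks.append(line)
    if name is not None:
        genome[name] = "".join(chunks)
    return genome

def ref_allele_concordance(vcf_handle, genome):
    """Count VCF records whose REF allele matches the FASTA at CHROM:POS."""
    counts = {"match": 0, "mismatch": 0, "missing_chrom": 0}
    for line in vcf_handle:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        chrom, pos, _id, ref = line.split("\t")[:4]
        seq = genome.get(chrom)
        if seq is None:
            counts["missing_chrom"] += 1
            continue
        start = int(pos) - 1  # VCF positions are 1-based
        observed = seq[start:start + len(ref)].upper()
        if observed == ref.upper():
            counts["match"] += 1
        else:
            counts["mismatch"] += 1
    return counts

# Toy data (hypothetical), standing in for the real FASTA and VCF files.
fasta = io.StringIO(">chr1 toy\nACGTACGT\n")
vcf = io.StringIO(
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\n"
    "chr1\t2\t.\tC\tT\n"
    "chr1\t5\t.\tG\tA\n"
    "1\t3\t.\tG\tC\n"
)
print(ref_allele_concordance(vcf, read_fasta(fasta)))
# -> {'match': 1, 'mismatch': 1, 'missing_chrom': 1}
```

If the VCF and the FASTA come from the same build, nearly all records should fall under `match`. A large `mismatch` count suggests different genome versions, and a large `missing_chrom` count suggests different chromosome naming.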
Thanks. I think one mistake I made is that I was working with mm10 when I first raised the issue, and later switched to mm39 based on the issue. I downloaded the latest VCF from the Mouse Genomes Project and the reference genome of the same version from Ensembl (Mus_musculus.GRCm39.dna.toplevel.fa, version 104). Changing the genome version and the VCF did not resolve my issue. Allelic stats from HiC-Pro (subset_masked_GRCm39.bwt2pairs_allspe.allelstat):
/usr/local/bin//HiC-Pro_3.1.0/scripts/markAllelicStatus.py
Dear
I am trying to run an allele-specific analysis of Hi-C data and keep getting the same error: zero reads are assigned to the G2 genome, and the ICE normalisation step fails. I used the commands below to run the analysis. The non-allelic analysis runs to completion without problems. I tried several publicly available datasets with different strain crosses, such as 129S vs CAST and B6 vs PWK, and got the same error. I cannot find where it is going wrong, so please help me.
The bowtie2 allelic stat file looks like this:
HiC-Pro_3.1.0/scripts/markAllelicStatus.py
bam=bowtie_results/bwt2/Sample1/subset_masked_mm10.bwt2pairs.bam
snpFile=mm10_snps_C57b6_PWK_PhJ.vcf
tag=XA
output=bowtie_results/bwt2/Sample1/subset_masked_mm10.bwt2pairs_allspe.bam
verbose=True
Total number of snps loaded 17046154.0
Total number of reads 353850 100
Number of reads with at least one 'N' 368870 104.245
Number of reads assigned to ref genome 91967 25.99
Number of reads assigned to alt genome 0 0.0
Number of conflicting reads 0 0.0
Number of unassigned reads 261883 74.01
Code used to generate the VCF and the masked genome:
`extract_snps.py -i mgp.v5.merged.snps_all.dbSNP142.vcf -a PWK_PhJ > mm10_snps_C57b6_PWK_PhJ.vcf
bedtools maskfasta -fi mm10.fa -bed mm10_snps_C57b6_PWK_PhJ.vcf -fo masked_mm10.fa
bowtie2-build masked_mm10.fa masked_mm10 --threads 6
HiC-Pro -c config_test_as.txt -i 01_Fastq/ -o 03_output2`
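Before building the index, it can help to confirm that the masking step put an `N` at every SNP position, and that the VCF and the FASTA use the same chromosome names (for example `1` in Ensembl files versus `chr1` in UCSC files). The sketch below is an illustration under stated assumptions rather than a HiC-Pro tool: `read_fasta` and `check_masking` are hypothetical helper names, and sequences are held in memory, so a subset of the genome is more practical than the full file.

```python
import io

def read_fasta(handle):
    """Parse a FASTA stream into {chromosome_name: sequence}."""
    genome, name, chunks = {}, None, []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                genome[name] = "".join(chunks)
            name = line[1:].split()[0]  # keep the name up to the first space
            chunks = []
        elif line:
            chunks.append(line)
    if name is not None:
        genome[name] = "".join(chunks)
    return genome

def check_masking(vcf_handle, masked_genome):
    """Classify each VCF position by what the masked genome holds there."""
    counts = {"masked": 0, "unmasked": 0, "missing_chrom": 0, "out_of_range": 0}
    for line in vcf_handle:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        chrom, pos = line.split("\t")[:2]
        seq = masked_genome.get(chrom)
        idx = int(pos) - 1  # VCF positions are 1-based
        if seq is None:
            counts["missing_chrom"] += 1  # name absent from the FASTA
        elif idx >= len(seq):
            counts["out_of_range"] += 1   # position beyond chromosome end
        elif seq[idx] in "Nn":
            counts["masked"] += 1
        else:
            counts["unmasked"] += 1
    return counts

# Toy data (hypothetical), standing in for the masked FASTA and SNP VCF.
masked = io.StringIO(">chr1\nACNTACGT\n>chr2\nNNNN\n")
vcf = io.StringIO(
    "#CHROM\tPOS\tID\tREF\tALT\n"
    "chr1\t3\t.\tG\tA\n"
    "chr1\t5\t.\tA\tC\n"
    "chrX\t1\t.\tT\tG\n"
    "chr2\t9\t.\tC\tT\n"
)
print(check_masking(vcf, read_fasta(masked)))
# -> {'masked': 1, 'unmasked': 1, 'missing_chrom': 1, 'out_of_range': 1}
```

For a correctly masked genome every record should be counted as `masked`. Any other category points to a problem upstream of the alignment, such as mismatched chromosome names or a VCF from a different build.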
Please let me know where I am going wrong.
Thank you