Should the individual mask bed be a subset of Mappability mask? #40

Open
sandyplus opened this issue May 18, 2021 · 1 comment

sandyplus commented May 18, 2021

Dear @stschiff,

Should the individual mask BED file be a subset of the mappability mask file?
If I understand correctly, the mappability mask file gives the valid range of genome sites, which includes the mask ranges of all other samples.
However, in your tutorial I found that this is not the case.
Many sites in the VCF mask BED file are not included in the mappability mask file. Why?
Did I misunderstand anything?

Best regards,
Sandy

PS: the mask ranges:

## NA19238.chr1.mask.bed.gz
chr1    11093   11101
chr1    11137   11154
chr1    11203   11235
chr1    11276   11288
chr1    11319   11371
chr1    11378   11387
chr1    11437   11453
chr1    11481   11504
chr1    11511   11527
chr1    11568   11637
chr1    11677   11699
chr1    11736   11760
chr1    11806   11840
chr1    11871   11890
chr1    11910   11916
chr1    11926   11943
chr1    11973   11997
chr1    12004   12022
chr1    12052   12093
chr1    12124   12134
chr1    12173   12186
chr1    12204   12237
chr1    12244   12262
chr1    12348   12375
chr1    12386   12409
chr1    12426   12518
chr1    12557   12587
chr1    12612   12668
chr1    12696   12733
chr1    12758   12773
chr1    12811   12850
chr1    12858   12920
chr1    12951   12978
chr1    12988   13110
chr1    13124   13174
chr1    13217   13279
chr1    13331   13357
chr1    13364   13384
chr1    13427   13569
chr1    13576   13609
chr1    13639   13652
chr1    13688   13746
chr1    13791   13809
chr1    13841   13866
chr1    13915   13953
chr1    13985   14111
chr1    14119   14206
chr1    14213   14244
chr1    14273   14350
chr1    14381   14432
chr1    14461   14518
chr1    14557   14570
chr1    14613   14622
chr1    14629   14649
chr1    14680   14695
chr1    14702   14745
chr1    14755   14780
chr1    14818   14850
chr1    14893   14902
chr1    14933   14996
chr1    15018   15041
chr1    15056   15090
chr1    15121   15175
chr1    15214   15224
chr1    15259   15269
chr1    15286   15421
chr1    15449   15515
## mappability: hs37d5_chr1.mask.bed
1       10514   10549
1       17476   17506
1       52177   52202
1       54695   54720
1       55304   55336
1       55376   55398

stschiff (Owner) commented Jul 6, 2021

No, they're independent. The individual mask gives the regions in which the individual has enough data. The mappability mask gives the regions of the genome that are generally mappable. The code will then create the intersection of the two.
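
To make the idea of intersecting two masks concrete, here is a minimal Python sketch. It is not the code the tool actually runs. The function name `intersect_masks` and the example coordinates are hypothetical, and it assumes that each mask is a sorted list of non-overlapping, half-open `(start, end)` intervals on the same chromosome.

```python
def intersect_masks(mask_a, mask_b):
    """Return the regions covered by BOTH masks.

    Assumption: each mask is a sorted list of non-overlapping,
    half-open (start, end) intervals on one chromosome.
    """
    result = []
    i = j = 0
    while i < len(mask_a) and j < len(mask_b):
        # The overlap of the two current intervals, if there is one.
        start = max(mask_a[i][0], mask_b[j][0])
        end = min(mask_a[i][1], mask_b[j][1])
        if start < end:
            result.append((start, end))
        # Advance the interval that ends first; the other one
        # may still overlap the next interval of the opposite mask.
        if mask_a[i][1] < mask_b[j][1]:
            i += 1
        else:
            j += 1
    return result


# Hypothetical coordinates, not taken from the files in this issue.
individual_mask = [(100, 200), (300, 400)]
mappability_mask = [(150, 350)]
print(intersect_masks(individual_mask, mappability_mask))
```

Neither input has to be a subset of the other. Only positions present in both masks are kept, so regions that appear in just one of the two files simply drop out of the result.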
