diff --git a/dev/.documenter-siteinfo.json b/dev/.documenter-siteinfo.json
index d6cdb42..1ba3a72 100644
--- a/dev/.documenter-siteinfo.json
+++ b/dev/.documenter-siteinfo.json
@@ -1 +1 @@
-{"documenter":{"julia_version":"1.10.5","generation_timestamp":"2024-09-01T02:30:32","documenter_version":"1.6.0"}}
\ No newline at end of file
+{"documenter":{"julia_version":"1.10.5","generation_timestamp":"2024-09-02T02:25:22","documenter_version":"1.6.0"}}
\ No newline at end of file
diff --git a/dev/PGEN_description/index.html b/dev/PGEN_description/index.html
index be060b5..14f3783 100644
--- a/dev/PGEN_description/index.html
+++ b/dev/PGEN_description/index.html
@@ -2,4 +2,4 @@
 <html lang="en"><head><meta charset="UTF-8"/><meta name="viewport" content="width=device-width, initial-scale=1.0"/><title>PGEN format description · PGENFiles.jl</title><meta name="title" content="PGEN format description · PGENFiles.jl"/><meta property="og:title" content="PGEN format description · PGENFiles.jl"/><meta property="twitter:title" content="PGEN format description · PGENFiles.jl"/><meta name="description" content="Documentation for PGENFiles.jl."/><meta property="og:description" content="Documentation for PGENFiles.jl."/><meta property="twitter:description" content="Documentation for PGENFiles.jl."/><script data-outdated-warner src="../assets/warner.js"></script><link href="https://cdnjs.cloudflare.com/ajax/libs/lato-font/3.0.0/css/lato-font.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/juliamono/0.050/juliamono.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/6.4.2/css/fontawesome.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/6.4.2/css/solid.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/6.4.2/css/brands.min.css" rel="stylesheet" type="text/css"/><link href="https://cdnjs.cloudflare.com/ajax/libs/KaTeX/0.16.8/katex.min.css" rel="stylesheet" type="text/css"/><script>documenterBaseURL=".."</script><script src="https://cdnjs.cloudflare.com/ajax/libs/require.js/2.3.6/require.min.js" data-main="../assets/documenter.js"></script><script src="../search_index.js"></script><script src="../siteinfo.js"></script><script src="../../versions.js"></script><link class="docs-theme-link" rel="stylesheet" type="text/css" href="../assets/themes/catppuccin-mocha.css" data-theme-name="catppuccin-mocha"/><link class="docs-theme-link" rel="stylesheet" type="text/css" href="../assets/themes/catppuccin-macchiato.css" 
data-theme-name="catppuccin-macchiato"/><link class="docs-theme-link" rel="stylesheet" type="text/css" href="../assets/themes/catppuccin-frappe.css" data-theme-name="catppuccin-frappe"/><link class="docs-theme-link" rel="stylesheet" type="text/css" href="../assets/themes/catppuccin-latte.css" data-theme-name="catppuccin-latte"/><link class="docs-theme-link" rel="stylesheet" type="text/css" href="../assets/themes/documenter-dark.css" data-theme-name="documenter-dark" data-theme-primary-dark/><link class="docs-theme-link" rel="stylesheet" type="text/css" href="../assets/themes/documenter-light.css" data-theme-name="documenter-light" data-theme-primary/><script src="../assets/themeswap.js"></script></head><body><div id="documenter"><nav class="docs-sidebar"><div class="docs-package-name"><span class="docs-autofit"><a href="../">PGENFiles.jl</a></span></div><button class="docs-search-query input is-rounded is-small is-clickable my-2 mx-auto py-1 px-2" id="documenter-search-query">Search docs (Ctrl + /)</button><ul class="docs-menu"><li><a class="tocitem" href="../">PGENFiles.jl Tutorial</a></li><li class="is-active"><a class="tocitem" href>PGEN format description</a><ul class="internal"><li><a class="tocitem" href="#Introduction"><span>Introduction</span></a></li><li><a class="tocitem" href="#PGEN-format"><span>PGEN format</span></a></li></ul></li></ul><div class="docs-version-selector field has-addons"><div class="control"><span class="docs-label button is-static is-size-7">Version</span></div><div class="docs-selector control is-expanded"><div class="select is-fullwidth is-size-7"><select id="documenter-version-selector"></select></div></div></div></nav><div class="docs-main"><header class="docs-navbar"><a class="docs-sidebar-button docs-navbar-link fa-solid fa-bars is-hidden-desktop" id="documenter-sidebar-button" href="#"></a><nav class="breadcrumb"><ul class="is-hidden-mobile"><li class="is-active"><a href>PGEN format description</a></li></ul><ul 
class="is-hidden-tablet"><li class="is-active"><a href>PGEN format description</a></li></ul></nav><div class="docs-right"><a class="docs-navbar-link" href="https://github.com/OpenMendel/PGENFiles.jl" title="View the repository on GitHub"><span class="docs-icon fa-brands"></span><span class="docs-label is-hidden-touch">GitHub</span></a><a class="docs-navbar-link" href="https://github.com/OpenMendel/PGENFiles.jl/blob/main/docs/src/PGEN_description.md" title="Edit source on GitHub"><span class="docs-icon fa-solid"></span></a><a class="docs-settings-button docs-navbar-link fa-solid fa-gear" id="documenter-settings-button" href="#" title="Settings"></a><a class="docs-article-toggle-button fa-solid fa-chevron-up" id="documenter-article-toggle-button" href="javascript:;" title="Collapse all docstrings"></a></div></header><article class="content" id="documenter-page"><h1 id="The-PGEN-format"><a class="docs-heading-anchor" href="#The-PGEN-format">The PGEN format</a><a id="The-PGEN-format-1"></a><a class="docs-heading-anchor-permalink" href="#The-PGEN-format" title="Permalink"></a></h1><p>Content on this page is based on the <a href="https://github.com/chrchang/plink-ng/raw/master/pgen_spec/pgen_spec.pdf">draft specification</a>, distributed under GPLv3. </p><h2 id="Introduction"><a class="docs-heading-anchor" href="#Introduction">Introduction</a><a id="Introduction-1"></a><a class="docs-heading-anchor-permalink" href="#Introduction" title="Permalink"></a></h2><p>The PGEN format is the central file format for genomic data in PLINK 2. </p><ul><li>PLINK 1’s binary genotype file format (the BED format, can be read using <a href="https://github.com/OpenMendel/SnpArrays.jl">SnpArrays.jl</a>)<ul><li>Simple, compact, and supports direct computation on the packed data representation. 
Thanks to these properties, it continues to be widely used more than a decade after it was designed.</li><li>Limitation: can only represent unphased biallelic hard-called genotypes.<ul><li>suboptimal for GWASes, which tend to benefit from inclusion of imputed dosages and more sophisticated handling of multiallelic variants</li><li>cannot represent phase information for workflows like investigation of compound heterozygosity, imputation-related data management, etc.</li></ul></li></ul></li></ul><p>The widely used binary genotype formats that address the limitations above include the BCF and BGEN formats, but they do not support direct computation on packed data, which makes it impossible for them to match the efficiency of PLINK 1.9. </p><p>Hence, PLINK 2 introduced a new binary genotype file format, the PGEN format. </p><ul><li>Backward-compatible with the BED format</li><li>Can represent phased, multiallelic, and dosage data in a manner that better supports <a href="https://www.nature.com/articles/nbt.2241">&quot;compressive genomics&quot;</a></li><li>Incorporates &quot;<a href="https://sysbiobig.dei.unipd.it/software/?q=software#SNPack">SNPack</a>-style&quot; genotype compression, reducing file sizes by 80+% with negligible encoding and decoding cost (and supporting some direct computation on the compressed representation)</li></ul><p>The format is not as simple as the PLINK 1 format, but PLINK 2 includes an open-source internal library (pgenlib) for reading and writing it. </p><p>Also introduced are: </p><ul><li>PSAM format, an extension of <code>.fam</code> format<ul><li>Stores categorical and other phenotypes/covariates</li></ul></li><li>PVAR format, an extension of <code>.bim</code> format.<ul><li>Stores all header and variant-specific information. 
</li><li>Designed so that &quot;sites-only VCF&quot; files are directly valid PVAR files.</li></ul></li></ul><h2 id="PGEN-format"><a class="docs-heading-anchor" href="#PGEN-format">PGEN format</a><a id="PGEN-format-1"></a><a class="docs-heading-anchor-permalink" href="#PGEN-format" title="Permalink"></a></h2><p>Binary format capable of representing mixed-phase, multiallelic, mixed-hardcall/dosage/missing genotype data.</p><ul><li>A PLINK 1 variant-major .bed file is grandfathered in as a valid PGEN file. Simple to handle with the PGEN format definition.</li><li>PGEN(+PVAR) is designed to interoperate with, not replace, VCF/BCF. <ul><li>PGEN cannot represent: read depths, quality scores, or biallelic genotype probability triplets, or triploid genotypes. </li><li>It specializes on the subset of the VCF format which is relevant to PLINK’s function. </li><li>Fast VCF ↔ PGEN conversion in PLINK 2.</li></ul></li></ul><h3 id="File-organization"><a class="docs-heading-anchor" href="#File-organization">File organization</a><a id="File-organization-1"></a><a class="docs-heading-anchor-permalink" href="#File-organization" title="Permalink"></a></h3><ul><li>Header: information to enable random access to the variant record: e.g., record types and record length of each variant.<ul><li>Here, record type means how the genotype is compressed, if it contains phase and dosage information, etc. </li></ul></li><li>A sequence of variant records.</li></ul><p>A variant record’s main data track can be “LD-compressed” (LD = linkage disequilibrium):</p><ul><li>Most recent non-LD-compressed variant record and only storing genotype-category differences from it. 
<ul><li>The only type of inter-record dependency, </li></ul></li><li>Record type and size information in the header, and the genotypes from the latest non-LD-compressed variant is enough to decode genotypes of each variant sequentially.</li></ul><p>&lt;!– Three fixed-width storage modes are defined (covering basic unphased biallelic genotypes, unphased dosages, and phased dosages) which don’t have this limitation, and are especially straightforward to read and write; but they don’t benefit from PGEN’s low-overhead genotype compression. A future version of this specification may add a way to store most header information in a separate file, so that sequential reading, sequential writing, and genotype compression are simultaneously possible (at the cost of more annoying file management).–&gt;</p><h3 id="Header"><a class="docs-heading-anchor" href="#Header">Header</a><a id="Header-1"></a><a class="docs-heading-anchor-permalink" href="#Header" title="Permalink"></a></h3><ul><li>Magic number: <code>0x6c 0x1b</code>. </li><li>Storage mode<ul><li><code>0x01</code>: PLINK 1 BED format. Supported in <code>SnpArrays.jl</code>.</li><li><code>0x02</code>: the simplest PLINK 2 fixed-width format for unphased genotypes. Difference from <code>0x01</code> are header and genotype encoding rule.</li><li><code>0x03</code>: fixed-width unphased dosage</li><li><code>0x04</code>: fixed-width phased dosage</li><li><strong><code>0x10</code></strong>: standard variable-width format. Vast majority of the PLINK 2 files will be in this mode. <strong>Currently, only this mode is supported in <code>PGEN.jl</code></strong>. </li></ul></li><li>Dataset dimensions, header body formatting<ul><li>number of variants, samples, bits per record type, bytes per allele counts (for multiallelic variants), if reference alleles are provisional, etc.</li></ul></li><li>Variant block offsets<ul><li>Where each block of <span>$2^{16}$</span> = 65,536 variant records begin. i.e. 
starting point of variants 1, 65537, 131073, ... </li></ul></li><li>Main header body<ul><li>Packed array of <span>$2^{16}$</span> record types, lengths, etc. </li></ul></li></ul><p>For example, random access to the 65540th (65536 + 4) variant can be achieved by looking up the 2nd entry of the variant block offsets and then scanning the first four entries for that block in the main header body. The starting point of the variant is the start of the second variant block plus the lengths of the first three variant records in that block. </p><h3 id="Variant-record"><a class="docs-heading-anchor" href="#Variant-record">Variant record</a><a id="Variant-record-1"></a><a class="docs-heading-anchor-permalink" href="#Variant-record" title="Permalink"></a></h3><p>Each variant record starts with the main track for unphased biallelic hard-call genotypes, followed by up to ten optional tracks:</p><ol><li>Multiallelic hard-calls</li><li>Hardcall-phase information</li><li>Biallelic dosage existence</li><li>Biallelic dosage values</li><li>Multiallelic dosage existence</li><li>Multiallelic dosage values</li><li>Biallelic phased-dosage existence</li><li>Biallelic phased-dosage values</li><li>Multiallelic phased-dosage existence</li><li>Multiallelic phased-dosage values</li></ol><h3 id="Difflists"><a class="docs-heading-anchor" href="#Difflists">Difflists</a><a id="Difflists-1"></a><a class="docs-heading-anchor-permalink" href="#Difflists" title="Permalink"></a></h3><p>Many genotypes and dosages are compressed in a <strong>difflist</strong>, which is designed to represent a sparse list of differences from some baseline. It does so in a manner that is compact and supports fast checking of whether a specific sample ID is in the list. In this package, a difflist is represented by the struct <code>DiffList</code>. 
</p><h3 id="Main-track"><a class="docs-heading-anchor" href="#Main-track">Main track</a><a id="Main-track-1"></a><a class="docs-heading-anchor-permalink" href="#Main-track" title="Permalink"></a></h3><p>Each genotype is stored in two bits, with samples packed in little-endian order within each byte (the first sample occupies the two lowest-order bits). For example, the two bytes <code>0x1b 0xd8</code> encode 8 samples as follows:</p><pre><code class="nohighlight hljs">byte 1         byte 2
 0x1b           0xd8
 00 01 10 11    11 01 10 00
-s4 s3 s2 s1    s8 s7 s6 s5</code></pre><table><tr><th style="text-align: center">Sample index (1-based)</th><th style="text-align: center">genotype category</th></tr><tr><td style="text-align: center">1</td><td style="text-align: center"><code>0b11</code></td></tr><tr><td style="text-align: center">2</td><td style="text-align: center"><code>0b10</code></td></tr><tr><td style="text-align: center">3</td><td style="text-align: center"><code>0b01</code></td></tr><tr><td style="text-align: center">4</td><td style="text-align: center"><code>0b00</code></td></tr><tr><td style="text-align: center">5</td><td style="text-align: center"><code>0b00</code></td></tr><tr><td style="text-align: center">6</td><td style="text-align: center"><code>0b10</code></td></tr><tr><td style="text-align: center">7</td><td style="text-align: center"><code>0b01</code></td></tr><tr><td style="text-align: center">8</td><td style="text-align: center"><code>0b11</code></td></tr></table><table><tr><th style="text-align: center">genotype category</th><th style="text-align: center">PLINK 1</th><th style="text-align: center">PLINK 2</th></tr><tr><td style="text-align: center">0 = <code>0b00</code> = <code>0x00</code></td><td style="text-align: center">homozygous A1</td><td style="text-align: center">homozygous REF</td></tr><tr><td style="text-align: center">1 = <code>0b01</code> = <code>0x01</code></td><td style="text-align: center">missing</td><td style="text-align: center">heterozygous REF-ALT</td></tr><tr><td style="text-align: center">2 = <code>0b10</code> = <code>0x02</code></td><td style="text-align: center">heterozygouus A1-A2</td><td style="text-align: center">homozygous ALT</td></tr><tr><td style="text-align: center">3 = <code>0b11</code> = <code>0x03</code></td><td style="text-align: center">homozygous A2</td><td style="text-align: center">missing</td></tr></table><ul><li>A1: First allele listed in PLINK 1 bim file</li><li>A2: Second allele listed in PLINK 1 bim file</li><li>REF: Reference 
allele</li><li>ALT: Alternate allele</li></ul><p>In PLINK 1, A1 was often ALT, and A2 was often REF. However, this was not set in stone. In UK Biobank data, A1 is REF and A2 is ALT. </p><p>Seven record types are supported, represented by the bottom three bits of record type:</p><ul><li><code>0</code>: no compression.</li><li><code>1</code>: “1-bit” representation. This starts with a byte indicating what the two most common categories are (value 1: categories 0 and 1; 2: 0 and 2; 3: 0 and 3; 5: 1 and 2; 6: 1 and 3; 9: 2 and 3); followed by a bitarray describing which samples are in the higher-numbered category; followed by a difflist with all (sample ID, genotype category value) pairs for the two less common categories.</li><li><code>2</code>: LD-compressed. A difflist with all (sample ID, genotype category value) pairs for samples in different categories than they were in in the previous non-LD-compressed variant. The first variant of a variant block (i.e. its index is congruent to 0 mod <span>$2^{16}$</span>) cannot be LD-compressed.</li><li><code>3</code>: LD-compressed, inverted. A difflist with all (sample ID, inverted genotype value) pairs for samples in different categories than they would be in the previous non-LD-compressed variant after inversion (categories 0 and 2 swapped). I.e. decoding can be handled in the same way as for variant record type 2, except for a final inversion step applied after the difflist contents have been patched in. This addresses spots where the reference genome is “wrong” for the population of interest.</li><li><code>4</code>: Difflist with all (sample ID, genotype category value) pairs for samples outside category 0.</li><li>~~<code>5</code>: Reserved for future use. 
(When all samples appear to be in category 1, that usually implies a systematic variant calling problem.)~~</li><li><code>6</code>: Difflist with all (sample ID, genotype category value) pairs for samples outside category 2.</li><li><code>7</code>: Difflist with all (sample ID, genotype category value) pairs for samples outside category 3</li></ul><h3 id="Multiallelic-hardcalls"><a class="docs-heading-anchor" href="#Multiallelic-hardcalls">Multiallelic hardcalls</a><a id="Multiallelic-hardcalls-1"></a><a class="docs-heading-anchor-permalink" href="#Multiallelic-hardcalls" title="Permalink"></a></h3><p>Exists if the 4th bit of variant record type is set. Based on the main track, it defines a &quot;patch set&quot; in the form of difflist which sample has alternate allele other than &quot;ALT1&quot;. </p><h3 id="Phased-heterozygous-hard-calls"><a class="docs-heading-anchor" href="#Phased-heterozygous-hard-calls">Phased heterozygous hard-calls</a><a id="Phased-heterozygous-hard-calls-1"></a><a class="docs-heading-anchor-permalink" href="#Phased-heterozygous-hard-calls" title="Permalink"></a></h3><p>Exists if the 5th bit of variant record type is set. Stores whether each heterozygous call is phased, and if phased, what the phase is. &quot;<code>0|1</code>&quot; or &quot;<code>1|0</code>&quot;. PGEN does not distinguish &quot;<code>0|0</code>&quot; from &quot;<code>0/0</code>&quot;, and &quot;<code>1|1</code>&quot; from &quot;<code>1/1</code>&quot;. </p><h3 id="Dosages"><a class="docs-heading-anchor" href="#Dosages">Dosages</a><a id="Dosages-1"></a><a class="docs-heading-anchor-permalink" href="#Dosages" title="Permalink"></a></h3><p>Dosages are stored in 16-bit integers (<code>UInt16</code>). <code>0x0000</code>...<code>0x8000</code>(<span>$2^{15}$</span>) represent diploid ALT allele dosage values between <code>0.0</code>..<code>2.0</code>. <code>0xffff</code> represents missing value. Three record types are supported, based on 6th and 7th bits of record type. 
Dosages are required to be consistent with hard-calls (should be close enough from genotype).</p><ul><li>6th bit is set and 7th bit is clear: Track 3 (Biallelic dosage existence) is a difflist indicating which samples have dosage information. </li><li>6th bit is clear and 7th bit is set: Track 3 is omitted and Track 4 (Biallelic dosage values) has an entry for every single sample.</li><li>6th bit and 7th bit are both set: Track 3 is a BitArray indicating dosage for which sample exists. </li></ul><p>Samples without dosage values are assumed to have dosage level identical to their respective genotypes.</p></article><nav class="docs-footer"><a class="docs-footer-prevpage" href="../">« PGENFiles.jl Tutorial</a><div class="flexbox-break"></div><p class="footer-message">Powered by <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> and the <a href="https://julialang.org/">Julia Programming Language</a>.</p></nav></div><div class="modal" id="documenter-settings"><div class="modal-background"></div><div class="modal-card"><header class="modal-card-head"><p class="modal-card-title">Settings</p><button class="delete"></button></header><section class="modal-card-body"><p><label class="label">Theme</label><div class="select"><select id="documenter-themepicker"><option value="auto">Automatic (OS)</option><option value="documenter-light">documenter-light</option><option value="documenter-dark">documenter-dark</option><option value="catppuccin-latte">catppuccin-latte</option><option value="catppuccin-frappe">catppuccin-frappe</option><option value="catppuccin-macchiato">catppuccin-macchiato</option><option value="catppuccin-mocha">catppuccin-mocha</option></select></div></p><hr/><p>This document was generated with <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> version 1.6.0 on <span class="colophon-date" title="Sunday 1 September 2024 02:30">Sunday 1 September 2024</span>. 
Using Julia version 1.10.5.</p></section><footer class="modal-card-foot"></footer></div></div></div></body></html>
+s4 s3 s2 s1    s8 s7 s6 s5</code></pre><table><tr><th style="text-align: center">Sample index (1-based)</th><th style="text-align: center">genotype category</th></tr><tr><td style="text-align: center">1</td><td style="text-align: center"><code>0b11</code></td></tr><tr><td style="text-align: center">2</td><td style="text-align: center"><code>0b10</code></td></tr><tr><td style="text-align: center">3</td><td style="text-align: center"><code>0b01</code></td></tr><tr><td style="text-align: center">4</td><td style="text-align: center"><code>0b00</code></td></tr><tr><td style="text-align: center">5</td><td style="text-align: center"><code>0b00</code></td></tr><tr><td style="text-align: center">6</td><td style="text-align: center"><code>0b10</code></td></tr><tr><td style="text-align: center">7</td><td style="text-align: center"><code>0b01</code></td></tr><tr><td style="text-align: center">8</td><td style="text-align: center"><code>0b11</code></td></tr></table><table><tr><th style="text-align: center">genotype category</th><th style="text-align: center">PLINK 1</th><th style="text-align: center">PLINK 2</th></tr><tr><td style="text-align: center">0 = <code>0b00</code> = <code>0x00</code></td><td style="text-align: center">homozygous A1</td><td style="text-align: center">homozygous REF</td></tr><tr><td style="text-align: center">1 = <code>0b01</code> = <code>0x01</code></td><td style="text-align: center">missing</td><td style="text-align: center">heterozygous REF-ALT</td></tr><tr><td style="text-align: center">2 = <code>0b10</code> = <code>0x02</code></td><td style="text-align: center">heterozygous A1-A2</td><td style="text-align: center">homozygous ALT</td></tr><tr><td style="text-align: center">3 = <code>0b11</code> = <code>0x03</code></td><td style="text-align: center">homozygous A2</td><td style="text-align: center">missing</td></tr></table><ul><li>A1: First allele listed in PLINK 1 bim file</li><li>A2: Second allele listed in PLINK 1 bim file</li><li>REF: Reference 
allele</li><li>ALT: Alternate allele</li></ul><p>In PLINK 1, A1 was often ALT, and A2 was often REF. However, this was not set in stone. In UK Biobank data, A1 is REF and A2 is ALT. </p><p>Seven record types are supported, represented by the bottom three bits of record type:</p><ul><li><code>0</code>: no compression.</li><li><code>1</code>: “1-bit” representation. This starts with a byte indicating what the two most common categories are (value 1: categories 0 and 1; 2: 0 and 2; 3: 0 and 3; 5: 1 and 2; 6: 1 and 3; 9: 2 and 3); followed by a bitarray describing which samples are in the higher-numbered category; followed by a difflist with all (sample ID, genotype category value) pairs for the two less common categories.</li><li><code>2</code>: LD-compressed. A difflist with all (sample ID, genotype category value) pairs for samples in different categories than they were in in the previous non-LD-compressed variant. The first variant of a variant block (i.e. its index is congruent to 0 mod <span>$2^{16}$</span>) cannot be LD-compressed.</li><li><code>3</code>: LD-compressed, inverted. A difflist with all (sample ID, inverted genotype value) pairs for samples in different categories than they would be in the previous non-LD-compressed variant after inversion (categories 0 and 2 swapped). I.e. decoding can be handled in the same way as for variant record type 2, except for a final inversion step applied after the difflist contents have been patched in. This addresses spots where the reference genome is “wrong” for the population of interest.</li><li><code>4</code>: Difflist with all (sample ID, genotype category value) pairs for samples outside category 0.</li><li>~~<code>5</code>: Reserved for future use. 
(When all samples appear to be in category 1, that usually implies a systematic variant calling problem.)~~</li><li><code>6</code>: Difflist with all (sample ID, genotype category value) pairs for samples outside category 2.</li><li><code>7</code>: Difflist with all (sample ID, genotype category value) pairs for samples outside category 3</li></ul><h3 id="Multiallelic-hardcalls"><a class="docs-heading-anchor" href="#Multiallelic-hardcalls">Multiallelic hardcalls</a><a id="Multiallelic-hardcalls-1"></a><a class="docs-heading-anchor-permalink" href="#Multiallelic-hardcalls" title="Permalink"></a></h3><p>Exists if the 4th bit of variant record type is set. Based on the main track, it defines a &quot;patch set&quot; in the form of difflist which sample has alternate allele other than &quot;ALT1&quot;. </p><h3 id="Phased-heterozygous-hard-calls"><a class="docs-heading-anchor" href="#Phased-heterozygous-hard-calls">Phased heterozygous hard-calls</a><a id="Phased-heterozygous-hard-calls-1"></a><a class="docs-heading-anchor-permalink" href="#Phased-heterozygous-hard-calls" title="Permalink"></a></h3><p>Exists if the 5th bit of variant record type is set. Stores whether each heterozygous call is phased, and if phased, what the phase is. &quot;<code>0|1</code>&quot; or &quot;<code>1|0</code>&quot;. PGEN does not distinguish &quot;<code>0|0</code>&quot; from &quot;<code>0/0</code>&quot;, and &quot;<code>1|1</code>&quot; from &quot;<code>1/1</code>&quot;. </p><h3 id="Dosages"><a class="docs-heading-anchor" href="#Dosages">Dosages</a><a id="Dosages-1"></a><a class="docs-heading-anchor-permalink" href="#Dosages" title="Permalink"></a></h3><p>Dosages are stored in 16-bit integers (<code>UInt16</code>). <code>0x0000</code>...<code>0x8000</code>(<span>$2^{15}$</span>) represent diploid ALT allele dosage values between <code>0.0</code>..<code>2.0</code>. <code>0xffff</code> represents missing value. Three record types are supported, based on 6th and 7th bits of record type. 
Dosages are required to be consistent with hard-calls (should be close enough from genotype).</p><ul><li>6th bit is set and 7th bit is clear: Track 3 (Biallelic dosage existence) is a difflist indicating which samples have dosage information. </li><li>6th bit is clear and 7th bit is set: Track 3 is omitted and Track 4 (Biallelic dosage values) has an entry for every single sample.</li><li>6th bit and 7th bit are both set: Track 3 is a BitArray indicating dosage for which sample exists. </li></ul><p>Samples without dosage values are assumed to have dosage level identical to their respective genotypes.</p></article><nav class="docs-footer"><a class="docs-footer-prevpage" href="../">« PGENFiles.jl Tutorial</a><div class="flexbox-break"></div><p class="footer-message">Powered by <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> and the <a href="https://julialang.org/">Julia Programming Language</a>.</p></nav></div><div class="modal" id="documenter-settings"><div class="modal-background"></div><div class="modal-card"><header class="modal-card-head"><p class="modal-card-title">Settings</p><button class="delete"></button></header><section class="modal-card-body"><p><label class="label">Theme</label><div class="select"><select id="documenter-themepicker"><option value="auto">Automatic (OS)</option><option value="documenter-light">documenter-light</option><option value="documenter-dark">documenter-dark</option><option value="catppuccin-latte">catppuccin-latte</option><option value="catppuccin-frappe">catppuccin-frappe</option><option value="catppuccin-macchiato">catppuccin-macchiato</option><option value="catppuccin-mocha">catppuccin-mocha</option></select></div></p><hr/><p>This document was generated with <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> version 1.6.0 on <span class="colophon-date" title="Monday 2 September 2024 02:25">Monday 2 September 2024</span>. 
Using Julia version 1.10.5.</p></section><footer class="modal-card-foot"></footer></div></div></div></body></html>
diff --git a/dev/index.html b/dev/index.html
index de4468c..4f91e43 100644
--- a/dev/index.html
+++ b/dev/index.html
@@ -232,4 +232,4 @@
     end
     
     # do something with dosage values in `d`...
-end</code></pre><h2 id="Speed"><a class="docs-heading-anchor" href="#Speed">Speed</a><a id="Speed-1"></a><a class="docs-heading-anchor-permalink" href="#Speed" title="Permalink"></a></h2><p>The current PGEN package can read in ~2000 variants / second for UK Biobank data, which is about 4x faster than reading in BGEN-formatted data. </p></article><nav class="docs-footer"><a class="docs-footer-nextpage" href="PGEN_description/">PGEN format description »</a><div class="flexbox-break"></div><p class="footer-message">Powered by <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> and the <a href="https://julialang.org/">Julia Programming Language</a>.</p></nav></div><div class="modal" id="documenter-settings"><div class="modal-background"></div><div class="modal-card"><header class="modal-card-head"><p class="modal-card-title">Settings</p><button class="delete"></button></header><section class="modal-card-body"><p><label class="label">Theme</label><div class="select"><select id="documenter-themepicker"><option value="auto">Automatic (OS)</option><option value="documenter-light">documenter-light</option><option value="documenter-dark">documenter-dark</option><option value="catppuccin-latte">catppuccin-latte</option><option value="catppuccin-frappe">catppuccin-frappe</option><option value="catppuccin-macchiato">catppuccin-macchiato</option><option value="catppuccin-mocha">catppuccin-mocha</option></select></div></p><hr/><p>This document was generated with <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> version 1.6.0 on <span class="colophon-date" title="Sunday 1 September 2024 02:30">Sunday 1 September 2024</span>. Using Julia version 1.10.5.</p></section><footer class="modal-card-foot"></footer></div></div></div></body></html>
+end</code></pre><h2 id="Speed"><a class="docs-heading-anchor" href="#Speed">Speed</a><a id="Speed-1"></a><a class="docs-heading-anchor-permalink" href="#Speed" title="Permalink"></a></h2><p>The current PGEN package can read in ~2000 variants / second for UK Biobank data, which is about 4x faster than reading in BGEN-formatted data. </p></article><nav class="docs-footer"><a class="docs-footer-nextpage" href="PGEN_description/">PGEN format description »</a><div class="flexbox-break"></div><p class="footer-message">Powered by <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> and the <a href="https://julialang.org/">Julia Programming Language</a>.</p></nav></div><div class="modal" id="documenter-settings"><div class="modal-background"></div><div class="modal-card"><header class="modal-card-head"><p class="modal-card-title">Settings</p><button class="delete"></button></header><section class="modal-card-body"><p><label class="label">Theme</label><div class="select"><select id="documenter-themepicker"><option value="auto">Automatic (OS)</option><option value="documenter-light">documenter-light</option><option value="documenter-dark">documenter-dark</option><option value="catppuccin-latte">catppuccin-latte</option><option value="catppuccin-frappe">catppuccin-frappe</option><option value="catppuccin-macchiato">catppuccin-macchiato</option><option value="catppuccin-mocha">catppuccin-mocha</option></select></div></p><hr/><p>This document was generated with <a href="https://github.com/JuliaDocs/Documenter.jl">Documenter.jl</a> version 1.6.0 on <span class="colophon-date" title="Monday 2 September 2024 02:25">Monday 2 September 2024</span>. Using Julia version 1.10.5.</p></section><footer class="modal-card-foot"></footer></div></div></div></body></html>

genotype category	PLINK 1	PLINK 2
0 = `0b00` = `0x00`	homozygous A1	homozygous REF
1 = `0b01` = `0x01`	missing	heterozygous REF-ALT
2 = `0b10` = `0x02`	heterozygous A1-A2	homozygous ALT
3 = `0b11` = `0x03`	homozygous A2	missing
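
The two-bit packing and the category table above can be sketched in Python. This is a minimal illustration under stated assumptions, not code from PGENFiles.jl or pgenlib: the function name `unpack_genotypes` and the two label tuples are hypothetical names introduced here, while the byte values `0x1b 0xd8` and the category meanings are taken from the Main track description.

```python
# Hypothetical names for illustration; only the bit layout and the
# category meanings come from the format description above.
PLINK1_LABELS = ("homozygous A1", "missing", "heterozygous A1-A2", "homozygous A2")
PLINK2_LABELS = ("homozygous REF", "heterozygous REF-ALT", "homozygous ALT", "missing")


def unpack_genotypes(data, n_samples):
    """Return the 2-bit genotype category (0..3) of each sample.

    Four samples are packed per byte; the first sample of each byte
    occupies the two lowest-order bits.
    """
    categories = []
    for i in range(n_samples):
        byte = data[i // 4]      # byte holding sample i (0-based index)
        shift = 2 * (i % 4)      # bit offset of sample i within that byte
        categories.append((byte >> shift) & 0b11)
    return categories


# The two example bytes from the Main track section, holding 8 samples.
cats = unpack_genotypes(bytes([0x1B, 0xD8]), 8)
plink1 = [PLINK1_LABELS[c] for c in cats]
plink2 = [PLINK2_LABELS[c] for c in cats]
```

Under this sketch the same bytes read differently under the two conventions: sample 1 falls in category 3, which is homozygous A2 in PLINK 1 but missing in PLINK 2.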