A model trained on Q20+ data (R10.4.1 flow cells with kit14 chemistry) for Clair3-Trio.
- How to use the Q20+ model
- About
c3t_hg002_dna_r1041_e82_400bps_sup
- Q20+ data used for training and testing
INPUT_DIR="[YOUR_INPUT_FOLDER]" # e.g. /input
REF=${INPUT_DIR}/ref.fa # change your reference file name here
OUTPUT_DIR="[YOUR_OUTPUT_FOLDER]" # e.g. /output
THREADS="[MAXIMUM_THREADS]" # e.g. 8
MODEL_C3="r1041_e82_400bps_sup_v400" # q20 Clair3 model
MODEL_C3T="c3t_hg002_dna_r1041_e82_400bps_sup" # q20 Clair3-Trio model
# Before running, change the reference and BAM file names below to your own,
# and set SAMPLE_C, SAMPLE_P1, and SAMPLE_P2 to the child's and parents' sample names.
# --threads: maximum number of threads to be used
# --output: absolute output path prefix
docker run -it \
-v ${INPUT_DIR}:${INPUT_DIR} \
-v ${OUTPUT_DIR}:${OUTPUT_DIR} \
hkubal/clair3-trio:latest \
/opt/bin/run_clair3_trio.sh \
--ref_fn=${INPUT_DIR}/ref.fa \
--bam_fn_c=${INPUT_DIR}/child_input.bam \
--bam_fn_p1=${INPUT_DIR}/parent1_input.bam \
--bam_fn_p2=${INPUT_DIR}/parent2_input.bam \
--sample_name_c=${SAMPLE_C} \
--sample_name_p1=${SAMPLE_P1} \
--sample_name_p2=${SAMPLE_P2} \
--threads=${THREADS} \
--model_path_clair3="/opt/models/clair3_models/${MODEL_C3}" \
--model_path_clair3_trio="/opt/models/clair3_trio_models/${MODEL_C3T}" \
--output=${OUTPUT_DIR}
Check Usage for more options.
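Before launching the container, it can help to confirm that every input file is in place. The sketch below is a hypothetical helper, not part of Clair3-Trio. It assumes the reference has a samtools index named `<ref>.fai` and each BAM has an index named `<bam>.bai`; adjust these if your indexes are named differently (for example `.csi`).

```shell
# Hypothetical pre-flight check (not part of Clair3-Trio).
# Usage: check_trio_inputs REF BAM [BAM ...]
# Assumes indexes are named REF.fai and BAM.bai (an assumption; adjust as needed).
# Returns 0 if every file exists and is non-empty, 1 otherwise.
# Note: uses the plain variable names ref, bam, f, and missing.
check_trio_inputs() {
    ref="$1"
    shift
    missing=0
    # Reference FASTA and its index.
    for f in "$ref" "$ref.fai"; do
        [ -s "$f" ] || { echo "missing or empty: $f" >&2; missing=1; }
    done
    # Each BAM and its index.
    for bam in "$@"; do
        for f in "$bam" "$bam.bai"; do
            [ -s "$f" ] || { echo "missing or empty: $f" >&2; missing=1; }
        done
    done
    return "$missing"
}
```

For the command above, this could be called as `check_trio_inputs "${INPUT_DIR}/ref.fa" "${INPUT_DIR}/child_input.bam" "${INPUT_DIR}/parent1_input.bam" "${INPUT_DIR}/parent2_input.bam"`, running `docker run` only if it succeeds.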
The Q20+ model is trained on data obtained from ONT: three GIAB samples (HG002, HG003, and HG004) with fast5 files available, basecalled using Dorado in SUP mode.
We held out chromosome 20 during all training stages and reserved it for testing.
We compared the new model against Clair3 v1.0 with r1041_e82_400bps_sup_v400 and against PEPPER r0.8 with ont_r10_q20.
All testing outputs are available here, and the analysis results are in this table.
Sample | Reference | Aligner | Coverage | Basecaller | Chemistry |
---|---|---|---|---|---|
HG002 | GRCh38_no_alt | minimap2 | 70.13 | Dorado v4.0.0 SUP | R10.4.1 E8.2 |
HG003 | GRCh38_no_alt | minimap2 | 79.66 | Dorado v4.0.0 SUP | R10.4.1 E8.2 |
HG004 | GRCh38_no_alt | minimap2 | 64.72 | Dorado v4.0.0 SUP | R10.4.1 E8.2 |
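Because chromosome 20 was held out of training, a comparison against this model is only fair on that contig. A minimal sketch of restricting a VCF to one contig with plain `awk` follows. The function name and file names are hypothetical, and the contig name `chr20` assumes GRCh38-style naming as in the table above.

```shell
# Hypothetical helper: keep VCF header lines plus records on one contig.
# Reads an uncompressed VCF on stdin and writes the subset to stdout.
# The contig defaults to chr20 (assumes GRCh38-style contig names).
filter_vcf_contig() {
    awk -v contig="${1:-chr20}" 'BEGIN { FS = "\t" } /^#/ || $1 == contig'
}

# Example with hypothetical file names:
#   gzip -dc calls.vcf.gz | filter_vcf_contig chr20 > calls.chr20.vcf
```

The match on the first column is exact, so `chr2` records are not picked up when asking for `chr20`. For bgzipped, indexed VCFs, `bcftools view -r chr20` is the more usual route.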