Papet predictive models

This repository contains models to predict 5-methylcytosine (5mC) from PacBio SequelII sequencing data.

These models were trained using the PAcbio Prediction of Epigenetics Technology (Papet) model-kinetic software (see its documentation for more information). They can be used to predict 5mC with the Papet predict software (see its documentation).

Content

Six different types of models are present in this repository:

models of the raw kinetics, stored in the raw directory
models of the normalized kinetics, stored in the normalized directory
models of the raw kinetics accounting for neighbouring position dependencies, stored in the diposition directory
models of the normalized kinetics accounting for neighbouring position dependencies, stored in the diposition_normalized directory
models of the raw kinetics accounting for all possible pairwise position dependencies, stored in the pairwise directory
models of the normalized kinetics accounting for all possible pairwise position dependencies, stored in the pairwise_normalized directory
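
The layout above can be summarized as a small lookup table. The following Python sketch is only an illustration: the directory names come from the list above, but the labels used as keys ("raw", "normalized", "none", "neighbouring", "pairwise") and the function name `model_directory` are hypothetical and not part of Papet.

```python
# Directory names are taken from the list above; the key labels are
# hypothetical and only serve this illustration.
MODEL_DIRECTORIES = {
    ("raw", "none"): "raw",
    ("normalized", "none"): "normalized",
    ("raw", "neighbouring"): "diposition",
    ("normalized", "neighbouring"): "diposition_normalized",
    ("raw", "pairwise"): "pairwise",
    ("normalized", "pairwise"): "pairwise_normalized",
}


def model_directory(kinetics, dependencies="none"):
    """Return the directory holding the requested type of model."""
    try:
        return MODEL_DIRECTORIES[(kinetics, dependencies)]
    except KeyError:
        raise ValueError(
            f"unknown model type: kinetics={kinetics!r}, "
            f"dependencies={dependencies!r}"
        ) from None
```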

About kinetic models

To learn more about PacBio kinetics and the kinetic models we devised, see the dedicated documentation.

About the training

The models were trained using two SequelII whole genome sequencing datasets produced by PacBio, originating from hg002.

The first dataset was created by subjecting a hg002 DNA sample to whole genome amplification (WGA). The resulting DNA was used to create a library devoid of epigenetic modifications. Finally, the library was sequenced using a SequelII sequencer.

The second dataset was created by subjecting a fraction of the first sample to an SssI enzymatic treatment. The SssI enzyme deposits 5mC specifically at CpGs, leading to the creation of a CpG-methylated library. This library was sequenced using a SequelII sequencer.

The resulting CCSs were mapped against the hg38p8 reference genome.

The kinetic sequence normalization model was trained on the WGA dataset with papet model-sequence, using 11-mers. All the kinetic signal models were then trained with papet model-kinetic, using a 17bp window centered on each individual CpG.
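
To make the windowing concrete, here is a minimal Python sketch of how 17bp windows centered on CpGs could be enumerated from a sequence. It is not Papet's implementation: the function name `cpg_windows` is hypothetical, and both placing the C of the CpG at the window center and skipping CpGs that lie too close to the sequence ends are assumptions made for this illustration.

```python
def cpg_windows(seq, width=17):
    """Yield (c_position, window) for each CpG that has a full window.

    Assumptions (not taken from Papet): the C of the CpG sits at the
    window center, and CpGs too close to either end are skipped.
    """
    if width % 2 == 0:
        raise ValueError("width must be odd so that the window has a center")
    half = width // 2
    seq = seq.upper()
    pos = seq.find("CG")
    while pos != -1:
        start, end = pos - half, pos + half + 1
        if start >= 0 and end <= len(seq):
            yield pos, seq[start:end]
        pos = seq.find("CG", pos + 1)
```

With the default width, each window holds 8 bases on either side of the C, so the window index 8 is always the C of the CpG.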

The unmethylated CpG kinetic models were trained on the WGA dataset. The methylated CpG kinetic models were trained on the SssI dataset.

Usage

The models need to be decompressed prior to use. Then, simply pass them to papet predict with the --modelMeth and --modelUnmeth options.
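
As an illustration, the command line could be assembled from Python as sketched below. Only the `papet predict` command and the `--modelMeth` and `--modelUnmeth` options are taken from this README. The function name `build_predict_command` and the model file names in the usage comment are hypothetical placeholders, and any other argument that papet predict requires (inputs, outputs) must be looked up in the Papet documentation and passed through `extra_args`.

```python
def build_predict_command(model_meth, model_unmeth, extra_args=()):
    """Build the argument list for a papet predict call.

    Only --modelMeth and --modelUnmeth are known from this README; all
    other options must be supplied by the caller through extra_args.
    """
    return [
        "papet", "predict",
        "--modelMeth", str(model_meth),
        "--modelUnmeth", str(model_unmeth),
        *extra_args,
    ]


# Hypothetical usage (placeholder file names, not actual repository files):
#   import subprocess
#   cmd = build_predict_command("meth.rawkineticmodel", "unmeth.rawkineticmodel")
#   subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids shell quoting problems when model paths contain spaces.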

Note that the file extensions (.rawkineticmodel, .normalizedkineticmodel, etc) must not be altered.
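
A quick check before running a prediction can catch renamed or still-compressed files. The Python sketch below is a hypothetical helper, not part of Papet. It knows only the two extensions named above, so the set must be extended with the remaining ones found in the model directories. It also assumes that the methylated and unmethylated models passed together should be of the same type, which this README does not state explicitly.

```python
from pathlib import Path

# Extensions named in this README; the "etc" implies that others exist,
# so extend this set from the actual model files.
KNOWN_EXTENSIONS = {".rawkineticmodel", ".normalizedkineticmodel"}


def check_model_pair(model_meth, model_unmeth, known=KNOWN_EXTENSIONS):
    """Return the shared extension of a model pair, or raise ValueError.

    Assumption: both models must be of the same type, that is, carry
    the same extension.
    """
    ext_meth = Path(model_meth).suffix
    ext_unmeth = Path(model_unmeth).suffix
    for path, ext in ((model_meth, ext_meth), (model_unmeth, ext_unmeth)):
        if ext not in known:
            raise ValueError(f"unexpected extension {ext!r} for {path}")
    if ext_meth != ext_unmeth:
        raise ValueError(
            f"model types differ: {ext_meth!r} versus {ext_unmeth!r}"
        )
    return ext_meth
```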

Authors

Romain Groux
