
A Drush-based loader for VCF files that follows the genotype storage rules outlined by ND genotypes.


UofS-Pulse-Binfo/genotypes_loader


Genotypes Loader

This module provides a Drush command to load genotypic data from several file formats: Variant Call Format (VCF), Genotype Matrix and Genotype Flat-File. Genotype calls are stored in the custom, Chado-esque genotype_call table, while all other metadata is stored in a Chado-compliant manner.
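To make the idea of a "genotype call" concrete, here is a minimal sketch, in Python, of how a single VCF data line breaks down into one call per sample. It follows the general VCF layout (nine fixed columns, then one column per sample) and is not the module's own code. The function name `parse_vcf_genotypes` and the returned dictionary keys are hypothetical and chosen only for illustration.

```python
from typing import Dict, List


def parse_vcf_genotypes(header_line: str, data_line: str) -> List[Dict[str, str]]:
    """Split one VCF data line into one record per sample genotype call."""
    columns = header_line.lstrip("#").rstrip("\n").split("\t")
    fields = data_line.rstrip("\n").split("\t")

    # In VCF the first nine columns are fixed; sample columns follow.
    samples = columns[9:]
    chrom, pos, _variant_id, ref, alt = fields[:5]

    # Allele index 0 is REF; 1, 2, ... are the comma-separated ALT alleles.
    alleles = [ref] + alt.split(",")

    # The FORMAT column (index 8) says where GT sits in each sample field.
    gt_index = fields[8].split(":").index("GT")

    calls = []
    for sample, value in zip(samples, fields[9:]):
        gt = value.split(":")[gt_index]
        # Treat phased ("|") and unphased ("/") separators alike; this sketch
        # deliberately discards phasing. "." marks a missing allele.
        indices = gt.replace("|", "/").split("/")
        bases = [alleles[int(i)] if i != "." else "." for i in indices]
        calls.append(
            {"sample": sample, "chrom": chrom, "pos": pos, "genotype": "/".join(bases)}
        )
    return calls


# Example input: a hypothetical two-sample VCF header and data line.
header = "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2"
line = "1\t100\t.\tA\tG\t50\tPASS\t.\tGT:DP\t0/1:12\t1|1:8"
for call in parse_vcf_genotypes(header, line):
    print(call)
```

Each resulting record corresponds conceptually to one row of genotype call data. How the module actually maps these values onto the genotype_call table is described in its online documentation.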

Dependencies

  1. Tripal 3.x (Installation Instructions)
  2. PostgreSQL 9.3 (9.4+ recommended; tested with 11.3)

Installation

This module is installed by cloning it and its dependencies into [your drupal site]/sites/all/modules and then enabling it, either through the Drupal administrative UI or with Drush. Specifically, once you have a working Tripal environment:

cd [drupal root]/sites/all/modules
git clone https://github.com/uofs-pulse-binfo/genotypes_loader
drush pm-enable genotypes_loader

Features

  • Extensive configuration, allowing flexibility in the ontology terms used.
  • Drush command for easy import of VCF, Genotype Matrix and Genotype Flat-File formats.
  • Both prompt and option support for the Drush command, making it suitable for scripting as well as interactive use.
  • genotype_call table for efficient data storage.

Documentation

Please visit our online documentation to learn more about installation, usage and features.

Demonstration

If you would like to evaluate this module using demonstration data, there is a full tutorial in our documentation.

Funding

This work is supported by Saskatchewan Pulse Growers [grant: BRE1516, BRE0601], Western Grains Research Foundation, Genome Canada [grant: 8302, 16302], Government of Saskatchewan [grant: 20150331], and the University of Saskatchewan.

Citation

Caron, C. and Sanderson, L.A. (2020). Genotypes Loader: Efficient large-scale genotypic data import into Chado. Version 1.0. University of Saskatchewan, Pulse Crop Research Group, Saskatoon, SK, Canada.