PySnpTools is a library for reading and manipulating genetic data.
Main Features:
- SnpReader: Efficiently read genetic PLINK formats including *.bed/bim/fam files. Also, efficiently read parts of files, read kernel data, and standardize data. New features include multi-threaded BED reading, cluster-ready BED data, on-the-fly SNP generation, and larger in-memory data.
- DistReader: Efficiently work with unphased BGEN format and other diploid, biallelic distribution data. Also, efficiently read parts of files. See Distribution IPython Notebook.
- util: In one line, intersect and re-order IIDs from snpreader and other sources. Also, efficiently extract a submatrix from an ndarray.
- IntRangeSet: Efficiently manipulate ranges of integers (for example, genetic positions) with set operators including union, intersection, and set difference.
- mapreduce1: Run loops locally, on multiple processors, or on any cluster.
- filecache: Read and write files locally or from/to any remote storage.
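To make the IntRangeSet idea concrete, here is a minimal pure-Python sketch of union, intersection, and set difference over integer ranges. It is not PySnpTools code and does not use the library's API: the function names (`normalize`, `union`, `intersection`, `difference`), the list-of-tuples representation, and the half-open `[start, stop)` convention are all assumptions chosen for illustration.

```python
from typing import List, Tuple

# Hypothetical representation: a half-open integer range [start, stop).
Range = Tuple[int, int]


def normalize(ranges: List[Range]) -> List[Range]:
    """Sort ranges, drop empty ones, and merge any that overlap or touch."""
    merged: List[Range] = []
    for start, stop in sorted(r for r in ranges if r[0] < r[1]):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], stop))
        else:
            merged.append((start, stop))
    return merged


def union(a: List[Range], b: List[Range]) -> List[Range]:
    """All integers that appear in a or in b."""
    return normalize(a + b)


def intersection(a: List[Range], b: List[Range]) -> List[Range]:
    """All integers that appear in both a and b (two-pointer sweep)."""
    a, b = normalize(a), normalize(b)
    out: List[Range] = []
    i = j = 0
    while i < len(a) and j < len(b):
        lo, hi = max(a[i][0], b[j][0]), min(a[i][1], b[j][1])
        if lo < hi:
            out.append((lo, hi))
        # Advance whichever range ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out


def difference(a: List[Range], b: List[Range]) -> List[Range]:
    """All integers that appear in a but not in b."""
    a, b = normalize(a), normalize(b)
    out: List[Range] = []
    j = 0
    for start, stop in a:
        cur = start
        # Skip ranges of b that end before this range of a begins.
        while j < len(b) and b[j][1] <= cur:
            j += 1
        k = j
        while k < len(b) and b[k][0] < stop:
            if b[k][0] > cur:
                out.append((cur, b[k][0]))
            cur = max(cur, b[k][1])
            k += 1
        if cur < stop:
            out.append((cur, stop))
    return out


# Example: made-up genetic positions, purely for illustration.
region = [(100, 500), (501, 1000)]
mask = [(400, 600)]
print(union(region, mask))         # [(100, 1000)]
print(intersection(region, mask))  # [(400, 500), (501, 600)]
print(difference(region, mask))    # [(100, 400), (600, 1000)]
```

Storing only the range endpoints keeps memory and running time proportional to the number of ranges rather than the number of integers they cover, which is what makes this approach practical for genome-scale positions.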
Install:
pip install pysnptools
If you need support for BGEN files, instead do:
pip install pysnptools[bgen]
- Main Documentation with examples. It includes links to tutorial slides, notebooks, and video.
- Project Home and Full Annotated Bibliography
- Email the developers at fastlmm-dev@python.org.
- Join the user discussion and announcement list (or use web sign up).
- Open an issue on GitHub.