Skip to content

Simple generator of DNA barcodes for sequencing experiments

License

Notifications You must be signed in to change notification settings

lzamparo/DNAbarcodes

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

11 Commits
 
 
 
 
 
 
 
 

Repository files navigation

DNAbarcodes

Simple generator of DNA barcodes for sequencing experiments

A pair of simple scripts to generate DNA barcodes with nice properties:

  • arbitrary length of code
  • a minimum Hamming distance between each code
  • a GC content percentage upper and lower bound for each code
  • a guarantee that no code will contain runs of longer than three of the same base

Dependencies

  • Python 3.6 (though >3.4 may work)
  • Numba 0.36
  • Numpy > 1.12

Best best is to use an Anaconda python distribution. Anaconda 5.0.1 of Python 3.6 worked out of the box for me.

Usage

There are two scripts: generate_barcodes.py and validate_barcodes.py. generate_barcodes.py takes a length and filename, and outputs all possible codes of that length:

$ python generate_barcodes <length> <output file>

Once the potential barcode strings have been generated, you can run validate_barcodes.py to get your barcodes by specifying the file of all possible codes, the name where your barcodes should be written, and how many barcodes you need:

$ python validate_barcodes <input file> <output file> <number of codes>

About

Simple generator of DNA barcodes for sequencing experiments

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published