CRISPRiaDesign

This site hosts the sgRNA machine learning scripts used to generate the Weissman lab's next-generation CRISPRi and CRISPRa library designs (Horlbeck et al., eLife 2016). These are currently implemented as interactive scripts, accompanied by IPython notebooks with step-by-step instructions for creating new sgRNA libraries. Future plans include adding command line functions to make library design more user-friendly. Note that all sgRNA designs for the CRISPRi/a human and mouse protein-coding gene libraries are included as supplementary tables in the eLife paper, so those tables are sufficient for cloning individual sgRNAs or constructing custom sublibraries that target protein-coding genes. These scripts are primarily useful for designing sgRNAs that target novel or non-coding genes, or for organisms other than human and mouse.

To apply the exact quantitative models used to generate the CRISPRi-v2 or CRISPRa-v2 libraries, follow the steps outlined in the Library_design_walkthrough (included as a Jupyter notebook or web page).

To see full example code for de novo machine learning, prediction of sgRNA activity for desired loci, and construction of new genome-scale CRISPRi/a libraries, see the CRISPRiaDesign_example_notebook (included as a Jupyter notebook or web page).

Dependencies

External command line applications required:

  • ViennaRNA
  • Bowtie (not Bowtie2)
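
Before running the notebooks, it can help to confirm that both tools are reachable from your shell. The following is a minimal sketch, not part of the repository: it assumes the scripts invoke the `RNAfold` executable from ViennaRNA and the `bowtie` executable from Bowtie 1 (Bowtie2 installs `bowtie2` instead), and `find_missing_tools` is a hypothetical helper name.

```python
import shutil

# Executable names are assumptions: ViennaRNA installs RNAfold, and
# Bowtie 1 installs `bowtie` (Bowtie2 installs `bowtie2` instead).
REQUIRED_TOOLS = {"ViennaRNA": "RNAfold", "Bowtie": "bowtie"}


def find_missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the names of packages whose executable is not on PATH."""
    return [name for name, exe in tools.items() if which(exe) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All external tools found.")
```

The `which` parameter defaults to `shutil.which` and exists only so the lookup can be swapped out when testing.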

Large genomic data files required:

Links are to the human genome files used for the hCRISPRi-v2 and hCRISPRa-v2 machine learning, which are also required for the Library_design_walkthrough; any organism or assembly may be used for design of new libraries or de novo machine learning. For convenience, the files that Library_design_walkthrough references in the "large_data_files" folder are also available here.

  • Genome sequence as FASTA (hg19)
  • FANTOM5 TSS annotation as BED (TSS_human)
  • Chromatin data as BigWig (MNase, DNase, FAIRE-seq)
  • HGNC table of gene aliases (not strictly required for the Library_design_walkthrough but useful in some steps)
  • Ensembl annotation as GTF (not strictly required for the Library_design_walkthrough but useful in some steps and in other functions; release 74 used for the published library designs)
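
Once the files are downloaded, a quick existence check can catch a misplaced file before a long notebook run. This is a minimal sketch under stated assumptions: the file names in `EXAMPLE_FILES` are placeholders (substitute the names of the files you actually downloaded), and `missing_data_files` is a hypothetical helper, not a function from this repository. Only the folder name "large_data_files" comes from the text above.

```python
from pathlib import Path

# Placeholder file names: substitute the names of the files you downloaded.
EXAMPLE_FILES = {
    "genome FASTA": "genome.fa",
    "TSS annotation BED": "tss_annotation.bed",
    "chromatin BigWig": "chromatin.bigWig",
}


def missing_data_files(files, data_dir="large_data_files"):
    """Return the labels of files that are absent from data_dir."""
    base = Path(data_dir)
    return [label for label, name in files.items() if not (base / name).is_file()]


if __name__ == "__main__":
    for label in missing_data_files(EXAMPLE_FILES):
        print("Missing: " + label)
```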