Refactoring to a proper package
ajank committed Jul 9, 2015
1 parent 0bc6cbc commit 7064326
Showing 7 changed files with 487 additions and 461 deletions.
12 changes: 1 addition & 11 deletions .gitignore
@@ -1,11 +1 @@
-# History files
-.Rhistory
-
-# Example code in package build process
-*-Ex.R
-
-# R data files from past sessions
-.Rdata
-
-# RStudio files
-.Rproj.user/
+inst/doc
17 changes: 17 additions & 0 deletions DESCRIPTION
@@ -0,0 +1,17 @@
Package: Romulus
Title: Robust Multi-State Identification of Transcription Factor Binding Sites
Version: 1.0.0.9000
Authors@R: person("Aleksander", "Jankowski", email = "ajank@mimuw.edu.pl", role = c("aut", "cre"))
Description: Romulus is a computational method to accurately identify individual
    transcription factor binding sites from genome sequence information and
    cell-type--specific experimental data, such as DNase-seq. It combines the
    strengths of its predecessors, CENTIPEDE and Wellington, while keeping
    the number of free parameters in the model robustly low. The method is
    unique in allowing for multiple binding states for a single transcription
    factor, differing in their cut profile and overall number of DNase I cuts.
URL: http://github.com/ajank/Romulus
Depends: R (>= 2.10)
License: GPL-3
LazyData: true
Suggests: knitr
VignetteBuilder: knitr
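
Note on the version field: the fourth component (9000) follows a common convention for marking a development version, which sorts after the 1.0.0 release. A minimal sketch, assuming only base R, that parses a three-field excerpt of the DESCRIPTION above (retyped here, not read from the repository) and compares versions:

```r
# Excerpt of the DESCRIPTION fields shown above, supplied as in-memory text.
con <- textConnection(c(
  "Package: Romulus",
  "Version: 1.0.0.9000",
  "Depends: R (>= 2.10)"
))
desc <- read.dcf(con)   # character matrix with one column per field
close(con)

# Compare the parsed version with the plain 1.0.0 release number.
version <- package_version(desc[1, "Version"])
is_dev <- version > package_version("1.0.0")
print(is_dev)
```

The same `read.dcf()` call can be pointed at the DESCRIPTION file on disk instead of a text connection.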
1 change: 1 addition & 0 deletions NAMESPACE
@@ -0,0 +1 @@
exportPattern("^[^\\.]")
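
Note on the export pattern: the regular expression matches any name whose first character is not a dot, so dot-prefixed objects stay internal and everything else is exported. A small sketch of that filter, using hypothetical object names (only `NRSF.cuts` and `NRSF.margin` appear in this commit):

```r
# Hypothetical names, chosen only to illustrate the pattern.
candidates <- c("fitModel", ".internalHelper", "NRSF.cuts", "NRSF.margin")

# Same regular expression as in the NAMESPACE directive above.
exported <- grep("^[^\\.]", candidates, value = TRUE)
print(exported)
```

A dot in the middle of a name, as in `NRSF.cuts`, does not matter; only the first character is tested.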
33 changes: 33 additions & 0 deletions R/data.R
@@ -0,0 +1,33 @@
#' Annotations of 4,828 NRSF (REST) motif instances.
#'
#' A dataset containing the annotations of 4,828 NRSF (REST) motif instances in the human genome (hg19 assembly).
#'
#' @format A data frame with 4828 rows and 6 variables:
#' \describe{
#' \item{chrom}{chromosome}
#' \item{start}{starting base pair (1-based, inclusive)}
#' \item{end}{ending base pair}
#' \item{strand}{strand (\code{"-"} or \code{"+"})}
#' \item{score}{Position Weight Matrix score (log-likelihood)}
#' \item{signalValue}{ChIP-seq signal value in K562 cells}
#' }
#' @source Motif instances were downloaded from \url{http://homer.salk.edu/homer/} (HOMER Known Motifs track).
#'
#' ChIP-seq signal value was extracted from \url{http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeAwgTfbsUniform/wgEncodeAwgTfbsHaibK562NrsfV0416102UniPk.narrowPeak.gz}.
"NRSF.anno"

#' DNase I cuts around 4,828 NRSF (REST) motif instances.
#'
#' A dataset containing the exact numbers of DNase I cuts around 4,828 NRSF (REST) motif instances, split into forward and reverse strand cuts.
#'
#' @format An integer matrix with 4828 rows and 838 columns. Columns 1-419 correspond to forward strand cuts (200 bp upstream + 19 bp motif site + 200 bp downstream), while columns 420-838 correspond to reverse strand cuts at the same positions.
#'
#' @source Extracted from \url{http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeOpenChromDnase/wgEncodeOpenChromDnaseK562AlnRep1V2.bam}.
"NRSF.cuts"

#' Number of base pairs of margin for \code{NRSF.cuts}.
#'
#' Number of base pairs of upstream and downstream margin for \code{NRSF.cuts}.
#'
#' @format Integer, equal to 200 (base pairs).
"NRSF.margin"
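
Note on the documented column layout: the sketch below shows one way the forward and reverse strand blocks of a matrix shaped like `NRSF.cuts` could be separated. It assumes only base R and uses a small synthetic matrix in place of the real data; the variable names other than the documented values (200 bp margin, 19 bp motif, 838 columns) are illustrative.

```r
# Synthetic stand-in with the documented column layout; not the real data.
margin <- 200L        # documented value of NRSF.margin
motif_width <- 19L    # motif length stated in the NRSF.cuts format
window <- 2L * margin + motif_width   # 419 positions per strand

set.seed(1)
cuts <- matrix(rpois(3L * 2L * window, lambda = 0.2),
               nrow = 3L, ncol = 2L * window)

forward <- cuts[, seq_len(window), drop = FALSE]            # columns 1-419
reverse <- cuts[, window + seq_len(window), drop = FALSE]   # columns 420-838

# Columns covering the motif site itself within each strand block.
motif_cols <- margin + seq_len(motif_width)                 # 201-219

# Total cuts per motif instance, both strands combined.
total_cuts <- rowSums(forward) + rowSums(reverse)
```

With the package installed, `cuts` and `margin` would presumably be replaced by `NRSF.cuts` and `NRSF.margin`.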