Adding several simple normalizations. #311

fnothaft · 2014-07-14T09:58:45Z

Added two normalizations:

- Normalization via Z Score
- Target length normalization (e.g., RPKM)

@tdanford, @carlyeks, and I discussed the best place for this and agreed it makes the most sense inside ADAM: downstream tools like RNAdam will depend on it, and the normalizations are useful primitives across many omics algorithms. I'd also welcome comments on whether org.bdgenomics.adam.rdd.normalization is the best package for this.
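
For readers unfamiliar with target length normalization, here is a minimal plain-Scala sketch of the arithmetic, using the standard RPKM definition (reads per kilobase of target per million mapped reads). It is not the code from this PR: the object name, method names, and parameters are hypothetical, and the PR itself operates on Spark RDDs rather than single values.

```scala
object LengthNormalizationSketch {
  /** Divides a value by the target's width in bases (hypothetical helper). */
  def perBase(value: Double, widthBp: Long): Double = {
    require(widthBp > 0, "target width must be positive")
    value / widthBp
  }

  /**
   * RPKM: reads per kilobase of target per million mapped reads.
   * The factor 1e9 combines the per-kilobase (1e3) and per-million (1e6) scalings.
   */
  def rpkm(readCount: Double, widthBp: Long, totalMappedReads: Long): Double = {
    require(widthBp > 0, "target width must be positive")
    require(totalMappedReads > 0, "total mapped reads must be positive")
    readCount * 1e9 / (widthBp.toDouble * totalMappedReads.toDouble)
  }
}
```

For example, 100 reads on a 2,000 bp target in a library of one million mapped reads gives an RPKM of 50.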

tdanford · 2014-07-14T10:02:24Z

adam-core/src/main/scala/org/bdgenomics/adam/rdd/normalization/LengthNormalization.scala

+ *
+ * @see pkn
+ */
+ def apply[I <: Interval, T](rdd: RDD[(Double, I, T)]): RDD[(Double, I, T)] = {


Just out of curiosity, why the (Double, I, T) ordering here? I would have expected most of these RDDs to have a form keyed off the interval (i.e. (I, Double)) and if you wanted to carry along extra information, maybe a key-value where the value was a pair (i.e. (I, (Double, T))).
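
To make the two shapes in this question concrete, here is a small plain-Scala sketch that converts the flat triple into the interval-keyed form and then aggregates by key. It uses ordinary collections instead of RDDs so that it runs without Spark; `Region`, `toKeyed`, and `sumByInterval` are hypothetical names, not ADAM types or methods. The practical argument for the keyed form is that in Spark an RDD of pairs picks up by-key operations such as `reduceByKey` and `join`.

```scala
object KeyedShapeSketch {
  // Hypothetical stand-in for an interval type; not ADAM's Interval.
  final case class Region(name: String, start: Long, end: Long)

  // Flat triple (value, interval, extra) => pair keyed by the interval.
  def toKeyed[I, T](rows: Seq[(Double, I, T)]): Seq[(I, (Double, T))] =
    rows.map { case (value, interval, extra) => (interval, (value, extra)) }

  // With the interval as the key, per-interval aggregation is a plain group-and-sum.
  def sumByInterval[I, T](rows: Seq[(I, (Double, T))]): Map[I, Double] =
    rows.groupBy(_._1).map { case (interval, group) =>
      interval -> group.map(_._2._1).sum
    }
}
```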

tdanford · 2014-07-14T20:23:29Z

I'm still reviewing this @fnothaft, sorry for the delay...

carlyeks · 2014-07-22T16:06:59Z

Ping @tdanford.

tdanford · 2014-07-28T09:41:11Z

Ping @fnothaft -- see my question, above, about the (Double, I, T) type signature.

tdanford · 2014-07-29T10:22:30Z

Ping @fnothaft :-)

fnothaft · 2014-07-29T14:18:51Z

@tdanford just addressed your comment, and rebased the change on master.

carlyeks · 2014-07-29T15:31:28Z

Jenkins, retest this please.

tdanford · 2014-07-29T15:48:11Z

adam-core/src/main/scala/org/bdgenomics/adam/rdd/normalization/ZScoreNormalization.scala

+ * @tparam T Type of data passed along.
+ */
+ def apply[T](rdd: RDD[(Double, T)]): RDD[(Double, T)] = {
+ val cachedRdd = rdd.cache


Out of curiosity, Frank, why (here, and above as well) the explicit call to cache and unpersist? Given that these RDDs appear to only be used once...?

@tdanford cachedRdd is used in 1 count and 3 separate map calls.
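
The point about reuse can be illustrated with a plain-Scala analogy (not Spark, and not the code from this PR). An uncached RDD behaves roughly like a pipeline that is rebuilt from its source on every pass, so a Z-score computed with one count and three map passes pays for the upstream work four times. In the sketch below all names are hypothetical, and population variance is an assumption; the PR may define it differently.

```scala
object CachingAnalogy {
  // Counts how many times the upstream step is evaluated.
  var upstreamEvaluations = 0

  // Stand-in for an uncached RDD: each call re-runs the upstream step.
  def recomputing(source: Vector[Double]): () => Iterator[Double] =
    () => source.iterator.map { x => upstreamEvaluations += 1; x }

  // Stand-in for a cached RDD: run the upstream step once, then reuse the result.
  def materialized(source: Vector[Double]): () => Iterator[Double] = {
    val stored = recomputing(source)().toVector
    () => stored.iterator
  }

  // Z-score in four passes: one count, then mean, variance, and the final map.
  // Assumes population variance; a zero standard deviation would yield NaN.
  def zScores(data: () => Iterator[Double]): Vector[Double] = {
    val n = data().foldLeft(0L)((count, _) => count + 1L)
    val mean = data().sum / n
    val variance = data().map(x => (x - mean) * (x - mean)).sum / n
    val stdDev = math.sqrt(variance)
    data().map(x => (x - mean) / stdDev).toVector
  }
}
```

With a four-element input, the recomputing variant evaluates the upstream step 16 times (four passes over four elements), while the materialized variant evaluates it 4 times.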

Indeed; all your base are belong to me now.

tdanford · 2014-07-29T17:06:51Z

Jenkins, retest this please.

tdanford · 2014-07-29T17:09:03Z

I'm going to merge this anyway, since it builds in my hands -- and the tests that are failing aren't the ones you've added here.

tdanford · 2014-07-29T17:09:12Z

Thanks, @fnothaft!

tdanford reviewed Jul 14, 2014

tdanford reviewed Jul 29, 2014

[ADAM-311] Adding several simple normalizations.

5d5e596

tdanford added a commit that referenced this pull request Jul 29, 2014

Merge pull request #311 from fnothaft/simple-normalizations

70d81f8

Adding several simple normalizations.

tdanford merged commit 70d81f8 into bigdatagenomics:master Jul 29, 2014
