
@debasish83
Contributor

Memory optimizations are done to bring the optimize.linear.NNLS runtime closer to mllib NNLS, and the optimize.proximal.QuadraticMinimizer default closer to blas.dposv.

NNLS: iterator pattern cleaned for speed, in-place gemv added, initialState API provided for state reuse.
PowerMethod: specialized on DenseMatrix and DenseVector for speed.
QuadraticMinimizer: iterator pattern cleaned for speed, memory optimization to bring runtime close to dposv.
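For illustration, a minimal sketch of the allocation-free pattern above, delegating to netlib-java's dgemv the way breeze does; the State shape and the initialState/gradient signatures are hypothetical stand-ins, not this PR's actual API:

import breeze.linalg.{DenseMatrix, DenseVector}
import com.github.fommil.netlib.BLAS.{getInstance => blas}

// Hypothetical workspace holding everything the inner loop needs, allocated once.
case class State(x: DenseVector[Double], grad: DenseVector[Double])

def initialState(n: Int): State =
  State(DenseVector.zeros[Double](n), DenseVector.zeros[Double](n))

// One gradient evaluation with no allocation: grad := gram * x + q, computed by
// an in-place dgemv (y := alpha*A*x + beta*y) into the preallocated buffer.
// Assumes zero-offset, unit-stride vectors.
def gradient(gram: DenseMatrix[Double], q: DenseVector[Double], s: State): Unit = {
  System.arraycopy(q.data, 0, s.grad.data, 0, q.length)
  blas.dgemv("N", gram.rows, gram.cols, 1.0, gram.data, gram.rows,
    s.x.data, 1, 1.0, s.grad.data, 1)
}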

@debasish83
Contributor Author

I will take a closer look at comparisons with the ML Cholesky solver tomorrow...the first iteration is always slow, and I am not quite sure why...

@debasish83
Contributor Author

By the way, sorry for turning the code into C-style code, but I had to make sure no memory is allocated in the whole algorithm except through NNLS.initialize and QuadraticMinimizer.initialize, since we are compared against native BLAS dposv :-)

When you review, please let me know if you see any additional memory allocation in the algorithm's inner loop (the iterations), since I am using a lot of breeze overloaded functions and might have missed things...

Maybe there are ways to optimize initialize further so that the first iteration also comes close to the mllib numbers...
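For example, breeze's operator sugar can hide allocations that the in-place forms avoid (an illustrative sketch, not code from this PR):

import breeze.linalg.DenseVector

val x = DenseVector.rand(1000)
val y = DenseVector.rand(1000)

val z = x + y // allocates a fresh result vector on every call
x += y        // in-place: mutates x, nothing allocated inside a loop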

@dlwh
Member

I believe this is not what you want: it changes the operation. An alphanumeric infix method like dot has lower precedence than +, so 3 dot 2 + 3 parses as 3.dot(2 + 3).

scala> implicit class Foo(x: Int) { def dot(y: Int) = x * y }
defined class Foo

scala> 3 dot 2 + 3
res1: Int = 15

scala> 3.dot(2) + 3
res2: Int = 9

@debasish83
Contributor Author

My bad...fixed it

@debasish83
Contributor Author

Any memory-specific optimization? Most of the time I run QuadraticMinimizer first, so hotspot warm-up is an issue...I will try the same run after warming the JVM tomorrow and finish up the API changes in the AM.

@dlwh
Member

dlwh commented Mar 23, 2015

Remember it's not just that the JVM is warming up: it has to warm up for each individual class. In general, methods aren't fully optimized until they're called about 10,000 times. So you should really run the algorithm several times, then time it.


@debasish83
Contributor Author

I added an API to provide the upper-triangular gram matrix, and with that the runtime of the first iteration also dropped...I think QuadraticMinimizer should be able to replace the ML CholeskySolver now...
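A sketch of what such an entry point could look like, going through netlib-java's dposv; the solveUpper name and signature are hypothetical. With uplo = "U", dposv never reads the lower triangle, so the caller only has to fill the upper half:

import com.github.fommil.netlib.LAPACK.{getInstance => lapack}
import org.netlib.util.intW

// Solve gram * x = q for symmetric positive-definite gram supplied as a
// column-major primitive array with only its upper triangle populated.
def solveUpper(gram: Array[Double], q: Array[Double], n: Int): Array[Double] = {
  val x = q.clone()    // dposv overwrites the RHS with the solution
  val a = gram.clone() // ...and overwrites A with its Cholesky factor
  val info = new intW(0)
  lapack.dposv("U", n, 1, a, n, x, n, info)
  require(info.`val` == 0, s"dposv failed: info=${info.`val`}")
  x
}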

…x provided as primitive array for supporting normal equations
@debasish83
Contributor Author

The first-iteration issue is consistent across both NNLS and QuadraticMinimizer...Out of curiosity, I looked at the code, and both mllib and Breeze back their matrix and vector workspaces with Array[Double]...so I am really not clear why there is a 2X difference only in the initial iterations....Is it due to the overhead from traits that show up in DenseVector/DenseMatrix?

During the solve things are clean, so I don't think there are cases where native-memory BLAS is faster than QuadraticMinimizer handing its memory to LAPACK to work on...

@dlwh
Member

dlwh commented Mar 24, 2015

Using a DenseVector for the first time incurs a lot of overhead: operators are populated into the multimethod maps, lots of interfaces are loaded, etc. You should really only clock these after running the exact same code path at least once. E.g.

def time(N: Int)(x: => Unit): Long = {
  // Warm up: run the exact same code path N times so the JIT fully optimizes it
  for (i <- 0 until N) x

  val in = System.currentTimeMillis()
  x
  val out = System.currentTimeMillis()
  out - in // elapsed millis for a single post-warmup run
}
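So, with a hypothetical qm.minimize(h, q) standing in for the solver call, something like time(10000) { qm.minimize(h, q) } reports the cost of one post-warmup run.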


@dlwh
Member

dlwh commented Mar 25, 2015

What's the status here? Can I merge this? I really want to release a fix for the SparseVector bug.

@debasish83
Contributor Author

I am OK with this...moving from DenseVector/DenseMatrix to Array would make the code ugly.

dlwh added a commit that referenced this pull request Mar 25, 2015
@dlwh merged commit 7be3895 into scalanlp:master Mar 25, 2015
@debasish83
Contributor Author

Please let me know when you cut 0.11.2...

@dlwh
Member

dlwh commented Mar 25, 2015

tonight

