Skip to content

Conversation

@coderxiang
Copy link
Contributor

When I run the LBFGS implementation of Spark on a moderate size proprietary data set (~ 5M samples), I see the resultant vector diverges from some standard solver (e.g., liblinear, sci-learn) on around half of the coefficients. The computation usually stops around 20 iterations although I set the iteration number to be 100 with a very strict tolerance variable (1e-15). The formulation is strongly convex so the solution should be unique. Configing the improvementTol variable inside FirstOrderMinimizer.scala can solve this issue.

Two ways of doing this:

  1. Exposing the improvementTol variable to LBFGS, as is done in the PR
  2. Change the default value of improvementTol in FirstOrderMinimizer.scala

I chose the first one here.

@dlwh
Copy link
Member

dlwh commented Mar 2, 2015

thanks. Yeah, I need to figure out the convergence checks.

dlwh added a commit that referenced this pull request Mar 2, 2015
expose improvementTol inside FirstOrderMinimizer
@dlwh dlwh merged commit 7d42e37 into scalanlp:master Mar 2, 2015
@coderxiang coderxiang deleted the lbfgs-tol branch March 2, 2015 22:11
@dbtsai
Copy link
Contributor

dbtsai commented Mar 3, 2015

@dlwh, are you going to publish this so I can upgrade Spark 1.3 dependency before it's released? thanks.

@dlwh
Copy link
Member

dlwh commented Mar 3, 2015

sure. what's the timeline?

On Mon, Mar 2, 2015 at 6:17 PM, DB Tsai notifications@github.com wrote:

@dlwh https://github.com/dlwh, are you going to publish this so I can
upgrade Spark 1.3 dependency before it's released? thanks.


Reply to this email directly or view it on GitHub
#373 (comment).

@dbtsai
Copy link
Contributor

dbtsai commented Mar 3, 2015

Is OWLQN affected by this issue? Spark 1.3 is RC2, so if we want to change, we need to be fast. Thanks.

@dlwh
Copy link
Member

dlwh commented Mar 3, 2015

ok, Debasish has been working on his new optimization algorithm. I imagine
that that's post Spark 1.3 in any case?

On Mon, Mar 2, 2015 at 6:49 PM, DB Tsai notifications@github.com wrote:

Is OWLQN affected by this issue? Spark 1.3 is RC2, so if we want to
change, we need to be fast. Thanks.


Reply to this email directly or view it on GitHub
#373 (comment).

@dlwh
Copy link
Member

dlwh commented Mar 3, 2015

owlqn's convergence conditions are also fixed by this.

On Mon, Mar 2, 2015 at 7:17 PM, David Hall david.lw.hall@gmail.com wrote:

ok, Debasish has been working on his new optimization algorithm. I imagine
that that's post Spark 1.3 in any case?

On Mon, Mar 2, 2015 at 6:49 PM, DB Tsai notifications@github.com wrote:

Is OWLQN affected by this issue? Spark 1.3 is RC2, so if we want to
change, we need to be fast. Thanks.


Reply to this email directly or view it on GitHub
#373 (comment).

@dbtsai
Copy link
Contributor

dbtsai commented Mar 3, 2015

Sounds great. Then let's publish the breeze with this fix, and I'll open another PR to Spark.

@dlwh
Copy link
Member

dlwh commented Mar 3, 2015

0.11 is released. Will take a bit to propagate to maven central.

asfgit pushed a commit to apache/spark that referenced this pull request Mar 4, 2015
…ce bug

LBFGS and OWLQN in Breeze 0.10 has convergence check bug.
This is fixed in 0.11, see the description in Breeze project for detail:

scalanlp/breeze#373 (comment)

Author: Xiangrui Meng <meng@databricks.com>
Author: DB Tsai <dbtsai@alpinenow.com>
Author: DB Tsai <dbtsai@dbtsai.com>

Closes #4879 from dbtsai/breeze and squashes the following commits:

d848f65 [DB Tsai] Merge pull request #1 from mengxr/AlpineNow-breeze
c2ca6ac [Xiangrui Meng] upgrade to breeze-0.11.1
35c2f26 [Xiangrui Meng] fix LRSuite
397a208 [DB Tsai] upgrade breeze

(cherry picked from commit 76e20a0)
Signed-off-by: Xiangrui Meng <meng@databricks.com>
asfgit pushed a commit to apache/spark that referenced this pull request Mar 4, 2015
…ce bug

LBFGS and OWLQN in Breeze 0.10 has convergence check bug.
This is fixed in 0.11, see the description in Breeze project for detail:

scalanlp/breeze#373 (comment)

Author: Xiangrui Meng <meng@databricks.com>
Author: DB Tsai <dbtsai@alpinenow.com>
Author: DB Tsai <dbtsai@dbtsai.com>

Closes #4879 from dbtsai/breeze and squashes the following commits:

d848f65 [DB Tsai] Merge pull request #1 from mengxr/AlpineNow-breeze
c2ca6ac [Xiangrui Meng] upgrade to breeze-0.11.1
35c2f26 [Xiangrui Meng] fix LRSuite
397a208 [DB Tsai] upgrade breeze
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants