@jkbradley
Member

What changes were proposed in this pull request?

Allow Spark 2.x to load instances of LDA, LocalLDAModel, and DistributedLDAModel saved from Spark 1.6.

How was this patch tested?

I tested this manually, saving the 3 types from 1.6 and loading them into master (2.x). In the future, we can add generic tests for testing backwards compatibility across all ML models in SPARK-15573.

@jkbradley
Member Author

To reviewers: This code was taken and modified from [https://github.com//pull/14112]. @GayathriMurali should be the primary author when we merge this into master and branch-2.0.

@jkbradley
Member Author

CC @hhbyyh Would you mind taking a look? Thanks!

@SparkQA

SparkQA commented Sep 9, 2016

Test build #65168 has finished for PR 15034 at commit a931b25.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

   */
  def getAndSetParams(model: LDAParams, metadata: Metadata): Unit = {
    VersionUtils.majorMinorVersion(metadata.sparkVersion) match {
      case (1, 6) =>
Contributor

We didn't support LDA serialization in 1.5, right?

Member Author

Nope, it was 1.6
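
For readers following this thread, here is a minimal, self-contained sketch of the version-dispatch idea in the snippet under review: parse the Spark version recorded in the saved metadata, then branch on `(major, minor)`. It is an illustration only, not the code in this patch. `VersionDispatchSketch`, `majorMinor`, and `loaderFor` are hypothetical names, and the regex is an assumed stand-in for whatever `VersionUtils.majorMinorVersion` does internally.

```scala
object VersionDispatchSketch {
  // Hypothetical parser: accepts strings such as "1.6", "1.6.2" or "2.1.0-SNAPSHOT".
  // This is an assumption for illustration, not Spark's VersionUtils.
  private val MajorMinor = """^(\d+)\.(\d+)([.-].*)?$""".r

  def majorMinor(version: String): (Int, Int) = version match {
    case MajorMinor(major, minor, _) => (major.toInt, minor.toInt)
    case _ =>
      throw new IllegalArgumentException(s"Cannot parse version: $version")
  }

  // Hypothetical dispatcher: returns a label for the loading path to take.
  // Only 1.6 needs special handling here, since LDA persistence first
  // appeared in 1.6 (as confirmed in the reply above).
  def loaderFor(sparkVersion: String): String = majorMinor(sparkVersion) match {
    case (1, 6) => "legacy-1.6"
    case _      => "default"
  }

  def main(args: Array[String]): Unit = {
    println(loaderFor("1.6.2")) // prints "legacy-1.6"
    println(loaderFor("2.0.0")) // prints "default"
  }
}
```

Matching on the parsed tuple keeps the legacy path isolated in a single `case`, so later format changes can be added as further cases without touching the default path.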

@hhbyyh
Contributor

hhbyyh commented Sep 10, 2016

Thanks @jkbradley. This matches what I had in mind. LGTM.

@jkbradley
Member Author

Thanks @hhbyyh !

@SparkQA

SparkQA commented Sep 19, 2016

Test build #3279 has finished for PR 15034 at commit a931b25.

  • This patch fails Spark unit tests.
  • This patch does not merge cleanly.
  • This patch adds no public classes.

@JoshRosen
Contributor

Jenkins, retest this please.

@SparkQA

SparkQA commented Sep 20, 2016

Test build #65621 has finished for PR 15034 at commit a931b25.

  • This patch fails MiMa tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@JoshRosen
Contributor

@jkbradley, it looks like this is legitimately failing MiMa (not sure why it passed on the first run...):

[error]  * the type hierarchy of object org.apache.spark.ml.clustering.LDA is different in current version. Missing types {org.apache.spark.ml.util.DefaultParamsReadable}
[error]    filter with: ProblemFilters.exclude[MissingTypesProblem]("org.apache.spark.ml.clustering.LDA$")

@jkbradley
Member Author

@JoshRosen That is weird; MiMa passes for me locally, but I see that it shouldn't. I added a MiMa exclusion for it; this should not be a problem for users.
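
For context, the filter that MiMa suggests in the error output above can be registered roughly as follows. This is a sketch, not the exact change in this PR: the wrapper object `LdaMimaExcludeSketch` and the value `ldaExcludes` are hypothetical names, and in practice the filter would presumably sit alongside the project's existing MiMa excludes. Only the `ProblemFilters.exclude[MissingTypesProblem](...)` call is taken from the log.

```scala
import com.typesafe.tools.mima.core._

// Hypothetical holder object, for illustration only.
object LdaMimaExcludeSketch {
  // Suppresses the reported change to the type hierarchy of the LDA companion
  // object (the missing DefaultParamsReadable parent), exactly as the MiMa
  // error message suggests.
  val ldaExcludes = Seq(
    ProblemFilters.exclude[MissingTypesProblem]("org.apache.spark.ml.clustering.LDA$")
  )
}
```

The trailing `$` in the class name targets the companion object rather than the `LDA` class itself, which matches the "object org.apache.spark.ml.clustering.LDA" wording in the error.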

@SparkQA

SparkQA commented Sep 22, 2016

Test build #65789 has finished for PR 15034 at commit 3b499df.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@jkbradley
Member Author

I'll go ahead and merge this. Thanks @hhbyyh for reviewing it!

@asfgit closed this in f4f6bd8 on Sep 22, 2016
jkbradley pushed a commit to jkbradley/spark that referenced this pull request Sep 22, 2016
Allow Spark 2.x to load instances of LDA, LocalLDAModel, and DistributedLDAModel saved from Spark 1.6.

I tested this manually, saving the 3 types from 1.6 and loading them into master (2.x).  In the future, we can add generic tests for testing backwards compatibility across all ML models in SPARK-15573.

Author: Joseph K. Bradley <joseph@databricks.com>

Closes apache#15034 from jkbradley/lda-backwards.
@jkbradley deleted the lda-backwards branch on September 22, 2016 23:43
asfgit pushed a commit that referenced this pull request Sep 23, 2016
… backport

## What changes were proposed in this pull request?

Allow Spark 2.x to load instances of LDA, LocalLDAModel, and DistributedLDAModel saved from Spark 1.6.
Backport of #15034 for branch-2.0

## How was this patch tested?

I tested this manually, saving the 3 types from 1.6 and loading them into master (2.x).  In the future, we can add generic tests for testing backwards compatibility across all ML models in SPARK-15573.

Author: Gayathri Murali <gayathri.m.softie@gmail.com>
Author: Joseph K. Bradley <joseph@databricks.com>

Closes #15205 from jkbradley/lda-backward-2.0.