@MLnick
Contributor

@MLnick MLnick commented Jun 27, 2016

This PR adds the breaking changes from SPARK-14810 to the migration guide.

How was this patch tested?

Built docs locally.

@MLnick
Contributor Author

MLnick commented Jun 27, 2016

Will be merged once #13378 is merged.

@SparkQA

SparkQA commented Jun 27, 2016

Test build #61303 has finished for PR 13924 at commit 28e0412.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@jkbradley
Member

I just merged #13378

@MLnick MLnick force-pushed the SPARK-15643-migration-guide branch from 28e0412 to 6ef09a3 Compare June 28, 2016 20:29
@MLnick MLnick changed the title from "[WIP][SPARK-15643][DOC][ML] Add breaking changes to ML migration guide" to "[SPARK-15643][DOC][ML] Add breaking changes to ML migration guide" Jun 28, 2016
@MLnick
Contributor Author

MLnick commented Jun 28, 2016

@yanboliang @jkbradley @mengxr updated.

@SparkQA

SparkQA commented Jun 28, 2016

Test build #61408 has finished for PR 13924 at commit 6ef09a3.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.


**Linear algebra classes for DataFrame-based APIs**

Spark's linear algebra dependencies were moved to a new project, `spark-mllib-local`
Member

Should be "mllib-local" (no "spark-")

@jkbradley
Member

Done with review pass. Thanks for the PR!


# convert DataFrame columns
convertedVecDF = MLUtils.convertVectorColumnsToML(vecDF)
convertedMatrixDF = MLUtils.convertMatrixColumnsToML(matrixDF)
Contributor Author

@MLnick MLnick Jun 29, 2016

Note: it looks like we don't have the single-instance conversion methods `asML` / `fromML` in the Python linalg classes (I commented on SPARK-15944).

Not sure whether this is intended or we just missed them. One can do `newVec = Vectors.dense(oldVec)` (or vice versa for sparse) in Python directly, so if that is the expected way to do things I can add that here.

Member

That may have just been overlooked, but that's a good point that there is already a decent way to do the conversion. Could you please just note that way here?

Contributor Author

@jkbradley Ah sorry, I misspoke. It happens to work for dense vectors because it effectively calls `np.array(DenseVector)`, but not for sparse. The workaround is fairly ugly: `mlSV = NewVectors.sparse(mllibSV.size, zip(mllibSV.indices, mllibSV.values))`, or something similar.

I'd say we should have some convenience methods like in Scala/Java?
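
For reference, a minimal sketch of the sparse workaround described above. It assumes only that the old-style vector exposes `size`, `indices` and `values`; the `StandInSparseVector` type and the `sparse_parts` helper are hypothetical names used so the snippet runs without a Spark installation, and they are not part of any Spark API.

```python
from collections import namedtuple

# Hypothetical stand-in exposing the same attributes as a
# pyspark.mllib.linalg.SparseVector (size, indices, values).
StandInSparseVector = namedtuple(
    "StandInSparseVector", ["size", "indices", "values"]
)


def sparse_parts(vec):
    """Return (size, [(index, value), ...]) for a sparse vector-like object."""
    # Cast to plain int/float so NumPy scalar types do not leak through.
    pairs = [(int(i), float(v)) for i, v in zip(vec.indices, vec.values)]
    return vec.size, pairs


mllibSV = StandInSparseVector(size=4, indices=[0, 2], values=[1.0, 3.0])
size, pairs = sparse_parts(mllibSV)
print(size, pairs)

# With pyspark available, the new-style vector would then be built as:
#   from pyspark.ml.linalg import Vectors as NewVectors
#   mlSV = NewVectors.sparse(size, pairs)
```

Materializing the pairs as a list, rather than passing a bare `zip`, keeps the result reusable and avoids depending on how the constructor consumes an iterator.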

Contributor Author

@MLnick MLnick Jun 30, 2016

Created SPARK-16328 and #13997.

@SparkQA

SparkQA commented Jun 29, 2016

Test build #61464 has finished for PR 13924 at commit ac49f31.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@jkbradley
Member

The changes look good, so just the Python item remains. Thanks!

@MLnick
Contributor Author

MLnick commented Jun 30, 2016

@jkbradley updated Python example assuming #13997 will get merged - refer #13924 (comment).

@SparkQA

SparkQA commented Jun 30, 2016

Test build #61545 has finished for PR 13924 at commit c2ce7cd.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jun 30, 2016

Test build #61546 has finished for PR 13924 at commit 919bfe9.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@jkbradley
Member

LGTM
Merging with master and branch-2.0 now that #13997 has been merged
Thank you!

asfgit pushed a commit that referenced this pull request Jul 1, 2016
This PR adds the breaking changes from [SPARK-14810](https://issues.apache.org/jira/browse/SPARK-14810) to the migration guide.

## How was this patch tested?

Built docs locally.

Author: Nick Pentreath <nickp@za.ibm.com>

Closes #13924 from MLnick/SPARK-15643-migration-guide.

(cherry picked from commit 4a981dc)
Signed-off-by: Joseph K. Bradley <joseph@databricks.com>
@asfgit asfgit closed this in 4a981dc Jul 1, 2016