Skip to content

Conversation

@zhengruifeng
Copy link
Contributor

What changes were proposed in this pull request?

ClusteringEvaluator support array input

How was this patch tested?

added tests

@SparkQA
Copy link

SparkQA commented Jun 14, 2018

Test build #91828 has finished for PR 21563 at commit b126bd4.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Jun 14, 2018

Test build #91830 has finished for PR 21563 at commit d6e76e3.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure this is the right way. Probably we can face the same issue everywhere we are using DatasetUtils.columnToVector. Probably it is better to fix the problem there.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mgaido91 Thanks for your reviewing!
I have considered this, however there exists a problem:
if we want to append metadata into the transformed column (like using method .as(alias: String, metadata: Metadata)) in DatasetUtils.columnToVector, how can we get the name of transformed column?
The only way to do this I know is:

val metadata = ...
val vectorCol = ..
val vectorName = dataset.select(vectorCol) .schema.head.name
vectorCol.as(vectorName, metadata)

Copy link
Contributor

@mgaido91 mgaido91 Jun 15, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we have the new column we are returning, so we can get its name with .expr.sql

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mgaido91 I think it maybe nice to first add a name getter for column

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can propose that

@SparkQA
Copy link

SparkQA commented Jun 14, 2018

Test build #91836 has finished for PR 21563 at commit eb6022c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@zhengruifeng zhengruifeng force-pushed the clu_eval_support_array branch from eb6022c to 9064e7b Compare July 26, 2018 07:02
@zhengruifeng
Copy link
Contributor Author

@mgaido91 I am sorry to make a force push to update my git username in this PR.
Since I found that my current PRs are not linked to my account and it is troublesome to track them.

and I found that .expr.sql may not be a good way to get the output column name:

val df = Seq((1,2),(3,4),(1,1),(2,-1),(3,1)).toDF("a","b")
val z = (col("a") + col("b")).as("x").as("z")
scala> z.expr.sql
res15: String = (`a` + `b`) AS `x` AS `z`

So I still keep the orignal method.

@MLnick @mengxr @jkbradley Would you please help reveiwing this? Thanks!

@SparkQA
Copy link

SparkQA commented Jul 26, 2018

Test build #93586 has finished for PR 21563 at commit 9064e7b.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@zhengruifeng
Copy link
Contributor Author

retest this please

@SparkQA
Copy link

SparkQA commented Aug 1, 2018

Test build #93863 has finished for PR 21563 at commit 9064e7b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@zhengruifeng
Copy link
Contributor Author

@mengxr I notice that you open a ticket for supporting integer type labels in ClusteringEvalutator, would you like to shepherd this pr too?

@mengxr
Copy link
Contributor

mengxr commented Aug 2, 2018

LGTM. Merged into master. Thanks!

@asfgit asfgit closed this in 57d9949 Aug 2, 2018
@zhengruifeng zhengruifeng deleted the clu_eval_support_array branch August 2, 2018 07:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants