Skip to content

Conversation

@BryanCutler
Copy link
Member

Currently, JavaWrapper is only a wrapper class for pipeline classes that have Params and JavaCallable is a separate mixin that provides methods to make Java calls. This change simplifies the class structure and to define the Java wrapper in a plain base class along with methods to make Java calls. Also, renames Java wrapper classes to better reflect their purpose.

Ran existing Python ml tests and generated documentation to test this change.

@BryanCutler
Copy link
Member Author

@jkbradley I noticed a couple of potential issues with current JavaCallable that I took care of here. First, trying to detach the Java object on __del__ could cause a problem if the Python object was copied and one instance gets garbage collected. This would cause the remaining Python objects to have an invalid Java object reference, so I removed the call to detach. Second, assigning the Spark context in the constructor as self._sc = SparkContext._active_spark_context could be a problem if the Python object is created outside an active SparkContext and then the user does something to _call_java within a valid context. So I think it is better to only use SparkContext._active_spark_context when needed and calls will fail then if outside a valid context.

@SparkQA
Copy link

SparkQA commented Apr 11, 2016

Test build #55532 has finished for PR 12304 at commit c1d41c7.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@jkbradley
Copy link
Member

Thanks for the PR. This reorg looks fine. I agree with what you said about del and active_spark_context.

My main request is that "JavaWrapperParams" be renamed to "JavaParams" to match JavaModel, JavaEstimator, etc.

Also, in util.py lines 102, 181: Update "JavaWrapper" to "JavaParams" in comments

@BryanCutler
Copy link
Member Author

I was thinking that the other classes, like JavaModel, actually wrap that object so I used a slightly different name - but I agree, JavaParams sounds much better and it's an abstract class that will end up wrapping an object with Params, so I think it's fine.

@BryanCutler
Copy link
Member Author

I made that change and fixed the comments in util.py

@SparkQA
Copy link

SparkQA commented Apr 12, 2016

Test build #55661 has finished for PR 12304 at commit 8121a3d.

  • This patch fails MiMa tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class JavaEvaluator(JavaParams, Evaluator):
    • class JavaParams(JavaWrapper, Params):
    • class JavaEstimator(JavaParams, Estimator):
    • class JavaTransformer(JavaParams, Transformer):

@SparkQA
Copy link

SparkQA commented Apr 12, 2016

Test build #55665 has finished for PR 12304 at commit 0fa51bb.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@jkbradley
Copy link
Member

LGTM
Merging with master
Thanks!

@asfgit asfgit closed this in fc3cd2f Apr 13, 2016
@BryanCutler BryanCutler deleted the pyspark-cleanup-JavaWrapper-SPARK-14472 branch December 2, 2016 00:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants