[SPARK-17437] Add uiWebUrl to JavaSparkContext and pyspark.SparkContext #15000
Conversation
|
That's probably OK; Java users can already pretty easily get this anyway but not Python users. |
|
Jenkins test this please |
|
Test build #65057 has finished for PR 15000 at commit
|
|
LGTM |
|
My only hesitation about this is that this property really only exists to print it in the shell. Is there a good use case for it otherwise? I know it's minor but want to make sure we're not just doing this for parity. |
|
Well, here's the use case I want it for: I'm building some plugins for JupyterHub to make it more Spark-aware, and I want to be able to link the user out to the right WebUI for their kernel. Short of somehow making the launcher override the user's own `spark.ui.port`, I don't have a reliable way to find that URL from the outside. |
|
Is this Java or Pyspark? In Java you can still get this property directly from the underlying SparkContext. |
|
PySpark. I don't think anyone runs Java through Jupyter, haha. |
|
Ah right, dumb question. Yeah, I think it makes some sense ... maybe not even for Java, because there are lots of methods we don't plumb through since you can easily access them directly from Scala. Python, OK. |
|
@srowen: Just to make sure I understand, are you asking me to remove the Java accessor here, and just plumb straight through to the Scala object from PySpark? Or is it fine as-is? |
|
Looking at context.py, it seems that it accesses things directly from SparkContext via `_jsc.sc()` where possible. I think that means you can just expose this in PySpark without exposing it separately in JavaSparkContext. |
|
Yeah, I just tried the following statement and it works. But I think there's no harm in exposing it in JavaSparkContext as well.
|
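The statement in question is presumably the direct py4j call into the Scala `SparkContext`; a minimal sketch, assuming a live PySpark session (the app name here is made up):

```python
from pyspark import SparkContext

sc = SparkContext(appName="ui-url-check")  # hypothetical app name
# _jsc is the py4j handle to the JavaSparkContext; .sc() returns the
# underlying Scala SparkContext, whose uiWebUrl is an Option[String],
# hence the .get() (this assumes the UI is enabled).
print(sc._jsc.sc().uiWebUrl().get())
sc.stop()
```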
I don't see value in exposing it, and many other things aren't exposed via JSC. It's really only things that need a different API to make sense in Java. |
Remove the Java accessor and leave only the PySpark one, which now plumbs straight to the Scala SparkContext.
|
As requested, removed the Java property and left only the PySpark one. I'll admit I didn't appreciate that you could access the Scala SparkContext straight from PySpark originally (I figured the property would have to be propagated through the Java wrapper first), so the urgency of this patch is much lessened for me now that I know I can just go through `_jsc.sc()` directly.
|
It's still probably reasonable to plumb it through but I'll leave it open a bit for comments. |
|
@davies do you have an opinion? |
|
Test build #3280 has finished for PR 15000 at commit
|
|
merged to master |
|
Hey guys, how can I do the same thing using SparkR?
What changes were proposed in this pull request?
The Scala version of `SparkContext` has a handy field called `uiWebUrl` that tells you which URL the SparkUI spawned by that instance lives at. This is often very useful, because the value for `spark.ui.port` in the config is only a suggestion: if that port is taken by another Spark instance on the same machine, Spark will just keep incrementing the port until it finds a free one. So, on a machine with a lot of running PySpark instances, you often have to start trying all of them one-by-one until you find your application name.

Scala users have a way around this with `uiWebUrl`, but Java and Python users do not. This pull request fixes this in the most straightforward way possible, simply propagating this field through the `JavaSparkContext` and into PySpark through the Java gateway.

Please let me know if any additional documentation/testing is needed.
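For reference, the PySpark side of the change ends up as a thin property on `pyspark.SparkContext`; a sketch of roughly what the final, Java-accessor-free version discussed above looks like, not necessarily the verbatim patch:

```python
# Inside pyspark/context.py, on the SparkContext class (sketch):
@property
def uiWebUrl(self):
    """Return the URL of the SparkUI instance started by this SparkContext."""
    # Delegate through py4j to the Scala SparkContext's Option[String].
    return self._jsc.sc().uiWebUrl().get()
```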
How was this patch tested?
Existing tests were run to make sure there were no regressions, and a binary distribution was created and tested manually for the correct value of `sc.uiWebUrl` in a variety of circumstances.
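With the patch applied, one of those manual checks would look something like the following sketch; the app name is made up, and the reported address will vary (in particular, the port will be incremented if another app already holds `spark.ui.port`'s default of 4040):

```python
from pyspark import SparkContext

sc = SparkContext(appName="ui-check")  # hypothetical app name
# The new property; prints e.g. an http://<host>:<port> address,
# reflecting the port Spark actually bound rather than the configured one.
print(sc.uiWebUrl)
sc.stop()
```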