[SPARK-21642][CORE] Use FQDN for DRIVER_HOST_ADDRESS instead of ip address #18846
|
OK to test |
|
This looks reasonable, cc @cloud-fan |
|
ok to test |
|
Should we also apply this change to |
|
Test build #80323 has finished for PR 18846 at commit
|
|
I'm in favor of the change, but I don't know the history of why it's an IP to start with, so we should make sure it's not going to break any use cases. |
|
retest this please |
|
It seems that Spark started using the IP address in SPARK-6440 (#5424). As far as I can tell from that change, it was introduced only for visibility. |
|
Sorry, I forgot to answer one of your questions.
Yes, I tested it manually. InetAddress.getCanonicalHostName() returned the IP address when the reverse DNS lookup failed. |
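For anyone who wants to reproduce the fallback behavior described above, here is a minimal sketch. It is not code from this patch; the class and method names are hypothetical. It relies only on the documented JDK behavior that getCanonicalHostName() is a best-effort lookup and returns the textual IP address when no name can be determined.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical demo class, not part of Spark.
final class CanonicalHostNameDemo {

    // Best-effort FQDN lookup. The JDK returns the textual IP address
    // when the reverse lookup cannot produce a name.
    static String canonicalNameOf(InetAddress address) {
        return address.getCanonicalHostName();
    }

    // True when the lookup fell back to the plain IP address.
    static boolean fellBackToIp(InetAddress address) {
        return canonicalNameOf(address).equals(address.getHostAddress());
    }

    public static void main(String[] args) throws UnknownHostException {
        InetAddress local = InetAddress.getLocalHost();
        System.out.println("address:         " + local.getHostAddress());
        System.out.println("canonical name:  " + canonicalNameOf(local));
        System.out.println("fell back to IP: " + fellBackToIp(local));
    }
}
```

The printed values depend on the DNS setup of the machine running it, so no expected output is given here.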
|
Test build #80347 has finished for PR 18846 at commit
|
|
Does this affect how we detect driver/executor connection lost? |
|
As I understand it, this does not affect the driver/executor heartbeat behavior, since this PR only changes DRIVER_HOST_ADDRESS. The reason I want this change is that the clients of the Spark context web UI are not Spark components (they are web browsers and the Hadoop ResourceManager's web proxy). Since it is difficult to change these clients' behavior, I thought we should change the behavior on the Spark side. |
|
retest this please |
|
@thideeeee I think DRIVER_HOST_ADDRESS will be used to generate the driver url. Could you check if this line still works after your change? spark/core/src/main/scala/org/apache/spark/executor/CoarseGrainedExecutorBackend.scala Line 131 in b35660d
|
|
Test build #80443 has finished for PR 18846 at commit
|
|
retest this please |
|
Test build #80446 has finished for PR 18846 at commit
|
|
@zsxwing Thank you for the pointer. I tested manually, and as far as I can tell, Spark works as expected with this patch applied. I was able to confirm that the driver and executors shut down when their connection is lost. On the other hand, all the test builds failed. Each failure had a different cause, so the tests appear to be failing randomly. I'm still looking into why they failed. |
|
retest this please |
|
Test build #80535 has finished for PR 18846 at commit
|
|
retest this please |
|
retest this please |
|
Jenkins, test this please |
|
Test build #80752 has finished for PR 18846 at commit
|
|
LGTM, merging to master! |
…dress

## What changes were proposed in this pull request?

The patch lets spark web ui use FQDN as its hostname instead of ip address.

In current implementation, ip address of a driver host is set to DRIVER_HOST_ADDRESS. This becomes a problem when we enable SSL using "spark.ssl.enabled", "spark.ssl.trustStore" and "spark.ssl.keyStore" properties. When we configure these properties, spark web ui is launched with SSL enabled and the HTTPS server is configured with the custom SSL certificate you configured in these properties.

In this case, client gets javax.net.ssl.SSLPeerUnverifiedException exception when the client accesses the spark web ui because the client fails to verify the SSL certificate (Common Name of the SSL cert does not match with DRIVER_HOST_ADDRESS).

To avoid the exception, we should use FQDN of the driver host for DRIVER_HOST_ADDRESS.

Error message that client gets when the client accesses spark web ui:
javax.net.ssl.SSLPeerUnverifiedException: Certificate for <10.102.138.239> doesn't match any of the subject alternative names: []

## How was this patch tested?

manual tests

Author: Hideaki Tanaka <tanakah@amazon.com>

Closes apache#18846 from thideeeee/SPARK-21642.

(cherry picked from commit d695a52)
What changes were proposed in this pull request?
The patch lets the Spark web UI use the FQDN as its hostname instead of the IP address.
In the current implementation, DRIVER_HOST_ADDRESS is set to the IP address of the driver host. This becomes a problem when SSL is enabled using the "spark.ssl.enabled", "spark.ssl.trustStore" and "spark.ssl.keyStore" properties. With these properties configured, the Spark web UI is launched with SSL enabled and the HTTPS server uses the custom SSL certificate specified in them.
In this case, the client gets a javax.net.ssl.SSLPeerUnverifiedException when it accesses the Spark web UI, because it fails to verify the SSL certificate (the Common Name of the SSL certificate does not match DRIVER_HOST_ADDRESS).
To avoid the exception, we should use the FQDN of the driver host for DRIVER_HOST_ADDRESS.
Error message that the client gets when it accesses the Spark web UI:
javax.net.ssl.SSLPeerUnverifiedException: Certificate for <10.102.138.239> doesn't match any of the subject alternative names: []
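To illustrate why the client rejects the certificate, here is a deliberately simplified sketch of hostname verification. It is not the verifier used by any real HTTP client or by the JDK: real verifiers also handle wildcards and distinguish DNS names from IP entries in the subject alternative names. The class name, method name, and the hostname "driver.example.internal" are all hypothetical. The point is only that a certificate issued for an FQDN cannot match a URL that addresses the server by IP.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical, simplified illustration; not a real TLS hostname verifier.
final class HostnameMatchSketch {

    // Returns true when the host the client connected to equals one of the
    // names presented by the certificate (case-insensitive exact match only).
    static boolean matches(String requestedHost, List<String> certificateNames) {
        String wanted = requestedHost.toLowerCase(Locale.ROOT);
        for (String name : certificateNames) {
            if (name.toLowerCase(Locale.ROOT).equals(wanted)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Assumed certificate name for the driver host (hypothetical FQDN).
        List<String> certNames = Arrays.asList("driver.example.internal");

        // Web UI advertised by IP address: no match, verification fails.
        System.out.println(matches("10.102.138.239", certNames)); // prints false

        // Web UI advertised by FQDN: the name matches the certificate.
        System.out.println(matches("driver.example.internal", certNames)); // prints true
    }
}
```

With DRIVER_HOST_ADDRESS set to the FQDN, the URL handed to browsers and the ResourceManager's web proxy carries a name that the certificate can actually cover.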
How was this patch tested?
manual tests