[SPARK-25337][SQL][TEST] runSparkSubmit should provide non-testing mode
#22340
runSparkSubmit should provide non-testing mode
|
cc @srowen and @cloud-fan |
|
Test build #95705 has finished for PR 22340 at commit
|
|
How does the non-test mode resolve the class path issue? |
|
Yeah, same question, but I see why that could cause a problem. Is the point here that while this is a test, the spark-submit run by the test should be run 'normally'? I am happy with a solution; I just want to understand the implications. I wonder why the classpath stuff hasn't caused a problem before if so, but who knows? |
|
Previously, in the class path, the new Spark classes were behind the old Spark classes, so the new ones were unseen. However, Spark 2.4.0 reveals this bug due to the recent data source class changes. |
|
Also, cc @gatorsmile . |
|
I see. It does seem like we don't want to run with test env variables in this context. I was going to ask whether we ever do. Should this function always strip the env variables for testing? I can see being conservative and restricting it to this case. It seems like just stripping |
|
|
Thank you for approval, @srowen . |
|
Could you review this Scala-2.12 PR again, @cloud-fan and @gatorsmile ? |
|
Seems okay to me too |
|
Thank you, @HyukjinKwon ! |
|
Merged to master (which, it looks like, is still 2.4) |
|
Thank you for merging, @srowen . |
|
It's great to have this in Spark 2.4! |
What changes were proposed in this pull request?
The HiveExternalCatalogVersionsSuite Scala-2.12 test has been failing due to a class path issue. It is marked as ABORTED because it fails at beforeAll during the data population stage.
The root cause of the failure is that runSparkSubmit mixes 2.4.0-SNAPSHOT classes and old Spark (2.1.3/2.2.2/2.3.1) classes together during spark-submit. This PR aims to provide a non-test execution mode to runSparkSubmit by removing the testing-related environment variables.
Previously, in the class path, the new Spark classes were behind the old Spark classes, so the new ones were unseen. However, Spark 2.4.0 reveals this bug due to the recent data source class changes.
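As a rough illustration of the non-test mode idea, the sketch below builds the environment for a spawned process and drops test-only variables unless testing mode is requested. It is written in Python for brevity and is not the Scala code from this patch. The helper name child_env, the is_spark_testing parameter, and the two variable names in TEST_ONLY_VARS are assumptions chosen for the example; the PR's actual list of removed settings is not reproduced in this text.

```python
# Hypothetical list of test-only variables (an assumption for this sketch,
# not the list from the PR).
TEST_ONLY_VARS = ("SPARK_TESTING", "SPARK_PREPEND_CLASSES")


def child_env(base_env, is_spark_testing):
    """Return a copy of base_env suitable for a spawned spark-submit.

    When is_spark_testing is False, the test-only variables are removed so
    that the child process starts as a normal, non-test run.
    """
    env = dict(base_env)  # copy, so the caller's environment is untouched
    if not is_spark_testing:
        for name in TEST_ONLY_VARS:
            env.pop(name, None)  # ignore variables that are not set
    return env


if __name__ == "__main__":
    parent = {"SPARK_TESTING": "1", "SPARK_PREPEND_CLASSES": "1", "PATH": "/usr/bin"}
    print(sorted(child_env(parent, is_spark_testing=False)))
```

The resulting mapping could then be passed to a process launcher (for example, the env argument of subprocess.run), so that an old Spark release started from the test does not inherit settings meant for the build under test.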
How was this patch tested?
Manual test. After merging, it will be tested via Jenkins.