
Conversation

@mariusvniekerk
Member

What changes were proposed in this pull request?

Adds a flag to sc.addJar to add the jar to the current classloader

How was this patch tested?

Unit tests, manual tests

This is a continuation of the pull request in #9313, and is mostly a rebase of that work onto master with SparkR additions.
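
For context, a minimal sketch of the proposed usage from Scala, assuming the addJar(path, addToCurrentClassLoader) overload discussed in the review below (the jar path is a placeholder):

import org.apache.spark.{SparkConf, SparkContext}

val sc = new SparkContext(new SparkConf().setAppName("addJar-demo").setMaster("local[*]"))

// Existing behavior: distribute the jar to executors for future tasks.
sc.addJar("/path/to/extra.jar")

// Proposed flag: additionally add the jar to the driver's current
// classloader, so its classes can be used from the driver process.
sc.addJar("/path/to/extra.jar", addToCurrentClassLoader = true)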

cc @holdenk

@HyukjinKwon
Member

cc @felixcheung too.

@shivaram
Contributor

Jenkins, ok to test

@SparkQA

SparkQA commented Oct 29, 2016

Test build #67744 has finished for PR 15666 at commit 2b1e98e.

  • This patch fails R style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Oct 29, 2016

Test build #67759 has finished for PR 15666 at commit 7f37d3a.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Oct 29, 2016

Test build #67773 has finished for PR 15666 at commit 9d838b3.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • |public class $className implements java.io.Serializable

@SparkQA

SparkQA commented Oct 29, 2016

Test build #67774 has finished for PR 15666 at commit d4416d9.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Oct 30, 2016

Test build #67776 has finished for PR 15666 at commit fccb141.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.


#' Adds a JAR dependency for all tasks to be executed on this SparkContext in the future.
#'
#' The `path` passed can be either a local file, a file in HDFS (or other Hadoop-supported
Member

use \code{path} instead of path in R doc


val uri = if (path.contains("\\")) {
// For local paths with backslashes on Windows, URI throws an exception
key = env.rpcEnv.fileServer.addJar(new File(path))
Member

this seems to be changing existing behavior?

Member Author

So this change gets a URI for the Windows path, which is used later on to construct a File instance. That should allow the Windows special case to work.
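
A small Scala sketch of the special case being discussed: java.net.URI rejects raw backslash paths, while going through File first yields a usable file: URI (the path below is a hypothetical Windows path):

import java.io.File
import java.net.URI

val winPath = "C:\\Users\\me\\libs\\extra.jar"  // hypothetical local Windows path

// new URI(winPath) would throw URISyntaxException: '\' is an illegal URI character.
// Going through File first produces a parseable file: URI on Windows,
// e.g. file:/C:/Users/me/libs/extra.jar ...
val uri: URI = new File(winPath).toURI

// ... which can later be used to construct a File instance again.
val file = new File(uri)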

Contributor

So if we construct a File and then go with toURI it's safe? Have we tested this on Windows?

addJar(path, false)
}

def addJar(path: String, addToCurrentClassLoader: Boolean) {
Member

@felixcheung felixcheung Oct 31, 2016

instead of changing SparkContext.addJar, a broad public API, would it be better to make a narrower change in a Python/R-only helper?

Member Author

Keeping it in Scala makes it simpler for other Spark Scala interpreters (e.g. Toree, Zeppelin) to make use of this.

Contributor

What if we made the Scala API a Developer API, since we might end up wanting to change the Scala side in the future?
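
A sketch of what that suggestion could look like, using Spark's org.apache.spark.annotation.DeveloperApi annotation on a hypothetical SparkContextLike stand-in:

import org.apache.spark.annotation.DeveloperApi

class SparkContextLike {  // hypothetical stand-in for SparkContext
  // The stable public API keeps its current behavior.
  def addJar(path: String): Unit = addJar(path, addToCurrentClassLoader = false)

  // The new overload is marked as a developer API, signalling it may change.
  @DeveloperApi
  def addJar(path: String, addToCurrentClassLoader: Boolean): Unit = {
    // jar distribution logic, plus the optional classloader update, would go here
  }
}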

@SparkQA

SparkQA commented Nov 1, 2016

Test build #67860 has finished for PR 15666 at commit 26b39de.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Member

Build started: [SparkR] ALL PR-15666
Diff: master...spark-test:3930DC55-94D4-4307-8E77-BF16007C6AAD

@mariusvniekerk
Member Author

@HyukjinKwon there seems to be something weird with the AppVeyor checks?

@HyukjinKwon
Member

@mariusvniekerk Yes, it was (please refer to #15709). There is a problem re-triggering the test (it seems even committers are unable to do so). So I ran this via another account and left a comment above; I guess it is fine now. If you are worried about the failure mark, as a (a bit ugly) workaround you could close and re-open this to re-trigger. I just tested this in spark-test#11.

@mariusvniekerk
Member Author

Rebased.

@SparkQA

SparkQA commented Dec 17, 2016

Test build #70293 has finished for PR 15666 at commit 1b1a0e9.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Contributor

@holdenk holdenk left a comment

Thanks for picking this up :) Would be good to have @JoshRosen or someone take a look as well. I'll try to do a more thorough read-through later this week :)

Contributor

I know this is moved code, but could we just use createJarWithClasses, or do the R tests require these inner functions?

If we do keep it, though, I'd give it a clearer name and description, because without reading the code createDummyJar and createJarWithClasses don't seem all that different.

Member Author

Yeah, when I wrote this that didn't exist yet. Changing.

Member Author

@mariusvniekerk mariusvniekerk Dec 23, 2016

The R tests do indeed verify that they can call the internal functions.

I can revert that part of the changes.

In R\pkg\inst\tests\testthat\jarTest.R

helloTest <- SparkR:::callJStatic("sparkrtest.DummyClass",
                                  "helloWorld",
                                  "Dave")
stopifnot(identical(helloTest, "Hello Dave"))

basicFunction <- SparkR:::callJStatic("sparkrtest.DummyClass",
                                      "addStuff",
                                      2L,
                                      2L)
stopifnot(basicFunction == 4L)

@SparkQA

SparkQA commented Dec 23, 2016

Test build #70534 has finished for PR 15666 at commit 73df5a4.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@mariusvniekerk
Member Author

@holdenk Anything I can do from my side to help this along?

@holdenk
Contributor

holdenk commented Feb 1, 2017

Thanks for the ping. I was running into an issue this would have simplified yesterday, so I was meaning to follow up. I'll see if maybe @felixcheung is around for Spark Summit East and we could go over it together in person there (otherwise I'll see if I can find another reviewer to rope in with me while I'm there :)).

@mariusvniekerk
Member Author

Yeah I'll be there

Contributor

@holdenk holdenk left a comment

Great, let's meet up and chat IRL. I've got some questions with regard to the Windows handling. Maybe we can find @felixcheung at Spark Summit East too and get back up to speed quickly on the state of this, because it would be nice to have :)

addJar(path, false)
}

def addJar(path: String, addToCurrentClassLoader: Boolean) {

Contributor

"attempt to" is interesting wording here - when would we expect this to fail, and should we explicitly call that out?

Member Author

Added a note to the doc that adding a URL that is already present in the classloader has no effect.
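
For reference, java.net.URLClassLoader.addURL ignores a URL that is already in its search list, which is the behavior the doc note describes. A minimal sketch (Spark's MutableURLClassLoader similarly just widens addURL's visibility):

import java.net.{URL, URLClassLoader}

class DemoLoader extends URLClassLoader(Array.empty[URL]) {
  override def addURL(url: URL): Unit = super.addURL(url)  // widen protected to public
}

val loader = new DemoLoader
val jar = new URL("file:/tmp/example.jar")  // hypothetical jar URL
loader.addURL(jar)
loader.addURL(jar)  // no effect: the URL is already in the list
assert(loader.getURLs.count(_ == jar) == 1)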


val uri = if (path.contains("\\")) {
// For local paths with backslashes on Windows, URI throws an exception
key = env.rpcEnv.fileServer.addJar(new File(path))

Contributor

So I'm a bit confused: we already supposedly have a safe URI object for Windows above; does it throw an exception on getPath? Or is it that \ is always local and getScheme doesn't return that?

Member Author

If we have backslashes, we are dealing with a local path on Windows.

Contributor

Add a :param: annotation here as well for path & addToCurrentClassLoader

@SparkQA

SparkQA commented Feb 9, 2017

Test build #72650 has finished for PR 15666 at commit 21643b5.

  • This patch fails Python style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Feb 9, 2017

Test build #72657 has finished for PR 15666 at commit 9fbcd9b.

  • This patch fails Python style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@holdenk
Contributor

holdenk commented Feb 13, 2017

Thanks for adding the R-side tests :) Since this touches core (albeit as a developer API) we should probably also run this by someone who's worked on addJar (like @brkyvz).

@mariusvniekerk
Member Author

Ah thanks

destDir, "sparkrTests", "DummyClassForAddJarTest")

spark.addJar(jarName, addToCurrentClassLoader = TRUE)
testClass <- newJObject("sparkrTests.DummyClassForAddJarTest")
Member

Hm.. it seems it's this line this time.

1. Error: add jar should work and allow usage of the jar on the driver node (@test_context.R#178) 
 
1: newJObject("sparkrTests.DummyClassForAddJarTest") at C:/projects/spark/R/lib/SparkR/tests/testthat/test_context.R:178
2: invokeJava(isStatic = TRUE, className, methodName = "<init>", ...)
3: handleErrors(returnStatus, conn)
4: stop(readString(conn))

Member

Maybe a helpful log:

ERROR RBackendHandler: <init> on sparkrTests.DummyClassForAddJarTest failed
java.lang.ClassNotFoundException: sparkrTests.DummyClassForAddJarTest
	at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
	at java.lang.Class.forName0(Native Method)
	at java.lang.Class.forName(Class.java:348)
	at org.apache.spark.util.Utils$.classForName(Utils.scala:230)
	at org.apache.spark.api.r.RBackendHandler.handleMethodCall(RBackendHandler.scala:143)
	at org.apache.spark.api.r.RBackendHandler.channelRead0(RBackendHandler.scala:108)
	at org.apache.spark.api.r.RBackendHandler.channelRead0(RBackendHandler.scala:40)
	at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:343)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:336)
	at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:287)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:343)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:336)
	at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:343)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:336)
	at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:293)
	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:267)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:343)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:336)
	at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1294)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:343)
	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:911)
	at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:643)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:566)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:480)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:442)
	at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:131)
	at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:144)
	at java.lang.Thread.run(Thread.java:745)

Member Author

Yeah, I suspect that the Windows path didn't make it properly into the classloader.

Member

@HyukjinKwon HyukjinKwon left a comment

I just left a few super minor comments. If commits are going to be pushed anyway, maybe these could be improved.

Let me try to take a look this weekend, as I have an environment to test this on Windows.

val excSource = new JavaSourceFromString(new File(srcDir, className).toURI.getPath,
s"""package $packageName;
|
|public class $className implements java.io.Serializable {
Member

(This indentation seems inconsistent BTW)

| public static int addStuff(int arg1, int arg2) { return arg1 + arg2; }
|}
""".
stripMargin)
Member

(We could make this inlined.)

Member

@HyukjinKwon HyukjinKwon Oct 5, 2017

We could make this inlined.


for (
schedulingMode <- Seq("local_mode", "non_local_mode")
) {
Member

Just personal taste.. maybe

Seq("local_mode", "non_local_mode").foreach { schedulingMode =>
  ...
}

?

val currentCL = Utils.getContextOrSparkClassLoader
currentCL match {
case cl: MutableURLClassLoader =>
cl.addURL(uri.toURL)
Member

(We could make this inlined too..)
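
A sketch of the inlining being suggested, keeping the names from the diff (Utils.getContextOrSparkClassLoader, MutableURLClassLoader, uri) and assuming it stays inside SparkContext where logWarning is in scope:

// Match on the classloader directly instead of binding it to a val first.
Utils.getContextOrSparkClassLoader match {
  case cl: MutableURLClassLoader => cl.addURL(uri.toURL)
  case other => logWarning(s"Unsupported class loader $other will not update jars in the thread class loader")
}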

@HyukjinKwon
Member

Feel free to push any commit. Let me trigger the build with my account like a bot.

@SparkQA

SparkQA commented Mar 6, 2017

Test build #74006 has finished for PR 15666 at commit 8713be0.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Mar 6, 2017

Test build #74023 has finished for PR 15666 at commit 7a918a6.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • |public class $className implements java.io.Serializable

#' spark.addJar("/path/to/something.jar", TRUE)
#'}
#' @note spark.addJar since 2.2.0
spark.addJar <- function(path, addToCurrentClassLoader = FALSE) {
Contributor

why don't we want to add it to the driver classpath by default?

Member Author

Mostly for backwards compatibility.

test_that("add jar should work and allow usage of the jar on the driver node", {
sparkR.sparkContext()

destDir <- paste0(tempdir(), "/", "testjar")
Contributor

you may want to use https://stat.ethz.ch/R-manual/R-devel/library/base/html/file.path.html instead for Windows compatibility


val uri = if (path.contains("\\")) {
// If we have backslashes here this is a windows path, and URI will throws an exception,
// so we use this constrcutor instead
Contributor

constructor

/**
* Adds a JAR dependency for all tasks to be executed on this `SparkContext` in the future.
* @param path can be either a local file, a file in HDFS (or other Hadoop-supported filesystems),
* an HTTP, HTTPS or FTP URI, or local:/path for a file on every worker node.
Contributor

nit: alignment

@param path can be either...
            blah more blah

#' filesystems), an HTTP, HTTPS or FTP URI, or local:/path for a file on every worker node.
#' If \code{addToCurrentClassLoader} is true, add the jar to the current threads' classloader. In
#' general adding to the current threads' class loader will impact all other application threads
#' unless they have explicitly changed their class loader.
Member

should

add the jar to the current threads' classloader. In
#' general adding to the current threads' class loader will impact all other application threads
#' unless they have explicitly changed their class loader.

be reworded for R or Python? It should be the JVM; threads and classloaders aren't familiar concepts to R users.

}


#' Adds a JAR dependency for all tasks to be executed on this SparkContext in the future.
Member

we don't really expose SparkContext in R actually

Member Author

In that case do we want to bother having this method for R?

spark.addJar <- function(path, addToCurrentClassLoader = FALSE) {
sc <- getSparkContext()
normalizedPath <- suppressWarnings(normalizePath(path))
scala_sc <- callJMethod(sc, "sc")
Member

I'd just combine 342 & 344 and call it sc

sc <- callJMethod(getSparkContext(), "sc")

#' @rdname spark.addJar
#' @param path The path of the jar to be added
#' @param addToCurrentClassLoader Whether to add the jar to the current driver classloader.
#' Default is FALSE.
Member

remove default value since it's in the signature already below

#' spark.addJar("/path/to/something.jar", TRUE)
#'}
#' @note spark.addJar since 2.2.0
spark.addJar <- function(path, addToCurrentClassLoader = FALSE) {
Member

spark.addJar needs to be added to the NAMESPACE file, otherwise it won't be accessible. Tests run within the same namespace, so they are always able to access private methods.

Member Author

Done

Member

@HyukjinKwon HyukjinKwon left a comment

I am sorry that I could not check this over the past weekend. Honestly, after I updated my macOS, somehow all the VMs broke and I have to rebuild the env... so I ran some tests with AppVeyor. I will update you when I have finished this.

sparkR.sparkContext()

destDir <- paste0(tempdir(), "/", "testjar")
jarName <- callJStatic("org.apache.spark.TestUtils", "createDummyJar",
Member

@HyukjinKwon HyukjinKwon Mar 13, 2017

I checked what it got right after the createDummyJar call. It was C:\Users\appveyor\AppData\Local\Temp\1\Rtmpg5tOYo/testjar

sc <- getSparkContext()
normalizedPath <- suppressWarnings(normalizePath(path))
scala_sc <- callJMethod(sc, "sc")
invisible(callJMethod(scala_sc, "addJar", normalizedPath, addToCurrentClassLoader))
Member

Right after this call, the path normalizedPath was C:\projects\spark\R\lib\SparkR\tests\testthat\file:\C:\Users\appveyor\AppData\Local\Temp\1\Rtmpg5tOYo\testjar\sparkrTests-DummyClassForAddJarTest-1489422615481.jar

Member

(C:\projects\spark\R\lib\SparkR\tests\testthat\ was indeed printed together in the logs..).

Member Author

Why is normalizePath doing that to the URL?

Member

it doesn't handle the file:\C: prefix - where is that coming from?

@SparkQA

SparkQA commented Mar 18, 2017

Test build #74781 has finished for PR 15666 at commit 8f956c5.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Mar 18, 2017

Test build #74782 has finished for PR 15666 at commit e1a59c9.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jun 17, 2017

Test build #78211 has finished for PR 15666 at commit d76880b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@mariusvniekerk
Member Author

@HyukjinKwon Any hints on what's needed to get the R stuff passing? I don't really have a Windows testbed that I can use.

@jiangxb1987
Contributor

Should we continue with this? @HyukjinKwon

@HyukjinKwon
Member

Thanks for asking this. I completely forgot this one. Will try to make some time to take a look within a few days.

Member

@HyukjinKwon HyukjinKwon left a comment

Looks okay but I think I need another close look. Just left some comments while figuring out the R test failure on Windows.


destDir <- file.path(tempdir(), "testjar")
jarName <- callJStatic("org.apache.spark.TestUtils", "createDummyJar",
destDir, "sparkrTests", "DummyClassForAddJarTest")
Member

Ah, it looks like the problem is here. createDummyJar returns a URI string; for example,

> normalizePath("file:/C:/a/b/c")
[1] "C:\\Users\\IEUser\\workspace\\spark\\file:\\C:\\a\\b\\c"
Warning message:
In normalizePath(path.expand(path), winslash, mustWork) :
  path[1]="file:/C:/a/b/c": The filename, directory name, or volume label syntax
 is incorrect

This ends up with a weird path like "C:\\Users\\IEUser\\workspace\\spark\\file:\\C:\\a\\b\\c".

}



Member

little nit: I guess we just need a single newline.

test_that("add jar should work and allow usage of the jar on the driver node", {
sparkR.sparkContext()

destDir <- file.path(tempdir(), "testjar")
Member

@HyukjinKwon HyukjinKwon Oct 5, 2017

I'd remove this tempdir() btw (I mean removing the actual directory after this test).


destDir <- file.path(tempdir(), "testjar")
jarName <- callJStatic("org.apache.spark.TestUtils", "createDummyJar",
destDir, "sparkrTests", "DummyClassForAddJarTest")
Member

little nit:

jarName <- callJStatic("org.apache.spark.TestUtils", "createDummyJar",
                        destDir, "sparkrTests", "DummyClassForAddJarTest")

// For local paths with backslashes on Windows, URI throws an exception
new File(path).toURI
} else {
new URI(path)
Member

Could we maybe just use Utils.resolveURI(path)? Looks like the logic is similar.
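
A sketch of that simplification, assuming Utils.resolveURI keeps its behavior of falling back to new File(path).getAbsoluteFile.toURI when the string cannot be parsed as a URI (which is the case for backslashed Windows paths):

// Utils.resolveURI handles both proper URIs and bare local paths,
// so the manual backslash branch could be dropped.
val uri = Utils.resolveURI(path)
cl.addURL(uri.toURL)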

new URI(path)
}
cl.addURL(uri.toURL)
case _ => logWarning(s"Unsupported cl $currentCL will not update jars thread cl")
Member

I'd say class loader instead of cl.


Adds a JAR dependency for all tasks to be executed on this SparkContext in the future.
The `path` passed can be either a local file, a file in HDFS (or other Hadoop-supported
filesystems), an HTTP, HTTPS or FTP URI, or local:/path for a file on every worker node.
If addToCurrentClassLoader is true, add the jar to the current threads' classloader.
Member

little nit: addToCurrentClassLoader -> `addToCurrentClassLoader` (and the same for threads' classloader).

# We shouldn't be able to load anything from the package before it is added
self.assertFalse(isinstance(jvm.pysparktests.DummyClass, JavaClass))
# Generate and compile the test jar
destDir = os.path.join(SPARK_HOME, "python/test_support/jar")
Member

I'd remove this directory too.

Contributor

Instead, do you want to use a temp directory?

@HyukjinKwon
Member

I am sorry for this delay, @mariusvniekerk. Would you have some time to proceed with this one, please?

@AmplabJenkins

Can one of the admins verify this patch?

@jiangxb1987
Contributor

ping @mariusvniekerk

@mariusvniekerk
Member Author

This was superseded by #19643.
