-
Notifications
You must be signed in to change notification settings - Fork 28.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-2471] remove runtime scope for jets3t #1402
Conversation
QA tests have started for PR 1402. This patch merges cleanly. |
It's set that way just because it is only used by Hadoop's FileSystem to access S3. Code shouldn't call it directly. Maven should therefore include it in the runtime classpath and assembled jars, but not the compile classpath. I seem to distantly recall that sbt doesn't quite do this. I hope there is an equivalent for sbt as that's the right thing to do. It's not the end of the world to make jets3t appear on the compile classpath. (So I assume the sbt assembly still has to be supported? the Maven one should be OK in this regard) |
I see. sbt-assembly doesn't pick up runtime only jars by default or maybe it doesn't read pom correctly. @ScrapCodes Do you know whether we can tell sbt-assembly to include runtime jars? It is weird that it doesn't do that by default or maybe it is the sbt-pom-reader's problem. |
Hi, I just checked it is correctly included by doing Prashant Sharma On Mon, Jul 14, 2014 at 2:35 PM, Xiangrui Meng notifications@github.com
|
It is actually intended that way, assembly is not intended as a replacement for runtime classpath. One way to go about is to not have it as runtime dependency. |
What is the assembly then? I always took this term to mean "all of the runtime dependencies together". How would I make a runnable JAR in SBT in general? |
assembly should just include compile scope. Take for example the case of slf4j where end user can bring in any of the log4j or logback. So they are supposed to be at runtime classpath. And putting them in assembly jar will kill the purpose of having this pluggability at runtime. |
The SLF4J binding is a runtime dependency, and it may be one that a down-stream consumer wants to override. But leaving it out entirely yields no runtime binding at all -- no logging. There is an argument for not including any SLF4J runtime dependency in Spark at all, and which requires users to add their own to get any log messages. Debatable -- but a different question. SLF4J is a bit of a funny example though. What about other runtime dependencies? jets3t is a good one. It has to be on the runtime classpath, so needs to go into a runtime assembly JAR. It does not need to appear on the compile classpath, and shouldn't ideally. This is not the same as "provided" scope, where the user is expected to bring the artifact from the local environment. What's the standard SBT magic for this? |
QA results for PR 1402: |
@ScrapCodes I agree with @srowen that the assembly jar should include runtime libraries except those marked "provided". If we cannot find a way to let sbt understand maven runtime scope, we should merge this change because without jets3t all S3 operations are broken. |
LGTM for now - I opened a separate JIRA to improve the sbt plug-in to fix this. |
The assembly jar (built by sbt) doesn't include jets3t if we set it to runtime only, but I don't know whether it was set this way for a particular reason. CC: srowen ScrapCodes Author: Xiangrui Meng <meng@databricks.com> Closes apache#1402 from mengxr/jets3t and squashes the following commits: bfa2d17 [Xiangrui Meng] remove runtime scope for jets3t
The assembly jar (built by sbt) doesn't include jets3t if we set it to runtime only, but I don't know whether it was set this way for a particular reason.
CC: @srowen @ScrapCodes