Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove adam-cli jar from classpath during adam-submit #749

Merged
merged 1 commit into from
Aug 7, 2015

Conversation

fnothaft
Copy link
Member

The ./bin/compute-adam-jars.sh script does not remote the adam-cli JAR from the classpath. This exposes SPARK-1921 when running on YARN.

@@ -24,6 +24,6 @@ CLASSPATH=$("$ADAM_REPO"/bin/compute-adam-classpath.sh)

# list of jars to ship with spark; trim off the first and last from the CLASSPATH
# TODO: brittle? assumes appassembler always puts the $BASE/etc first and the CLI jar last
ADAM_JARS=$(echo "$CLASSPATH" | tr ":" "," | cut -d "," -f 2-)
ADAM_JARS=$(echo "$CLASSPATH" | tr ":" "\n" | tail -n +2 | head -n -1 | perl -pe 's/\n/,/ unless eof')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i can't make heads or tails of this statement! rimshot

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Up top! makes high five gesture

@ryan-williams
Copy link
Member

So this undoes the "fix" #669 of #663. Can you say more about how this exposes SPARK-1921 @fnothaft?

@fnothaft
Copy link
Member Author

fnothaft commented Aug 3, 2015

Ah! Good catch @ryan-williams. You all are running Spark-on-YARN on a CDH5 setup at Mt. Sinai, right? Perhaps the behavior is different between our clusters (which wouldn't be surprising, I think the bug is caused by a race condition), but we are in the middle of bringing up a CDH5 cluster at Berkeley and saw the exact behavior described in SPARK-1921 (JAR timestamps differ). By removing the adam-cli jar from the JARS list when calling spark-submit, you don't copy the JAR twice, and thus don't have the timestamp race. It should be easy to make this work for both adam-submit and adam-shell; I'll fix this up.

@fnothaft
Copy link
Member Author

fnothaft commented Aug 3, 2015

@ryan-williams just pushed a new commit to address your comment. Doing things correctly actually makes the adam-submit code a bit cleaner, for certain definitions of clean (specifically, we get rid of the brittle way of determining the CLI jar). Let me know what you think, and I'll squash down and rebase.


ADAM_JARS=$(echo "$ADAM_JARS" | tail -n +1 | perl -pe 's/\n/,/ unless eof')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hm, tail -n +1 doesn't do anything, does it? do you mean tail -n +2 to skip the first line?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Er, fun times. You are correct. This should've been head -n -1 (working too late in the night...) but I swapped that because BSD head doesn't support the -n -1 flag. Let me find a good, portable fix and I'll ping back.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if we have compute-adam-jars.sh take one cmdline flag, that tells it to strip the CLI jar or not, and then have it do the perl -pe 's/\n/,/ unless eof', since that's what both callers want?

Then compute-adam-jars.sh could do some more structured things to find and remove the CLI jar, and we'd expose a clean interface to that to the two caller scripts.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

e.g. put a grep -v <some unique adam cli jar identifier> in compute-adam-jars.sh

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Conceptually, I agree with what you are saying. In practice though, I actually think I like this approach a bit better, since it allows us to remove the fragile CLI jar finding code and fall back on the appassembler classpath instead.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, I was missing that you had that grep method in this file previously; it's certainly nice to see it deleted here.

I think there are two orthogonal questions:

  1. find the CLI jar via the assumption that appassembler puts it last on the classpath vs. by greping.
  2. do the above in adam-submit vs. in compute-adam-jars.sh.

I'm ok with either approach to 1), but am somewhat interested in moving 2) to compute-adam-jars.sh, that way adam-submit and adam-shell both save a | perl -pe 's/\n/,/ unless eof', and we can create a cleaner interface to the | head -n -1 that is currently here in adam-submit.

Anyways, the way you have it wfm too, just articulating the options I see.

@ryan-williams
Copy link
Member

this lgtm btw, unless I've persuaded you here to push CLI-jar detection and removal into compute-adam-jars.sh

@massie
Copy link
Member

massie commented Aug 4, 2015

@fnothaft Do you plan to move cli-jar detection tin compute-adam-jars.sh or is this ready to be merged?

@ryan-williams If this is ready to be merged, feel free to hit the button and merge it.

@fnothaft
Copy link
Member Author

fnothaft commented Aug 4, 2015

This isn't ready to merge yet; I still need to make fixes.

@fnothaft fnothaft force-pushed the fix-compute-jars branch 2 times, most recently from fb37213 to 5787a80 Compare August 5, 2015 01:59
@fnothaft
Copy link
Member Author

fnothaft commented Aug 5, 2015

@ryan-williams just pushed a new round of fixes. Let me know what you think of these.

@ryan-williams
Copy link
Member

Builds failed with

hudson.plugins.git.GitException: Could not checkout null with start point 49a466a4d424b6acd1730c5daafe12f8e40d26b4

idk what that's about. Looking at the code now

@fnothaft
Copy link
Member Author

fnothaft commented Aug 5, 2015

Jenkins, retest this please.

@ryan-williams there's a bug in the Jenkins Git plugin where fetching force-pushes fails occasionally.

@@ -22,40 +22,15 @@ SCRIPT_DIR="$(cd `dirname $0`/..; pwd)"

# Get list of required jars for ADAM
ADAM_JARS=$("$SCRIPT_DIR"/bin/compute-adam-jars.sh)
ADAM_CLI_JAR=$(echo "$ADAM_JARS" | rev | cut -d',' -f1 | rev )
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

huh, isn't $ADAM_JARS already one comma-delimited line at this point? What's the point of the two revs if so?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I am not so great at cut, but it isn't clear to me that there is an easy way to take just the last item from a comma separated list using cut. However, you can fake this by reversing the list, taking the first item, and reversing the list again.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Blindly copied from http://stackoverflow.com/questions/22727107/how-to-find-the-last-field-using-cut-linux

data=foo,bar,baz,qux
last=${data##*,}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

huh, I read man rev as reversing the order of lines in a file, not reversing each line's contents, my mistake @fnothaft!

@heuermh's trick above seems maybe a little cleaner, but I'm agnostic at this point and have held up this PR long enough so whatever you want to do sgtm @fnothaft!

@fnothaft
Copy link
Member Author

fnothaft commented Aug 7, 2015

Just pushed a last commit with Michael's suggested change, and with all things squashed down and rebased on master. This should be good to merge.

@ryan-williams
Copy link
Member

lgtm

@massie
Copy link
Member

massie commented Aug 7, 2015

+1

(Feel free to merge @ryan-williams)

@heuermh
Copy link
Member

heuermh commented Aug 7, 2015

the latest diff looks like it adds back the $ADDL_JARS stuff -- was this intentional?
https://github.com/fnothaft/adam/blob/fix-compute-jars/bin/adam-submit#L54

@fnothaft
Copy link
Member Author

fnothaft commented Aug 7, 2015

@heuermh thanks for the catch; not intentional, rather this comes from a rebase gone wrong... I will fix.

@fnothaft
Copy link
Member Author

fnothaft commented Aug 7, 2015

Fixed. Sorry; the rebase on #754 was kind of nasty; lots of messy conflicts.

@ryan-williams
Copy link
Member

want to retest this @fnothaft ? looks like git force-push problem again

@ryan-williams
Copy link
Member

good catch @heuermh btw! I looked at it and just assumed it was intentional, woops

@fnothaft
Copy link
Member Author

fnothaft commented Aug 7, 2015

Jenkins, test this please.

@fnothaft
Copy link
Member Author

fnothaft commented Aug 7, 2015

This is passing now; can I get a merge? I will rebase #757 once this is merged.

ryan-williams added a commit that referenced this pull request Aug 7, 2015
Remove adam-cli jar from classpath during adam-submit
@ryan-williams ryan-williams merged commit 1cf8ebb into bigdatagenomics:master Aug 7, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants