Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WARNING: Jenkins update to 2.60.1 requires Java 8 #775

Closed
1 of 4 tasks
gibfahn opened this issue Jun 28, 2017 · 50 comments
Closed
1 of 4 tasks

WARNING: Jenkins update to 2.60.1 requires Java 8 #775

gibfahn opened this issue Jun 28, 2017 · 50 comments

Comments

@gibfahn
Copy link
Member

gibfahn commented Jun 28, 2017

Needs doing:

  • Create a job to run a java -version (and probably a which java/where java) on each machine, so we can see where we're going to have issues. If the job failed on Java <8 that would be ideal
  • Upgrade ansible scripts for machines which need it.
  • Apply scripts to machines and get us ready to upgrade
  • Upgrade and make sure things still work (some machines may not be using the Java in the path for Jenkins connection)

Note for @nodejs/build @nodejs/jenkins-admins considering clicking the shiny red button to update Jenkins

image

In the changelog for 2.60.1 it notes that it requires Java 8 to be installed on all the hosts.

2.60.1 is the first Jenkins LTS release that requires Java 8 to run. If you're using the Maven Project type, please note that it needs to use a JDK capable of running Jenkins, i.e. JDK 8 or up. If you configure an older JDK in a Maven Project, Jenkins will attempt to find a newer JDK and use that automatically. If your SSH Slaves fail to start and you have the plugin install the JRE to run them, make sure to update SSH Slaves Plugin to at least version 1.17 (1.20 recommended).

I'm pretty sure we don't have Java 8 on all our hosts, so updating will stop those hosts being able to connect.

@gibfahn
Copy link
Member Author

gibfahn commented Jun 28, 2017

I guess the plan would be to confirm that all our machines are running Java 8 and then update.

@jbergstroem
Copy link
Member

$ ssh ci java -version              
java version "1.8.0_131"
Java(TM) SE Runtime Environment (build 1.8.0_131-b11)
Java HotSpot(TM) 64-Bit Server VM (build 25.131-b11, mixed mode)

@gibfahn
Copy link
Member Author

gibfahn commented Jun 28, 2017

@jbergstroem sure, but what about the individual hosts? For example I'm pretty sure AIX is using Java 7 (see here).

For background, we upgraded our Jenkins instance, and a bunch of our slaves were suddenly unable to connect, so we had to revert.

@jbergstroem
Copy link
Member

@gibfahn I didn't get that even agents now require java 8. What kind of crazy is this :(

@jbergstroem
Copy link
Member

So, anyone with any ideas on how we move forward here? We are out of LTS meaning any security bugs (expect them) will hit us pretty hard.

@bnoordhuis
Copy link
Member

There is a Java 8 JDK for AIX (link), same for FreeBSD. The only problem might be SmartOS (which frankly I couldn't care less about.)

@geek
Copy link
Member

geek commented Jul 10, 2017

SmartOS has Java 8 support https://gist.github.com/dekobon/305883fb6d776b0c9fc1

@jbergstroem
Copy link
Member

There's a few older systems like centos5 which doesn't pack it either. If this is only a matter of adding more package source repos then it might not be as bad then.

@gibfahn
Copy link
Member Author

gibfahn commented Aug 22, 2017

Discussed in the meeting (minutes: #845), consensus was that we need to deal with this. I'll put some concrete steps in the first comment.

@joaocgreis
Copy link
Member

joaocgreis commented Aug 28, 2017

Here is a first version of the job: https://ci.nodejs.org/view/All/job/check-java-version/2/ (Build WG only) feel free to edit as needed. I left Windows out because all Windows machines already have Java 8.

@rvagg
Copy link
Member

rvagg commented Aug 28, 2017

I improvified it: cross-platform bash execution, use the parent-of-parent to figure out how java is executed (not using which java which may be incorrect). Changing the grep of the version line to be OK with OpenJDK and other types of JRE.

https://ci.nodejs.org/view/All/job/check-java-version/13/ is the latest run. A quick visual inspection suggests that discounting the Pi's, we're more than 1/2 way there.

@gibfahn
Copy link
Member Author

gibfahn commented Aug 28, 2017

Did someone delete the job? I'm not seeing one called Java.

use the parent-of-parent to figure out how java is executed (not using which java which may be incorrect).

I was going to suggest we make grandparent java != which java an error, unless we're aware of some specific exceptions it'd be nice if the default Java (in the path) was always the one used to run Jenkins.

@rvagg
Copy link
Member

rvagg commented Aug 28, 2017

@gibfahn log in to ci.nodejs.org before clicking the link above, I believe it's been made private, which is probably a good idea.

@gibfahn
Copy link
Member Author

gibfahn commented Aug 28, 2017

Okay, updated it to be unstable (yellow) if Java is at least 8 but Jenkins doesn't use the one in the path.

@gibfahn
Copy link
Member Author

gibfahn commented Sep 1, 2017

So I was looking at this and I realised that it doesn't get the full java path in a lot of cases, because it's printing the comm part of the ps field, which is usually just the basename of the command (so it just returns java).

I changed it from:

JAVA_CMD=$(ps -p $HUDSON_PID -o comm=)

to

JAVA_CMD=$(ps -p $HUDSON_PID -o args= | awk '{print $1}')

@gibfahn
Copy link
Member Author

gibfahn commented Sep 4, 2017

Updated the job to handle windows (although how good that windows handling is is questionable, @refack @joaocgreis feel free to double-check/correct).

https://ci.nodejs.org/view/All/job/check-java-version/37/

This should be pretty much accurate.

@refack
Copy link
Contributor

refack commented Sep 4, 2017

https://ci.nodejs.org/view/All/job/check-java-version/37/label=test-rackspace-win2012r2-x64-9/console
Looks sane ✔️

@joaocgreis
Copy link
Member

Looks good. Two notes:

  • I recommend having set -e on all jobs (set -x is already there). When it makes a script fail we usually want it to.
  • We explicitly don't want to run Jenkins as a server on Windows. There were issues with it in the past, though I suspect they might have been fixed by the recent stdio changes.

@gibfahn
Copy link
Member Author

gibfahn commented Sep 6, 2017

I recommend having set -e on all jobs (set -x is already there). When it makes a script fail we usually want it to.

Jenkins runs with -ex by default doesn't it? Seems to looking at how the script gets called in the console output:

[check-java-version] $ /bin/sh -xe /tmp/hudson1653446783271619934.sh

The set -x was originally a set +x, it's just there so I can flip it for debugging.

@gibfahn
Copy link
Member Author

gibfahn commented Sep 6, 2017

We explicitly don't want to run Jenkins as a server on Windows. There were issues with it in the past, though I suspect they might have been fixed by the recent stdio changes.

Okay, good to know. What issues were those? We converted to running as a service and haven't had any issues with it (unless we didn't realise they were related).

Does the jenkins.bat file get run automatically run on reboot? I've actually found the service to be a bit of a pain, it has to be run as an administrator, and it has to be restarted if you change passwords.

@joaocgreis
Copy link
Member

Jenkins runs with -ex by default doesn't it?

True. I tend to forget that because I always add the #!/bin/bash -ex line to make the script run bash and not dash. The current script looks good as is.

Okay, good to know. What issues were those? We converted to running as a service and haven't had any issues with it (unless we didn't realise they were related).

I didn't investigate it at the time, just converted the job to not use a service. I remember something because input/output was not available, thus I suspect it was fixed by nodejs/node#11863 . Good to know it works as a service now, did you somehow try to verify if all tests are still running as expected? We have many tests that skip, some we want them to skip on CI (Windows unsupported, things like that), but I suspect some may start skipping because of changes like that, when they should really run. This makes CI very brittle in my opinion, but is the balance we have to make users able to easily test locally, and I don't have a good idea for fixing it.

The CI machines use autologon and run jenkins.bat on startup. It doesn't work perfectly every time, sometimes autologon fails and Jenkins only starts when we manually login to the machine, but has been good enough.

@gibfahn
Copy link
Member Author

gibfahn commented Sep 6, 2017

True. I tend to forget that because I always add the #!/bin/bash -ex line to make the script run bash and not dash.

Fair enough, I actually removed the shebang because it doesn't work with Git Bash.

Good to know it works as a service now, did you somehow try to verify if all tests are still running as expected?

Internally (IBM) we use it as a service, and we do full CI runs, without ignoring flaky tests. Haven't seen any issues with that. I should check whether things are being skipped though, that's a good point!

@refack
Copy link
Contributor

refack commented Sep 6, 2017

Can we stop calling it Git Bash, it's MSYS Bash. Git for Windows is just the package that deploys it...

@gibfahn
Copy link
Member Author

gibfahn commented Sep 6, 2017

Can we stop calling it Git Bash, it's MSYS Bash. Git for Windows is just the package that deploys it...

image

https://git-for-windows.github.io/

It's the version of bash that comes with Git for Windows. As they changed the name from Msysgit to Git for Windows (see StackOverflow), no-one knows what MSYS Bash is. I prefer Git Bash, everyone knows what that means.

@rvagg
Copy link
Member

rvagg commented Oct 12, 2017

Sorry @mhdawson, I didn't realise you hadn't been added to build/release already! I've done that now and you should be able to grab the nodejs_build_release key from in secrets.

@mhdawson @gdams would it be OK if we go ahead with the Jenkins upgrade and just pull those AIX PPC machines from the test pool for now and add them back in when they're done?

@mhdawson
Copy link
Member

@rvagg I assume we are going to start with the test ci since that one is more broadly accessible right ? If we at least still have AIX in the release CI it will make sure we don't miss any releases, although I'm hoping we'll get the machines updated by tomorrow anyway. So long way of saying I think its reasonable to go ahead with the Jenkins upgrade and pull AIX until we get them updated

@rvagg
Copy link
Member

rvagg commented Oct 12, 2017

Yep, and in the past we've treated test ci as a staging for changes to release ci since it's more critical anyway, we can break test ci and it's annoying to contributors but if we break release ci cause critical problems for users. So we'll see what the fall-out is for an upgrade of test ci first! I'll go ahead and get it done now and pull out AIX from node-test-commit if they don't reconnect (which is what we expect will happen I think?).

@mhdawson
Copy link
Member

I went ahead and put Java 8 on release-osuosl-ubuntu1404-ppc64_be-1 and release-osuosl-ubuntu1404-ppc64_le-1 but I'll wait on switching over /etc/init/jenkins.conf to using it until we've soaked the change a bit more on the test machines.

@rvagg
Copy link
Member

rvagg commented Oct 12, 2017

Upgraded ci.nodejs.org, still waiting for the Pi's to all reconnect but they normally take a while anyway so I'll be patient. test-osuosl-aix61-ppc64_be-1 and test-osuosl-aix61-ppc64_be-2 did not come back, as expected I suppose.

@rvagg
Copy link
Member

rvagg commented Oct 13, 2017

Thanks everyone for contributing to this so far btw, it's been a tough job so far.

@rvagg
Copy link
Member

rvagg commented Oct 13, 2017

@nodejs/jenkins-admins FYI I've had to untick "Build if SCM changes" in the Advanced section of the MultiJob sub-jobs in node-test-commit, node-test-commit-arm-fanned and node-test-commit-windows-fanned, I guess this is new, or the default to ticked is new or something, because it's causing problems with the fanned jobs but also it's just plain annoying when testing if something works. I'm not seeing a good case for having it on in our infra.

@rvagg
Copy link
Member

rvagg commented Oct 13, 2017

Looking good: https://ci.nodejs.org/job/node-test-commit/13093/

still got the arm-fanned ESM addon tests failing, @refack is having a play with trying to fix that. It's a @nodejs/build thing rather than a core thing so if there are more of us that can lend him a hand then he'd probably appreciate it.

@gdams
Copy link
Member

gdams commented Oct 13, 2017

okay I have brought both the AIX machine back online using a temporary version of IBM Java 8 located in /home/iojs/jdk8/

@mhdawson
Copy link
Member

mhdawson commented Oct 13, 2017

There seems be a problem running the test job on AIX, investigating.

@mhdawson
Copy link
Member

Looks like it might be the way the agent was started. Don't think /etc/rc.d/rc2.d/S20jenkins start was used as these were not edited on the 2 AIX machines.

Edited and restarted agent and testing now, seems to be getting further.

@mhdawson
Copy link
Member

Ok, tests jobs on both machines passed so I think we are back to being good.

@gibfahn
Copy link
Member Author

gibfahn commented Oct 13, 2017

Looks like it might be the way the agent was started. Don't think /etc/rc.d/rc2.d/S20jenkins start was used as these were not edited on the 2 AIX machines.

We really should document this somewhere.

@refack
Copy link
Contributor

refack commented Oct 13, 2017

Cross-ref #916

gibfahn added a commit to gibfahn/build that referenced this issue Oct 22, 2017
Would be great to get some of these Tips & Tricks written down.

Refs: nodejs#775 (comment)
gibfahn added a commit to gibfahn/build that referenced this issue Oct 22, 2017
Would be great to get some of these Tips & Tricks written down.

Refs: nodejs#775 (comment)
gibfahn added a commit to gibfahn/build that referenced this issue Oct 22, 2017
Would be great to get some of these Tips & Tricks written down.

Refs: nodejs#775 (comment)
gibfahn added a commit that referenced this issue Nov 3, 2017
Would be great to get some of these Tips & Tricks written down.

Refs: #775 (comment)
@rvagg
Copy link
Member

rvagg commented Nov 13, 2017

Going to close this, there's some ansible work that's left undone from #775 (comment) and the IBM machines but mostly they relate to older distros that we're going to be phasing out soon(ish) and shouldn't need to ansible provision again.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

9 participants