Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wait for SSH Connectivity before delegating to SSH Launcher. #150

Closed
wants to merge 2 commits into from

Conversation

mvtorres
Copy link

No description provided.

@@ -57,8 +56,6 @@ private static SSHLauncher getSSHLauncher(InspectContainerResponse detail, Docke
String host = hostUrl.getHost();

LOGGER.log(Level.INFO, "Creating slave SSH launcher for " + host + ":" + port);

PortUtils.waitForPort(host, port);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not clear why this is any better than what's already there :
PortUtils.waitForPort is already attempting to wait for the remote SSH to be available before continuing ?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Our jenkins server and docker server is on the same box. As soon as the docker container is started, the container ssh port forward on 127.0.0.1, and the port, is already up. But the SSH daemon may not have finished starting up. When I tested this scenario, PortUtils.waitForPort does not block, since the port is already up. It only does a port is open check. The new change tries to do an ssh connection, which fails if the SSH daemon is not ready.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also had issues that jenkins slave failed to connect to ssh. I think it was because of ssh wasn't started. Then slave and image waited timeouts.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So did you get the slave node showing as offline/suspended, and when you look at the slave log, you get a connection terminated on Ssh Slave plugin 1.6, or IllegalStateException on Ssh Slave plugin 1.9?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had no chance to collect info, but sounds like your described. I also tried press manually launch slave and it connected successfully, but jobs and slave state wasn't provisioned further.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMHO ssh connection itself should be more responsive and should have timeout and retry that we should be able to handle

@magnayn
Copy link
Contributor

magnayn commented Feb 2, 2015

It's true that SSH seems to be a PITA!

I think that, by rights, SSHLauncher ought to be dealing with waiting for SSH ports to become available / responsive -- it isn't currently but might one day as I can't believe docker-cloud is the only plugin to have this as an issue.

I think it'd be the best way to 'hide' the waitForPort (which seems to work for many people) and 'ssh actually responding' bits in a DelegatingComputerLauncher (maybe rename RetryingComputerLauncher to something like WaitForSSHComputerLauncher)

@KostyaSha
Copy link
Member

As temp solution i thought about some delay before opening connection after port becomes available. But i think somebody should check with ssh-slaves devs how we can handle this retries or why they doesn't work now.

@mvtorres
Copy link
Author

mvtorres commented Feb 2, 2015

I did explore upgrading to the latest ssh-slave and delegating the retry logic to it. See #149.

However, it didn't fix the issue I was experiencing because of

https://github.com/jenkinsci/ssh-slaves-plugin/blob/master/src/main/java/hudson/plugins/sshslaves/SSHLauncher.java#L1154

it only retries if the ssh port is closed.

@mvtorres
Copy link
Author

mvtorres commented Feb 2, 2015

I checked in the refactored version.

@mvtorres
Copy link
Author

Closing...The fix for this issue was incorporated into #230

@mvtorres mvtorres closed this Jun 23, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants