Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Jenkins 47225: Fix ProcStarter #236

Merged
merged 5 commits into from
Oct 30, 2017
Merged

Conversation

MattLud
Copy link
Contributor

@MattLud MattLud commented Oct 16, 2017

Addressed to allow fixes for #232

@marvinthepa
Copy link
Contributor

Looks good at first glance.

Could you please try to reproduce the problems that @scoheb had with this, ideally in a test?

I think I reproduced it using withEnv(["a.b=c"]). Checking with sh 'env' on the jnlp slave, the "invalid" environment variable a.b was not exported either. So I guess we can safely ignore those invalid environment variables in the container as well.

@MattLud
Copy link
Contributor Author

MattLud commented Oct 16, 2017 via email

@MattLud
Copy link
Contributor Author

MattLud commented Oct 17, 2017

Looks like the error is getting swallowed up in the k8s client. Looks like the best approach will be to just detect "invalid" environment variables and skip trying to export them.

I'm guessing implement something like this - https://stackoverflow.com/a/2821183

@scoheb
Copy link
Contributor

scoheb commented Oct 17, 2017

I would vote for fixing them rather than skipping them...that way at least users can see what has happened by doing an 'env'...or we log warnings if we skip them.

How about this: https://paste.fedoraproject.org/paste/6vTS7Gv4RD9qBY~VkA8EMg

@MattLud
Copy link
Contributor Author

MattLud commented Oct 17, 2017

Fixing it is fine by me, though the current behavior on a regular executor is to skip that environment variable.

@scoheb
Copy link
Contributor

scoheb commented Oct 17, 2017

@MattLud

In that case, let's go with what the current behavior is!

@marvinthepa
Copy link
Contributor

@MattLud
Copy link
Contributor Author

MattLud commented Oct 24, 2017

Fixed that broken build - Let me know if there's anything else this PR needs!

).getBytes(StandardCharsets.UTF_8)
);
}
//Check that key is bash compliant.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I personally would go "uncle bob style" and extract a method with a nice name instead of commenting:

if (isLegalShellVariableName(entry.getKey()) {
   ...

Comments are very prone to code rot.

@marvinthepa marvinthepa merged commit a2c450e into jenkinsci:master Oct 30, 2017
@marvinthepa
Copy link
Contributor

@MattLud:

There seems to be something wrong with this (which I unfortunately only recognized AFTER I merged this).
Seems like the exit code is not picked up correctly:

Executing shell script inside container [ssh-client] of pod [jenkins-slave-r7hwz-mlgt4]
Executing command: "ssh-agent" 
printf "EXITCODE %3d" $?; exit
SSH_AUTH_SOCK=/tmp/ssh-s0bZApqhjHuY/agent.15; export SSH_AUTH_SOCK;
SSH_AGENT_PID=16; export SSH_AGENT_PID;
echo Agent pid 16;
EXITCODE   0SSH_AUTH_SOCK=/tmp/ssh-s0bZApqhjHuY/agent.15
SSH_AGENT_PID=16
Executing shell script inside container [ssh-client] of pod [jenkins-slave-r7hwz-mlgt4]
Executing command: "ssh-add" "/home/jenkins/workspace/empty@tmp/private_key_5661184397489639826.key" 
printf "EXITCODE %3d" $?; exit
EXITCODE   0EXITCODE   0Identity added: /home/jenkins/workspace/empty@tmp/private_key_5661184397489639826.key (/home/jenkins/workspace/empty@tmp/private_key_5661184397489639826.key)
Identity added: /home/jenkins/workspace/empty@tmp/private_key_5661184397489639826.key (/home/jenkins/workspace/empty@tmp/private_key_5661184397489639826.key)
[Pipeline] // sshagent
[Pipeline] }
[Pipeline] // container
[Pipeline] }
[Pipeline] // stage
[Pipeline] }
[Pipeline] // node
[Pipeline] }
[Pipeline] // podTemplate
[Pipeline] End of Pipeline
ERROR: Failed to run ssh-add
Finished: FAILURE

@MattLud
Copy link
Contributor Author

MattLud commented Oct 30, 2017 via email

@MattLud
Copy link
Contributor Author

MattLud commented Oct 30, 2017

Also, what is the jnlp image/pod template being used?

@marvinthepa
Copy link
Contributor

@MattLud: I used the pipeline and key from the test, but I was also able to reproduce using other keys.

My current suspicion is that the output of "ssh-agent" and the "printf "EXITCODE %3d" $?" (which is introduced by the kubernetes plugin) are interleaved (I don't know why).

I guess this will all change completely with #239

@marvinthepa
Copy link
Contributor

#242 might be related as well

@MattLud
Copy link
Contributor Author

MattLud commented Nov 9, 2017

Yep - you're right that they are interleaved - We finally bumped into the issue related to this and it seems to be race condition sensitive -

For failures, your log will show
EXITCODE 0EXITCODE 0Identity added: .....
but on successful builds it'll be

Identity added: ...
EXITCODE   0EXITCODE   0

@MattLud
Copy link
Contributor Author

MattLud commented Nov 10, 2017

After further testing, including #242, I still could not find a fix.

Adding a sleep 1 to printf \"" + EXIT_COMMAND_TXT + " %3d\" $?; " + EXIT + NEWLINE; seems to have worked around the race condition for now - this was based on 940 test cycles with SSH agent that were originally failing every 2-3 tests.


//check if the cmd is sourced from Jenkins, rather than another plugin; if so, skip cmdEnvs as we are getting other environment variables
for (String cmd : cmdEnvs) {
if (cmd.startsWith(JENKINS_HOME)) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why was this introduced? It is preventing environment variable injecting wrappers to inject env into sh steps.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From #232 - "One issue that cropped up was that when executing sh commands, we would get the jnlp agent's injected environment variables as well, causing obvious problems such as JAVA_HOME being overwritten. The unsatisfying answer was to check for the presence of JENKINS_HOME in the cmdenvs and skip if present."

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aroq
Copy link

aroq commented Dec 13, 2017

Guys, do you have any updates on the exicode issue above? I had to downgrade to the previous version of the plugin because of failed ssh-add.

@balihb
Copy link

balihb commented Dec 13, 2017 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants