[JENKINS-39835] - Be extra defensive about Errors and Exceptions #133

jtnord · 2016-11-17T19:03:43Z

JENKINS-39835 Be even more defensive against leaving connections dangling.

NB: no testing performed of this fix - it is a direct port from the internal code that was submitted to make the JNLP v4 protocol.
@reviewbybees esp @stephenc @oleg-nenashev

[JENKINS-39835] Be even more defensive against leaving connections dangling.

ghost · 2016-11-17T19:59:18Z

This pull request originates from a CloudBees employee. At CloudBees, we require that all pull requests be reviewed by other CloudBees employees before we seek to have the change accepted. If you want to learn more about our process please see this explanation.

stephenc · 2016-11-17T22:02:43Z

🐝

oleg-nenashev · 2016-11-17T23:47:09Z

src/main/java/org/jenkinsci/remoting/protocol/impl/NIONetworkLayer.java

+ } finally {
+ // incase this was an OOMErr and logging caused another OOMErr
+ recvKey.cancel();
+ onRecvClosed();


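oleg-nenashev (Nov 17, 2016): I suppose this code should rethrow errors afterwards.

amuniz (Nov 18, 2016): It was not doing it before the change.

oleg-nenashev (Nov 18, 2016): It was not catching Errors before the change.

jtnord (Nov 18, 2016): If it does that, you could kill the thread or leave the system in a bad state.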
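The two positions in this thread can be put side by side in a small sketch. It is not the code from NIONetworkLayer.java: the `Cleanup` interface is a hypothetical stand-in for `recvKey.cancel()` followed by `onRecvClosed()`, and a plain list stands in for the logger. Both variants run the cleanup in a `finally` so that it happens even if logging itself fails; they differ only in whether the Throwable keeps propagating afterwards.

```java
import java.util.List;

final class ErrorHandlingSketch {
    /** Hypothetical stand-in for recvKey.cancel() followed by onRecvClosed(). */
    interface Cleanup {
        void run();
    }

    /** Suppressing variant: log, clean up, and let the calling thread carry on. */
    static void suppress(Runnable body, Cleanup cleanup, List<String> log) {
        try {
            body.run();
        } catch (Throwable t) {
            try {
                log.add("Uncaught " + t.getClass().getSimpleName());
            } finally {
                // Runs even if the logging line above throws as well.
                cleanup.run();
            }
        }
    }

    /** Rethrowing variant: same cleanup, then the Throwable continues up the stack. */
    static void rethrow(Runnable body, Cleanup cleanup, List<String> log) {
        try {
            body.run();
        } catch (Throwable t) {
            try {
                log.add("Uncaught " + t.getClass().getSimpleName());
            } finally {
                cleanup.run();
            }
            // Precise rethrow: body.run() can only throw unchecked Throwables,
            // so no "throws Throwable" clause is needed on this method.
            throw t;
        }
    }
}
```

With `suppress`, the caller never sees the failure; with `rethrow`, whatever runs the handler (for example a pool worker thread) has to cope with it.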

amuniz · 2016-11-18T10:02:22Z

🐝

oleg-nenashev

I do not feel suppressing Errors is a good idea here. After closing the connection, we should try to rethrow it. 🐛

oleg-nenashev · 2016-11-18T12:31:26Z

@jtnord Do you think we need a similar fix in stable-2.x?

jtnord · 2016-11-18T14:19:01Z

@oleg-nenashev if you rethrow the error, it will likely kill the thread, and you may not end up in the state you think you want to end up in.

I'm not comfortable re-throwing Errors, so if you would like that then I will leave it up to yourself, cloudbees oss-team or someone else to provide.

jtnord · 2016-11-18T14:40:11Z

Do you think we need a similar fix in stable-2.x?

This does not contain the JNLP v4 protocol (and hence this code), does it?

stephenc · 2016-11-19T08:36:10Z

W.r.t. killing the thread: thinking about this more... You would only ever kill a worker thread, which should be OK, as they are just assigned to this operation and the thread pool will replace them if they die.
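The replacement behaviour described above can be checked with a plain `java.util.concurrent` pool. This is only a sketch under the assumption that tasks are handed over with `execute`; it does not use the remoting library's own executor, and the class and method names are made up for illustration. A task that throws kills its worker, and the pool starts a new worker for the next task.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class WorkerReplacementSketch {
    /** Returns {name of the worker that died, name of the worker that ran the next task}. */
    static String[] run() {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        try {
            CompletableFuture<String> first = new CompletableFuture<>();
            CompletableFuture<String> second = new CompletableFuture<>();
            pool.execute(() -> {
                first.complete(Thread.currentThread().getName());
                // Escapes the task, so this worker thread terminates.
                throw new Error("simulated fatal error in a read callback");
            });
            // Queued behind the failing task; runs on the replacement worker.
            pool.execute(() -> second.complete(Thread.currentThread().getName()));
            return new String[] {
                first.get(5, TimeUnit.SECONDS),
                second.get(5, TimeUnit.SECONDS)
            };
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
        }
    }
}
```

Note that this applies to `execute`. With `submit`, the Throwable is captured in the returned `Future` and the worker thread survives. The sketch also says nothing about per-connection state, which the dead thread never got to restore.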

oleg-nenashev · 2016-12-16T22:41:04Z

@jtnord @stephenc Should we proceed with this PR?

jtnord · 2016-12-16T22:46:51Z

Yes, this is 100% needed. I feel this is a regression in remoting, and the new LTS will be picking it up, so we are likely to see more reports of this.

Basically, OP_READ is no longer registered while a read is being handled and is only added back later on (when all data has been read), so a failure in between leaves the network connection hanging. This has been observed in the wild on the pre-OSS version of the code.

oleg-nenashev · 2016-12-16T23:57:01Z

I still do not like the handling of some fatal errors, but I agree to merge it taking the potential impact into account.

oleg-nenashev · 2016-12-16T23:57:52Z

BTW I also think that killing the worker thread is fine

jtnord · 2016-12-19T18:51:42Z

@oleg-nenashev if the worker thread dies handling a read, then whatever is connected is dead as far as OP_READ is concerned, because OP_READ is removed when we start the thread and only re-added when we have read all of the data available in the callback (i.e. in the thread that dies).

See line 147 in this block

Did you see my proposal on the Jenkins-dev list for handling uncaught exceptions :-)
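The OP_READ point can be reproduced with plain `java.nio`, without any remoting code. The sketch below is an assumption-laden illustration: `naiveHandle` is a hypothetical handler that drops OP_READ, runs a callback, and restores OP_READ only on the success path, and a `Pipe` stands in for the socket. When the callback throws, the key stays valid but is no longer interested in reads, so a selector would never report it again.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;

final class OpReadHangSketch {
    /** Hypothetical naive handler: restores OP_READ only if the callback returns normally. */
    static void naiveHandle(SelectionKey key, Runnable callback) {
        key.interestOps(key.interestOps() & ~SelectionKey.OP_READ);
        callback.run(); // if this throws, the next line never runs
        key.interestOps(key.interestOps() | SelectionKey.OP_READ);
    }

    /** Returns true if the key ends up valid but no longer interested in reads. */
    static boolean leftHanging(Runnable callback) {
        try (Selector selector = Selector.open()) {
            Pipe pipe = Pipe.open();
            try {
                pipe.source().configureBlocking(false);
                SelectionKey key = pipe.source().register(selector, SelectionKey.OP_READ);
                try {
                    naiveHandle(key, callback);
                } catch (Throwable t) {
                    // In a real worker this is where the thread would die.
                }
                return key.isValid() && (key.interestOps() & SelectionKey.OP_READ) == 0;
            } finally {
                pipe.source().close();
                pipe.sink().close();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling `leftHanging` with a throwing callback returns true, which is the dangling state described above; with a callback that returns normally it returns false.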

oleg-nenashev · 2016-12-19T19:04:12Z

@jtnord

Did you see my proposal on the Jenkins-dev list for handling uncaught exceptions

Show me the link

jglick · 2016-12-19T19:22:53Z

src/main/java/org/jenkinsci/remoting/protocol/impl/NIONetworkLayer.java

+ LogRecord record = new LogRecord(Level.SEVERE, "[{0}] Uncaught {1}");
+ record.setThrown(t);
+ record.setParameters(new Object[]{stack().name(), t.getClass().getSimpleName()});
+ }


The whole block could be written more simply as

LOGGER.log(Level.SEVERE, "Uncaught error in " + stack().name(), t);

without any real loss.
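To compare the two logging forms, the following sketch sends both through `java.util.logging` and captures the resulting records. The logger name and the `stackName` parameter (standing in for `stack().name()`) are assumptions for illustration. Both records carry the same level and the same Throwable; they differ only in how the message text is assembled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

final class LoggingFormsSketch {
    /** Logs t in both styles and returns the two captured records, in order. */
    static List<LogRecord> logBothWays(String stackName, Throwable t) {
        Logger logger = Logger.getLogger("LoggingFormsSketch"); // hypothetical logger name
        logger.setUseParentHandlers(false); // keep the demo output off the console
        List<LogRecord> seen = new ArrayList<>();
        Handler capture = new Handler() {
            @Override public void publish(LogRecord record) { seen.add(record); }
            @Override public void flush() { }
            @Override public void close() { }
        };
        logger.addHandler(capture);
        try {
            // Form used in the diff: message pattern plus parameters.
            LogRecord record = new LogRecord(Level.SEVERE, "[{0}] Uncaught {1}");
            record.setThrown(t);
            record.setParameters(new Object[] {stackName, t.getClass().getSimpleName()});
            logger.log(record);

            // Suggested one-liner: plain string concatenation.
            logger.log(Level.SEVERE, "Uncaught error in " + stackName, t);
        } finally {
            logger.removeHandler(capture);
        }
        return seen;
    }
}
```

The parameterised form defers message formatting to the handler, while the one-liner builds the string up front; either way the stack trace travels with the record via `getThrown()`.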

jglick · 2016-12-19T19:24:22Z

src/main/java/org/jenkinsci/remoting/protocol/impl/NIONetworkLayer.java

+ record.setParameters(new Object[]{stack().name(), t.getClass().getSimpleName()});
+ }
+ } finally {
+ // incase this was an OOMErr and logging caused another OOMErr


Easier and probably safer to set a boolean flag at the top of the method indicating whether it completed normally; wrap everything in a try-finally block that closes the channel if not.
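The flag-based alternative suggested here might look roughly like the sketch below. It is a generic illustration, not a patch for NIONetworkLayer.java: `runGuarded` and its parameters are hypothetical, and a `Closeable` stands in for the channel. There is no `catch (Throwable)` at all; the `finally` block closes the channel whenever the body did not reach the end, and the original Throwable keeps propagating.

```java
import java.io.Closeable;
import java.io.IOException;

final class CompletedFlagSketch {
    /** Runs body; closes channel if body exits in any way other than returning normally. */
    static void runGuarded(Runnable body, Closeable channel) {
        boolean completed = false;
        try {
            body.run();
            completed = true; // only reached on normal completion
        } finally {
            if (!completed) {
                try {
                    channel.close();
                } catch (IOException | RuntimeException ignored) {
                    // Do not let a close failure mask the original Throwable.
                }
            }
        }
    }
}
```

A consequence of this shape is that nothing is logged or suppressed inside the method, so the decision about what to do with the Throwable moves to the caller.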

oleg-nenashev · 2016-12-19T21:23:38Z

OK, let's go forward with this PR as is.
@jtnord, my plan is to have remoting-3.2.1 with this fix only if @olivergondza agrees. Actually, #138 may also deserve backporting due to the quite wide exposure of the issue. @olivergondza WDYT?

olivergondza · 2016-12-22T11:06:03Z

I am OK with the code as it is; logging the errors is fine. It seems reasonably safe to have it backported into .2.

oleg-nenashev · 2016-12-22T11:21:05Z

OK, going forward with the merge. Will be in this weekly if @kohsuke is available

Be extra defensive about Errors and Exceptions

ec9b5c1

[JENKINS-39835] Be even more defensive then against leaving connections dangling.

oleg-nenashev reviewed Nov 17, 2016

oleg-nenashev requested changes Nov 18, 2016

oleg-nenashev changed the title ~~Be extra defensive about Errors and Exceptions~~ [JENKINS-39835] - Be extra defensive about Errors and Exceptions Nov 18, 2016

oleg-nenashev added needs-fix needs-review labels Nov 18, 2016

oleg-nenashev approved these changes Dec 16, 2016

jglick approved these changes Dec 19, 2016

oleg-nenashev added backporting-candidate and removed needs-fix needs-review labels Dec 22, 2016

oleg-nenashev merged commit 32674f6 into master Dec 22, 2016

oleg-nenashev mentioned this pull request Dec 25, 2016

[FIXED JENKINS-39835] - Update remoting to 3.4 jenkinsci/jenkins#2679

Merged

oleg-nenashev added a commit that referenced this pull request Dec 25, 2016

Changelog: Noting 3.4 and #133

17d139f

jeffret-b deleted the jtnord-patch-1 branch January 28, 2020 19:14
