-
Notifications
You must be signed in to change notification settings - Fork 96
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[JENKINS-52165] Calling Controller.watch API #63
Conversation
By the way jenkinsci/support-core-plugin#128 should make it easier to see the impact of this on actual Remoting usage. Have not yet tried it myself. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Merge conflict, but the code itself LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
merge conflict
Approved once the upstream PRs are integrated && the conflict is fixed |
This looks good to go if merge conflicts fixed? ping @jglick. |
@vivek The upstream isn't really ready-to-go yet, and the whole thing needs some sort of proper and thorough testing in a simulated environment. Oleg's review found some issues already. We could of course YOLO-merge all this but I make no promises on the fallout if so. |
Well, the upstream “issues” are just things like Javadoc that should be improved, not user-visible problems, but yes this all deserves testing in a realistic environment. Once @svanoort is ready to start that I will polish up all the PRs. |
|
||
private static final long MIN_RECURRENCE_PERIOD = 250; // ¼s | ||
private static final long MAX_RECURRENCE_PERIOD = 15000; // 15s | ||
private static final float RECURRENCE_PERIOD_BACKOFF = 1.2f; | ||
private static final long WATCHING_RECURRENCE_PERIOD = Main.isUnitTest ? /* 5s */5_000: /* 5m */300_000; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should probably be switched to nonfinal
constant initialized via system property and overridden in the relevant test only.
this.remote = ws.getRemote(); | ||
setupTimer(); | ||
try { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should introduce a system property to act as a kill switch and fall back to polling.
…which do not seem to have closed or broken the channel.
/** If set to false, disables {@link Execution#watching} mode. */ | ||
@SuppressWarnings("FieldMayBeFinal") | ||
private static boolean USE_WATCHING = !"false".equals(System.getProperty(DurableTaskStep.class.getName() + ".USE_WATCHING")); | ||
|
||
/** | ||
* Represents one task that is believed to still be running. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will flesh this out with an explanation of the expected state changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(done)
os = ps; | ||
} | ||
try { | ||
IOUtils.copy(stream, os); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like all PrintStream#write
overloads synchronize on this
and null-check out
before writing (since it is nulled out once the stream is closed). Do we need to do either of those here as well? This usage should always be from the same thread IIUC, but do other threads have access to listener.getLogger()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, that sounds safer.
…tStream.write would have.
if (ps.getClass() == PrintStream.class) { | ||
// Try to extract the underlying stream, since swallowing exceptions is undesirable and PrintStream.checkError is useless. | ||
OutputStream os = (OutputStream) printStreamDelegate.get(ps); | ||
synchronized (os) { // like PrintStream.write overloads do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you mean synchronized (ps)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed that seems to be the lock, will change.
if (ps.getClass() == PrintStream.class) { | ||
// Try to extract the underlying stream, since swallowing exceptions is undesirable and PrintStream.checkError is useless. | ||
synchronized (ps) { // like PrintStream.write overloads do | ||
IOUtils.copy(stream, (OutputStream) printStreamDelegate.get(ps)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible for the PrintStream to be closed? If so, we'd get an NPE because printStreamDelegate.get(ps)
will be null.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should not be closed. If it is, we expect this method to throw an exception. I can make it throw a specific exception for better diagnostics.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I guess it would be clear to anyone investigating where the NPE came from anyway, so it probably doesn't need a specific exception, but 3f8ced4 won't hurt.
|
||
// ditto | ||
@Override public void problem(Exception x) { | ||
Functions.printStackTrace(x, listener().getLogger()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems to work. Due to an apparent mistake in jenkinsci/pipeline-cloudwatch-logs-plugin#5 affecting agent-side, but not master-side, logging, I saw in the build log:
java.lang.NullPointerException
at io.jenkins.plugins.pipeline_log_fluentd_cloudwatch.FluentdLogger$FluentdOutputStream.eol(FluentdLogger.java:128)
at hudson.console.LineTransformationOutputStream.eol(LineTransformationOutputStream.java:60)
at hudson.console.LineTransformationOutputStream.write(LineTransformationOutputStream.java:56)
at hudson.console.LineTransformationOutputStream.write(LineTransformationOutputStream.java:74)
at java.io.PrintStream.write(PrintStream.java:480)
at org.jenkinsci.plugins.credentialsbinding.impl.BindingStep$Filter$1.eol(BindingStep.java:183)
at hudson.console.LineTransformationOutputStream.eol(LineTransformationOutputStream.java:60)
at hudson.console.LineTransformationOutputStream.write(LineTransformationOutputStream.java:56)
at hudson.console.LineTransformationOutputStream.write(LineTransformationOutputStream.java:74)
at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:1793)
at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:1769)
at org.apache.commons.io.IOUtils.copy(IOUtils.java:1744)
at org.jenkinsci.plugins.workflow.steps.durable_task.DurableTaskStep$HandlerImpl.output(DurableTaskStep.java:571)
at org.jenkinsci.plugins.durabletask.FileMonitoringTask$Watcher.run(FileMonitoringTask.java:477)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
with the build proceeding nonetheless.
private static final Field printStreamDelegate; | ||
static { | ||
try { | ||
printStreamDelegate = FilterOutputStream.class.getDeclaredField("out"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahh, reflection hacks: the universal panacea!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately I could not find a better way to do this.
} catch (Exception x) { | ||
// Try to report it to the master. | ||
try { | ||
execution.problem(x); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I predict we'll be seeing a lot of this particular report when there are network issues. Thankfully it won't be that often due to the between recurrence periods.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess it's as good as it's going to get for now -- happy to see us doing something with errors besides letting the PrintStream swallow it
[JENKINS-52165] Calling Controller.watch API
JENKINS-52165 Split out of #21. Downstream of jenkinsci/durable-task-plugin#60. Subsumes #73. Should improve performance by switching durable task logging and exit status checking from being pull (initiated by master at intervals) to push (initiated by agent at close intervals).
@reviewbybees