[JEP-210] Log handling rewrite #17

Merged: 43 commits merged into jenkinsci:master on Oct 4, 2018

Conversation

@jglick (Member) commented Sep 23, 2016

JEP-210

  • way to get a log

@reviewbybees esp. @oleg-nenashev

jglick added a commit to jglick/workflow-support-plugin that referenced this pull request Dec 7, 2016
jglick added a commit to jglick/workflow-job-plugin that referenced this pull request Dec 7, 2016
* @param start the start position to begin reading from (normally 0)
* @throws EOFException if the start position is larger than the log size (or you may simply return EOF immediately when read)
*/
public @Nonnull InputStream getLog(long start) throws IOException {
Contributor:

@Beta ?

Member Author:

Yes, ought to be.

Otherwise when resuming a build you can get:
… org.jenkinsci.plugins.workflow.support.pickles.TryRepeatedly$1 run
WARNING: null
java.io.IOException: java.io.IOException: not implemented
	at org.jenkinsci.plugins.workflow.log.BrokenLogStorage.overallListener(BrokenLogStorage.java:52)
	at org.jenkinsci.plugins.workflow.flow.FlowExecutionOwner.getListener(FlowExecutionOwner.java:130)
	at org.jenkinsci.plugins.workflow.support.pickles.TryRepeatedly$1.run(TryRepeatedly.java:96)
	at …
Caused by: java.io.IOException: not implemented
	at org.jenkinsci.plugins.workflow.flow.FlowExecutionOwner$DummyOwner.getExecutable(FlowExecutionOwner.java:166)
	at io.jenkins.plugins.pipeline_log_fluentd_cloudwatch.PipelineBridge.forBuild(PipelineBridge.java:55)
	at org.jenkinsci.plugins.workflow.log.LogStorage.of(LogStorage.java:115)
	... 10 more
…complains about, causing bad display of logs from before the upgrade.
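As a sketch of the @Beta suggestion above, one way the method might be marked, assuming the access-modifier-annotation library's Beta restriction is available (the class name LogStorageSketch is invented purely for illustration, not the committed code):

```java
// Sketch only: LogStorageSketch is an invented name, and the annotations assume
// the access-modifier-annotation library (org.kohsuke.accmod) is on the classpath.
import java.io.IOException;
import java.io.InputStream;
import javax.annotation.Nonnull;
import org.kohsuke.accmod.Restricted;
import org.kohsuke.accmod.restrictions.Beta;

public abstract class LogStorageSketch {
    /** Marked beta so tooling flags external callers while the signature may still change. */
    @Restricted(Beta.class)
    public abstract @Nonnull InputStream getLog(long start) throws IOException;
}
```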
lastLine = line;
}
}
if (lastLine != null) {
Member:

What if the content did not finish writing and the "lastId" has been truncated? (i.e. the line has only been partially written)

Member Author:

Then the line → step mapping for this build will not be perfect. For example, in the classic UI some text would incorrectly not be associated with a step block and would just be shown as part of the general output. Too minor to worry about.

Member:

Sounds like a bug to me, but I'll let you fix it when the first user reports it.

Member Author:

If log-index has been corrupted for some reason, then by definition some aspects of the line → step mapping will be incorrect. The only question is how much of an effort we make to reconstruct some of it.

// TODO can probably be done a bit more efficiently with FileChannel methods
byte[] data = new byte[(int) (lastTransition - pos)];
raf.readFully(data);
buf.write(data);
Member:

🐜 Could we not write this a chunk at a time or stream it? This is going to generate a lot of GC pressure due to excess array allocation.

Member Author:

Sort of beside the point when the whole buf is memory-allocated, which is hard to get around in current core versions since the only [Annotated]LargeText constructors take a file (useless) or a memory buffer. To write this properly, we would likely need the Source and Session interfaces to be made public.

Member:

Answering my own question after digging in a bit more: no, because our ByteBuffer cannot be pre-sized for our content since its ensureCapacity method is private.

For those who can't follow the code: essentially what we're doing is allocating an array containing all the log content and then copying it into the Jenkins-internal ByteBuffer implementation (org.kohsuke.stapler.framework.io.ByteBuffer), which dynamically resizes to fit the content. Essentially we're double-allocating.

However, the ByteBuffer resizes either to double its capacity or to fit the content (whichever is larger), which means that if we copied a chunk at a time we'd likely trigger multiple backing-array resize operations for the ByteBuffer.

Kind of a wash in that case, without a way to more effectively size the buffer, which would require upstream changes to Stapler (which can be done later).
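For illustration, the chunk-at-a-time variant under discussion might look roughly like this (raf and buf as in the excerpt above); it is a sketch rather than the committed code, and as noted it still feeds the same dynamically resizing ByteBuffer:

```java
// Sketch only, not the committed code: copy [pos, lastTransition) from the
// RandomAccessFile in fixed-size chunks instead of one full-size array.
byte[] chunk = new byte[8192];
long remaining = lastTransition - pos;
while (remaining > 0) {
    int n = raf.read(chunk, 0, (int) Math.min(chunk.length, remaining));
    if (n < 0) {
        break; // unexpected EOF: the index pointed past the end of the log file
    }
    buf.write(chunk, 0, n);
    remaining -= n;
}
```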

Member:

AFAICT the FileChannel comment -- which is another clever option -- won't work because our ByteBuffer (drumroll please) does not extend java.nio.ByteBuffer... instead it extends OutputStream 😆

So... I smell some room for core improvements in LargeText/AnnotatedLargeText that reduce our tendency to throw around (needless) byte array allocations. Might help with some of the humongous object allocations I've seen spiking our GC logs.

Member Author:

> I smell some room for core improvements in LargeText/AnnotatedLargeText

Optimizing buffer allocations would be nice, but the critical improvement needed in these APIs is the ability to use an opaque, implementation-dependent cursor rather than a long, which is fine for the built-in storage but terrible for e.g. CloudWatch Logs. That would require changes in both the classic UI and Blue Ocean.
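Purely as a sketch of that direction, with every name below invented for illustration rather than taken from any existing API, an opaque-cursor shape might look like:

```java
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical: an opaque resume token (e.g. a CloudWatch Logs next-token) instead of a byte offset. */
interface LogCursor {
    /** Serialized form that the UI hands back on its next poll. */
    String token();
}

/** Hypothetical storage API keyed on the cursor rather than a long offset. */
interface CursorAwareLogStorage {
    /** @param start null means read from the beginning */
    Chunk readOverallLog(LogCursor start) throws IOException;

    /** One response: some content plus the cursor at which to resume. */
    interface Chunk {
        InputStream content();
        LogCursor next();
    }
}
```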

Member:

Fair enough -- nothing to do here at the moment, but I'd like to leave the conversation marked unresolved as a nudge toward future enhancements.

BufferedReader indexBR = index.isFile() ? Files.newBufferedReader(index.toPath(), StandardCharsets.UTF_8) : new BufferedReader(new NullReader(0))) {
String line;
long pos = -1; // -1 if not currently in this node, start position if we are
while ((line = indexBR.readLine()) != null) {
@svanoort (Member) Sep 26, 2018

For those keeping track at home: any time we try to read a single node's logs we trigger a full read of the entire index file, and as we scan we look for lines with our FlowNode ID. Thankfully the index is smaller than the entire logfile, probably by orders of magnitude.

But this still feels both overly complicated and inefficient, with O(m) performance to find all of a FlowNode's logs, where m is the size of the index. Is there a reason we can't simply store when a FlowNode is "done" for logging purposes? We have that logic handy already.

Member Author:

> store when a FlowNode is "done"

Not sure what you are proposing exactly.

Member:

@jglick We know how to compute when a FlowNode is "closed" for logging, depending on whether it's an AtomNode, StartNode, or EndNode. This limits the scope we'd need to scan.

Now, if we marked the offsets of the first and last log content for a FlowNode and stored them specially (say, as fields in our LogAction implementation, or by showing boundaries a bit more clearly in the index), we could further limit the amount of scanning done.

However... eh, this can be optimized later.

Member Author:

> stored it specially (say as fields in our LogAction implementation

I preferred to avoid duplicating information like this. In other words, log + log-index should be self-contained.

Now another line type (not a transition) could be added to log-index indicating that a given FlowNode has been closed, and thus that there is no need to search past that point in the file looking for subsequent blocks. The extra complexity just to avoid reading the rest of what would usually be a short text file (smaller than a typical “disk” block size, I think) does not seem worth it.
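For illustration only (this was not implemented), such an extra, non-transition line type in log-index might look something like the hypothetical `done` marker below; the offsets, node IDs, and the `done` keyword itself are all made up:

```
120 4    transition: output for node 4 begins at byte offset 120
310      transition: node 4's block ends at offset 310
done 4   hypothetical marker: node 4 is closed, so no later blocks need to be scanned for it
```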

Member:

Please leave this comment unresolved so we know in the future where to start if we need to optimize.

@svanoort (Member) left a comment

Overall a -0 review: this needs some explanatory comments or it won't be maintainable. I have also lodged a couple of questions that I'd like to see answered before giving this a firm +1 / -1 (they may be buglets, but potentially not worth handling).


/**
* Simple implementation of log storage in a single file that maintains a side file with an index indicating where node transitions occur.
* Each line in the index file is a byte offset, optionally followed by a space and then a node ID.
Member Author:

Note: I had considered using Data{Input,Output}Stream, whereby the format would be a sequence of 64-bit offsets each followed by a node ID (as null-terminated UTF-8, or maybe even finally declare that these must be ints). That would make the log-index file more compact and quicker to parse, at the expense of making problems harder to debug. I do not have strong feelings either way.
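For readers unfamiliar with the current text format, here is an illustrative log-index with made-up offsets and node IDs, on the reading that a line without a node ID marks a transition back to output not associated with any node:

```
0 2      output for node 2 starts at byte offset 0 of the log file
120 4    transition at offset 120: output for node 4 begins
310      transition at offset 310: back to output not associated with any node
```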

@svanoort (Member) left a comment

Approving -- with the caveat that we'll probably have to do some quick fixes as we discover totally unexpected failure modes in the way we're storing and indexing logs. That's kind of a given for a change this big, though; you just grin and bear it.

@jglick merged commit 6f39eb9 into jenkinsci:master Oct 4, 2018
@jglick deleted the logs-JENKINS-38381 branch October 4, 2018 20:43