InstrumentedStreams for input & output streams #1314

schlosna · 2022-01-12T03:12:04Z

Before this PR

Tracking InputStream and OutputStream progress and throughput required rolling your own stream wrappers.
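For context, a hand-rolled wrapper of the kind described above might look roughly like the sketch below. The class name `CountingInputStream` is hypothetical, and a `LongAdder` stands in for the metrics `Meter` so the example needs only the JDK; it is not the code added in this PR.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical hand-rolled wrapper; LongAdder is a stand-in for a metrics Meter. */
class CountingInputStream extends FilterInputStream {
    private final LongAdder bytesRead;

    CountingInputStream(InputStream in, LongAdder bytesRead) {
        super(in);
        this.bytesRead = bytesRead;
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b != -1) {
            // A single byte was read; -1 signals end of stream and is not counted.
            bytesRead.increment();
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        if (n > 0) {
            bytesRead.add(n);
        }
        return n;
    }
}
```

Note that this sketch does not count bytes passed over by `skip`, which is one of the details every such wrapper has to decide for itself.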

After this PR

==COMMIT_MSG==
InstrumentedStreams for input & output streams

Track bytes read/written via meter and throughput histogram
==COMMIT_MSG==

Possible downsides?

Track bytes read/written via meter and throughput histogram

changelog-app · 2022-01-12T03:12:08Z

Generate changelog in `changelog/@unreleased`

Type

Description

InstrumentedStreams for input & output streams

Track bytes read/written via meter and throughput histogram

Check the box to generate changelog(s)

Generate changelog entry

schlosna · 2022-01-12T03:14:20Z

Need baseline bump #1313 to merge first to fix FilterOutputStreamSlowMultibyteWrite (see palantir/gradle-baseline#2031)

schlosna · 2022-01-12T03:16:29Z

tritium-lib/src/main/java/com/palantir/tritium/io/ForwardingOutputStream.java

+import java.io.OutputStream;
+import java.util.Objects;
+
+abstract class ForwardingOutputStream extends FilterOutputStream {


I left these Forwarding*Stream classes package-private for now. If one really wants them, Apache commons-io's Proxy*Stream classes provide a similar framework.
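As a rough illustration of the forwarding pattern, here is a minimal sketch. The class name `SketchForwardingOutputStream` is hypothetical and the `after` hook only mirrors the method name visible in the diff; the real class may differ. The bulk `write` override matters because `FilterOutputStream`'s default implementation loops over `write(int)` one byte at a time.

```java
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.Objects;

/** Hypothetical sketch of a forwarding stream with an after-write hook. */
abstract class SketchForwardingOutputStream extends FilterOutputStream {
    SketchForwardingOutputStream(OutputStream out) {
        super(Objects.requireNonNull(out, "out"));
    }

    /** Invoked after each successful write with the number of bytes written. */
    protected abstract void after(long bytesWritten);

    @Override
    public void write(int b) throws IOException {
        out.write(b);
        after(1);
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        // Hand the whole range to the delegate in one call instead of
        // inheriting FilterOutputStream's byte-by-byte loop.
        out.write(b, off, len);
        after(len);
    }
}
```

`write(byte[])` is inherited from `FilterOutputStream` and calls `write(b, 0, b.length)`, so it reaches the bulk override and the hook fires once per call.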

carterkozak · 2022-01-13T15:11:48Z

tritium-lib/src/main/java/com/palantir/tritium/io/InstrumentedInputStream.java

+final class InstrumentedInputStream extends ForwardingInputStream {
+    private final Meter bytes;
+    private final Histogram throughput;
+    private long start;


concurrent stream access never works great, but we may want to move start to a parameter rather than an object field.

Yeah, are you thinking of passing startNanos as another arg to after?

From an API consumer perspective, I need to JavaDoc these to make the args clear, as both will be long.

If we keep the throughput histograms, I think that would be cleaner. I commented about some questionable aspects of the throughput histograms given we don't know which operations actually flush and incur cost, or how much data is included within those operations. Perhaps if we do keep the throughput histograms, we should instead accumulate total bytes written for the lifespan of the stream, and sum all the time spent in read/write/flush/close.
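A minimal sketch of that lifetime-accumulation idea, with `startNanos` passed as a parameter rather than stored in a field. The class name `LifetimeThroughputOutputStream` and its accessors are hypothetical, chosen for illustration only.

```java
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;

/** Hypothetical sketch: sums bytes and time over the whole lifespan of the stream. */
class LifetimeThroughputOutputStream extends FilterOutputStream {
    private long totalBytes;
    private long totalNanos;

    LifetimeThroughputOutputStream(OutputStream out) {
        super(out);
    }

    // startNanos is a parameter, so no per-call start timestamp is stored on the object.
    private void after(long startNanos, long bytesWritten) {
        totalNanos += System.nanoTime() - startNanos;
        totalBytes += bytesWritten;
    }

    @Override
    public void write(int b) throws IOException {
        long startNanos = System.nanoTime();
        out.write(b);
        after(startNanos, 1);
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        long startNanos = System.nanoTime();
        out.write(b, off, len);
        after(startNanos, len);
    }

    @Override
    public void flush() throws IOException {
        long startNanos = System.nanoTime();
        out.flush();
        after(startNanos, 0);
    }

    @Override
    public void close() throws IOException {
        long startNanos = System.nanoTime();
        // Flush and close the delegate directly so flush time is not counted twice.
        try (OutputStream delegate = out) {
            delegate.flush();
        } finally {
            after(startNanos, 0);
        }
    }

    long totalBytes() {
        return totalBytes;
    }

    long totalNanos() {
        return totalNanos;
    }

    /** Lifetime bytes per second, or 0 if no time has been recorded. */
    double bytesPerSecond() {
        return totalNanos == 0 ? 0.0 : totalBytes / (totalNanos / 1_000_000_000.0);
    }
}
```

The running totals are plain fields, so this sketch is still not safe for concurrent use; it only removes the shared `start` timestamp.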

carterkozak · 2022-01-13T15:12:18Z

tritium-lib/src/main/java/com/palantir/tritium/io/InstrumentedOutputStream.java

+    protected void after(long bytesWritten) {
+        double elapsedSeconds = (System.nanoTime() - start) / 1_000_000_000.0;
+        long bytesPerSecond = Math.round(bytesWritten / elapsedSeconds);
+        throughput.update(bytesPerSecond);


I'm not entirely sure I'd always trust the throughput value because all the work may occur in flush() and close(), where write methods largely push data into a buffer.

Given the variance in histogram values, we may be better off only using the meter (which is limited to the reporting interval). What do you think?

Yeah, I'm leaning toward removing the throughput histogram. There might be some value in tracking a histogram of write sizes to identify small reads/writes.

+1, we can always add that sort of thing later on if we need it
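To make the buffering concern in this thread concrete, here is a small JDK-only sketch (the class name `BufferedWriteDemo` is hypothetical). A small write into a `BufferedOutputStream` only copies into the buffer, and the bytes reach the underlying sink on `flush()`, so timing the `write` call alone says little about real throughput.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;

/** Hypothetical demo of where buffered writes actually reach the sink. */
class BufferedWriteDemo {
    /** Returns {bytes in the sink after write(), bytes in the sink after flush()}. */
    static int[] sinkSizes() throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        BufferedOutputStream buffered = new BufferedOutputStream(sink, 8192);

        // 100 bytes fit in the 8192-byte buffer, so nothing is forwarded yet.
        buffered.write(new byte[100]);
        int afterWrite = sink.size();

        // flush() pushes the buffered bytes to the sink.
        buffered.flush();
        int afterFlush = sink.size();

        return new int[] {afterWrite, afterFlush};
    }
}
```

Here `sinkSizes()` returns `{0, 100}`: the write itself moved no data to the sink.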

carterkozak · 2022-01-13T15:13:56Z

tritium-lib/src/main/java/com/palantir/tritium/io/InstrumentedStreams.java

+     * @param throughput bytes read per second
+     * @return instrumented input stream
+     */
+    public static InputStream input(InputStream in, Meter bytes, Histogram throughput) {


Thoughts on taking a TaggedMetricRegistry + name, and using metric-schema to define a standard structure? That way we can define reusable dashboards.

Yep, will do when I have some cycles

Interested in thoughts for the tagging structure. Right now I have a single type tag that one sets to distinguish streams. We will need to be cautious with tag cardinality, though we can enforce compile time tagging.

Example from https://17129-66020851-gh.circle-artifacts.com/0/~/artifacts/junit/tritium-lib/test/classes/com.palantir.tritium.io.InstrumentedStreamsTest.html

io.stream.read:{libraryName=tritium, libraryVersion=unknown, type=test-in}
    count = 2147483648
    mean rate = 146651358.56 events/second
    1-minute rate = 155940259.30 events/second
    5-minute rate = 156866618.01 events/second
    15-minute rate = 157027104.69 events/second

io.stream.write:{libraryName=tritium, libraryVersion=unknown, type=gzip-out}
    count = 2147483648
    mean rate = 146654007.80 events/second
    1-minute rate = 155943552.02 events/second
    5-minute rate = 156869248.25 events/second
    15-minute rate = 157029620.15 events/second

io.stream.write:{libraryName=tritium, libraryVersion=unknown, type=raw-out}
    count = 10409991
    mean rate = 710861.07 events/second
    1-minute rate = 752956.85 events/second
    5-minute rate = 757115.18 events/second
    15-minute rate = 757835.58 events/second

👍 I like it. I'm not sure if it's worthwhile to limit values to compile-time constants because that prevents the tool from being used within another library, even when the cardinality is known to be low.

Easier to open that up later than to ratchet it down, happy to merge with that constraint.
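To illustrate why tag cardinality matters here, a rough sketch of a counter lookup keyed by metric name plus a `type` tag. `SketchTaggedCounters` is a hypothetical stand-in built on JDK collections; it is not the `TaggedMetricRegistry` API or the generated metric-schema code.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical stand-in for a tagged registry; not the TaggedMetricRegistry API. */
class SketchTaggedCounters {
    private final Map<String, LongAdder> counters = new ConcurrentHashMap<>();

    /** Returns the counter for this name and type tag, creating it on first use. */
    LongAdder counter(String name, String type) {
        String key = name + ":{type=" + type + "}";
        return counters.computeIfAbsent(key, k -> new LongAdder());
    }

    /** Number of distinct name-and-tag combinations seen so far. */
    int distinctSeries() {
        return counters.size();
    }
}
```

Every distinct `type` value creates another series, so a value derived from request data could grow without bound, which is the risk a compile-time-constant restriction guards against.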

svc-autorelease · 2022-01-13T21:12:45Z

Released 0.37.0

InstrumentedStreams for input & output streams

3f592b1

Track bytes read/written via meter and throughput histogram

schlosna added the "merge when ready" and "autorelease" labels Jan 12, 2022

schlosna and others added 2 commits January 11, 2022 22:17

Update InstrumentedStreamsTest.java

b9ccdcb

Add generated changelog entries

bf9ce7c

schlosna commented Jan 12, 2022


schlosna added the "update me" label (Keep PR updated with any merged changes) Jan 12, 2022

Merge develop into ds/instrumented-streams

fab5bbc

schlosna marked this pull request as ready for review January 12, 2022 03:22

policy-bot bot requested a review from tetigi January 12, 2022 03:22

schlosna requested review from carterkozak and removed request for tetigi January 12, 2022 03:22

carterkozak reviewed Jan 13, 2022


bulldozer-bot bot and others added 5 commits January 13, 2022 17:21

Merge refs/heads/develop into ds/instrumented-streams

2faec14

Merge refs/heads/develop into ds/instrumented-streams

e036504

I/O metrics use tagged metric schema

883de34

test copied bytes

6332695

Merge refs/heads/develop into ds/instrumented-streams

b9505c9

schlosna requested a review from carterkozak January 13, 2022 20:53

carterkozak approved these changes Jan 13, 2022


bulldozer-bot bot merged commit d9bcd62 into develop Jan 13, 2022

bulldozer-bot bot deleted the ds/instrumented-streams branch January 13, 2022 21:12
