
Expose producer metrics with go-metrics #746

Merged: 1 commit merged into IBM:master on Nov 22, 2016

Conversation

@slaunay (Contributor) commented on Sep 19, 2016:

  • provide the following metrics (see the registry sketch after this list):
    • batch-size histogram (global and per topic)
    • record-send-rate meter (global and per topic)
    • records-per-request histogram (global and per topic)
    • compression-rate histogram (global and per topic)
  • add a metrics.Registry parameter to the encode function
  • provide the underlying MessageSet when encoding the fake compressed "message"
  • use len(msg.Set.Messages) for counting records
  • use the payloadSize property in Message to track the size of the compressed payload
  • expose the configured metric registry in ProduceRequest
  • expose the current offset in packetEncoder for the batch-size metric
  • expose a real-encoder flag in packetEncoder so metrics are recorded only once
  • record metrics in produce_request.go
  • add unit tests and functional tests
  • use the Spew library to build better DeepEqual error messages when comparing raw bodies
  • add documentation for the new metrics
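As a rough illustration of how these registry-backed metrics could be consumed from application code, here is a minimal Go sketch. The broker address, topic name, and the exact metric names are assumptions made for the example, not a documented contract of this PR:

```go
package main

import (
	"fmt"
	"log"

	"github.com/Shopify/sarama"
	metrics "github.com/rcrowley/go-metrics"
)

func main() {
	config := sarama.NewConfig()
	config.Producer.Return.Successes = true // required by the SyncProducer

	// Assumed broker address and topic, for illustration only.
	producer, err := sarama.NewSyncProducer([]string{"localhost:9092"}, config)
	if err != nil {
		log.Fatal(err)
	}
	defer producer.Close()

	if _, _, err := producer.SendMessage(&sarama.ProducerMessage{
		Topic: "example-topic",
		Value: sarama.StringEncoder("hello"),
	}); err != nil {
		log.Fatal(err)
	}

	// The producer records its metrics on config.MetricRegistry; walk the
	// registry and print whatever histograms and meters ended up there
	// (e.g. "batch-size" or "record-send-rate", per the list above).
	config.MetricRegistry.Each(func(name string, metric interface{}) {
		switch m := metric.(type) {
		case metrics.Histogram:
			fmt.Printf("%-40s mean=%.1f max=%d\n", name, m.Mean(), m.Max())
		case metrics.Meter:
			fmt.Printf("%-40s 1m-rate=%.1f count=%d\n", name, m.Rate1(), m.Count())
		}
	})
}
```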

Review thread on the compression-ratio calculation (diff context; the last line is cut off in the page capture):

}
// Better be safe than sorry when computing the compression ratio
if len(messageBlock.Msg.Value) != 0 {
	compressionRatio := float64(messageBlock.Msg.payloadSize) /
Contributor:

Is this calculation correct? Compression ratios are usually given as uncompressed / compressed not the other way around. Or am I misunderstanding which value is which?

Perhaps you'd better rename payloadSize to compressedPayloadSize to be explicit about this.

Contributor:

I think this makes sense as a metric, because it will give you a basic percentage.

@slaunay (Contributor, Author):

I followed the Java client for that metric: they named it compression-rate-avg (the average compression rate of record batches) and it is indeed the inverse of a regular compression ratio, computed as a float:
https://github.com/apache/kafka/blob/7115c66aefb11efa802e61a42bcc13fadf92598d/clients/src/main/java/org/apache/kafka/common/record/Compressor.java#L150

But to be honest I found that metric confusing the first time I tried to interpret it from the Java client, so I can change it to expose 200 to represent a compression ratio of 2.0 instead of 50.

Renaming payloadSize to compressedPayloadSize makes sense.
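To make the two conventions concrete, here is a tiny illustrative sketch; the byte counts are made up, not taken from the PR:

```go
package main

import "fmt"

func main() {
	// Illustrative numbers only: 1000 uncompressed bytes compressed down to 500.
	uncompressed, compressed := 1000.0, 500.0

	// Java client convention (compression-rate-avg): compressed / uncompressed.
	javaRate := compressed / uncompressed // 0.5

	// Conventional compression ratio: uncompressed / compressed.
	ratio := uncompressed / compressed // 2.0

	// Alternative discussed here: the ratio scaled by 100, so 2:1 reads as 200.
	scaled := ratio * 100 // 200

	fmt.Println(javaRate, ratio, scaled)
}
```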

Contributor:

Ya, I think I'd prefer to do the normal thing rather than follow upstream just for the sake of following upstream. It's always a toss-up though.

@eapache (Contributor) commented on Oct 1, 2016:

With the exception of the compression ratio issue I think this looks fine. @wvanbergen care to take a look?

@wvanbergen (Contributor) left a review:

This is a nice addition 👍

Do we have any idea what the performance impact of enabling metric collection is?

@slaunay (Contributor, Author) commented on Oct 11, 2016:

I published some benchmark results in the first metrics PR, #701.

But here are more, run against commit b17c88c, with the real metrics implementation and with the NOOP one:

# With real metrics
$ go test -v -bench=. -run=^$
BenchmarkProducerSmall-8                      100000         12213 ns/op
BenchmarkProducerMedium-8                      20000         60276 ns/op
BenchmarkProducerLarge-8                        3000        509129 ns/op
BenchmarkProducerSmallSinglePartition-8       200000          9557 ns/op
BenchmarkProducerMediumSnappy-8               100000         15738 ns/op
PASS
ok      github.com/Shopify/sarama   13.239s
$ go test -v -bench=. -run=^$
BenchmarkProducerSmall-8                      100000         15210 ns/op
BenchmarkProducerMedium-8                      20000         67049 ns/op
BenchmarkProducerLarge-8                        2000        537554 ns/op
BenchmarkProducerSmallSinglePartition-8       200000          6921 ns/op
BenchmarkProducerMediumSnappy-8               200000         13041 ns/op
PASS
ok      github.com/Shopify/sarama   13.759s
# With NOOP metrics
$ export METRICS_DISABLE=1
$ go test -v -bench=. -run=^$
BenchmarkProducerSmall-8                      100000         16104 ns/op
BenchmarkProducerMedium-8                      20000         63805 ns/op
BenchmarkProducerLarge-8                        3000        523485 ns/op
BenchmarkProducerSmallSinglePartition-8       200000          5578 ns/op
BenchmarkProducerMediumSnappy-8               100000         14790 ns/op
PASS
ok      github.com/Shopify/sarama   9.684s
$ go test -v -bench=. -run=^$
BenchmarkProducerSmall-8                      100000         12086 ns/op
BenchmarkProducerMedium-8                      20000         54810 ns/op
BenchmarkProducerLarge-8                        3000        402602 ns/op
BenchmarkProducerSmallSinglePartition-8       200000          6702 ns/op
BenchmarkProducerMediumSnappy-8               100000         13575 ns/op
PASS
ok      github.com/Shopify/sarama   9.232s

Results are very similar again, so the differences probably come down to I/O and caching on my VM.
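For readers who want to avoid the instrumentation overhead entirely, go-metrics ships no-op implementations that can be enabled globally. A minimal sketch under that assumption (the METRICS_DISABLE variable above is specific to the benchmark harness, whereas UseNilMetrics is a go-metrics package global):

```go
package main

import (
	"github.com/Shopify/sarama"
	metrics "github.com/rcrowley/go-metrics"
)

func main() {
	// go-metrics provides nil (no-op) implementations; setting this package
	// global before any histogram/meter is created makes every metric
	// registered afterwards a no-op, so the producer instrumentation costs
	// almost nothing.
	metrics.UseNilMetrics = true

	config := sarama.NewConfig()
	// ... build and use the producer as usual; the metrics added by this PR
	// are then recorded against no-op implementations.
	_ = config
}
```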

@slaunay (Contributor, Author) commented on Oct 11, 2016:

Let me know what you think about the compression ratio format and I'll rebase the changes against the master branch to get rid of the new conflicts.

Commit message of the force-pushed commit:

- provide the following metrics:
  - batch-size histogram (global and per topic)
  - record-send-rate meter (global and per topic)
  - records-per-request histogram (global and per topic)
  - compression-ratio histogram (global and per topic)
- add metrics.Registry parameter to the encode function
- provide underlying MessageSet when encoding fake compressed "message"
- use len(msg.Set.Messages) for counting records
- use compressedSize property in Message for knowing the size of the
  compressed payload
- expose the configured metric registry in ProduceRequest
- expose current offset in packetEncoder for batch size metric
- expose real encoder flag in packetEncoder for recording metrics only
  once
- record metrics in produce_request.go
- add unit tests and functional tests
- use Spew library for building better DeepEqual error message when
  comparing raw bodies
- add documentation for new metrics
@slaunay force-pushed the enhancement/producer-metrics branch from b17c88c to 124e7c6 on October 25, 2016 17:53
@slaunay (Contributor, Author) commented on Oct 25, 2016:

I rebased against master to fix the merge conflict and applied the following changes:

  • use the compression ratio times 100 (e.g. 500 corresponds to a 5:1 compression ratio) and rename the metric from compression-rate (the JVM name) to compression-ratio; see the sketch after this list
  • rename the payloadSize field to compressedSize on the Message struct
  • update documentation and tests accordingly
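A minimal sketch of reading the renamed metric back and converting it to a conventional N:1 ratio, assuming the histogram is registered under the name compression-ratio as described above (readCompressionRatio is a hypothetical helper, not part of the library):

```go
package main

import (
	"fmt"

	"github.com/Shopify/sarama"
	metrics "github.com/rcrowley/go-metrics"
)

// readCompressionRatio is a hypothetical helper: it looks up the histogram by
// name on the registry and divides by 100 to recover a conventional N:1 ratio
// (e.g. a recorded value of 500 becomes 5.0).
func readCompressionRatio(registry metrics.Registry) (float64, bool) {
	histogram, ok := registry.Get("compression-ratio").(metrics.Histogram)
	if !ok {
		return 0, false
	}
	return histogram.Mean() / 100.0, true
}

func main() {
	config := sarama.NewConfig()
	// ... produce some compressed messages with this config, then:
	if ratio, ok := readCompressionRatio(config.MetricRegistry); ok {
		fmt.Printf("average compression ratio: %.2f:1\n", ratio)
	}
}
```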

@eapache merged commit 3f392a5 into IBM:master on Nov 22, 2016