Explore ways to compress event_collection messages #125

Open
emilsoman opened this issue Jul 11, 2015 · 6 comments

Comments

@emilsoman
Contributor

Currently, the msgpack data we send to the client is uncompressed. This means an object space dump can run to hundreds of MB, and running a sampling profiler for 10 minutes or so can produce GBs of data! I'm exploring ways to compress the events we send.

Points to note:

  1. Compression and decompression should be as fast as possible to keep overhead minimal. I've chosen LZ4 as the algorithm for its promising benchmark results.
  2. Compression is most effective when there is a good chunk of data to compress, so it's better to compress event_collection messages, which aggregate a lot of messages.

I've added LZ4 compression of the msgpack data for event_collection messages, just before the data is sent out over zmq, in this commit: c3dcfe7. The results look very promising:

Object space dumps in a smallish Rails app consistently get around 77% size savings. CPU samples get a whopping 90% savings, and other events also get around a 70-80% size reduction.
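
To make point 2 concrete, here is a rough sketch of why compressing one aggregated batch tends to beat compressing each message on its own. It is not the code from the commit: it uses Python's standard-library `zlib` and `json` as stand-ins for LZ4 and msgpack so that it runs without extra dependencies, and the event fields (`event_type`, `timestamp`, `payload`, and so on) are made up for illustration.

```python
import json
import zlib


def make_events(n):
    # Hypothetical event shape, for illustration only; not rbkit's real schema.
    return [
        {
            "event_type": "obj_created",
            "timestamp": 1436600000 + i,
            "payload": {"object_id": 1000 + i, "class_name": "String"},
        }
        for i in range(n)
    ]


def compressed_sizes(events):
    """Return (raw, per_message, batched) sizes in bytes.

    json and zlib are stand-ins for msgpack and LZ4 (an assumption made so
    the sketch needs only the standard library).
    """
    encoded = [json.dumps(e, separators=(",", ":")).encode("utf-8") for e in events]
    raw = sum(len(b) for b in encoded)

    # Compress every message separately: the compressor never sees the
    # repetition that exists *across* messages.
    per_message = sum(len(zlib.compress(b)) for b in encoded)

    # Compress the whole collection at once, as an event_collection would be.
    batch = json.dumps(events, separators=(",", ":")).encode("utf-8")
    batched = len(zlib.compress(batch))
    return raw, per_message, batched


if __name__ == "__main__":
    raw, per_message, batched = compressed_sizes(make_events(1000))
    print(f"raw: {raw} B, per-message: {per_message} B, batched: {batched} B")
```

The exact numbers depend on the compressor and on the data, so they should not be read as a prediction of the LZ4 savings quoted above. The sketch only shows the general effect of giving the compressor a larger, repetitive input.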

@kgrz
Contributor

kgrz commented Jul 11, 2015

Related to this, will there be a noticeable change if the keys for events
were strings now?


@emilsoman
Contributor Author

Not sure about that, but can't beat numeric keys for sure.
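
One way to check this would be to measure it. The sketch below is only an illustration under several assumptions: it hand-rolls a small subset of the msgpack format (positive integers, short strings, small maps, arrays) instead of using a real msgpack library, it uses `zlib` as a stand-in for LZ4, and the key names and values are hypothetical.

```python
import struct
import zlib


def pack(obj):
    """Encode a small subset of msgpack: ints >= 0, short str, small dict, list."""
    if isinstance(obj, int):
        if obj < 0:
            raise ValueError("negative ints are not handled in this sketch")
        if obj <= 0x7F:
            return bytes([obj])                       # positive fixint
        if obj <= 0xFF:
            return b"\xcc" + struct.pack(">B", obj)   # uint 8
        if obj <= 0xFFFF:
            return b"\xcd" + struct.pack(">H", obj)   # uint 16
        if obj <= 0xFFFFFFFF:
            return b"\xce" + struct.pack(">I", obj)   # uint 32
        return b"\xcf" + struct.pack(">Q", obj)       # uint 64
    if isinstance(obj, str):
        data = obj.encode("utf-8")
        if len(data) > 31:
            raise ValueError("only fixstr (<= 31 bytes) is handled")
        return bytes([0xA0 | len(data)]) + data       # fixstr
    if isinstance(obj, dict):
        if len(obj) > 15:
            raise ValueError("only fixmap (<= 15 entries) is handled")
        out = bytes([0x80 | len(obj)])                # fixmap
        for key, value in obj.items():
            out += pack(key) + pack(value)
        return out
    if isinstance(obj, list):
        if len(obj) <= 15:
            head = bytes([0x90 | len(obj)])           # fixarray
        elif len(obj) <= 0xFFFF:
            head = b"\xdc" + struct.pack(">H", len(obj))  # array 16
        else:
            raise ValueError("arrays this large are not handled")
        return head + b"".join(pack(item) for item in obj)
    raise TypeError(f"unsupported type: {type(obj).__name__}")


# Hypothetical key names, for illustration only.
KEYS = ["event_type", "timestamp", "object_id", "class_name"]


def event(i, numeric_keys):
    values = [3, 1436600000 + i, 1000 + i, "String"]
    keys = range(len(KEYS)) if numeric_keys else KEYS
    return dict(zip(keys, values))


def sizes(numeric_keys, n=1000):
    """Return (raw, compressed) byte sizes for a batch of n events."""
    raw = pack([event(i, numeric_keys) for i in range(n)])
    return len(raw), len(zlib.compress(raw))


if __name__ == "__main__":
    for label, flag in (("string keys", False), ("numeric keys", True)):
        raw, compressed = sizes(flag)
        print(f"{label}: raw {raw} B, compressed {compressed} B")
```

Before compression, numeric keys are always smaller here, since each one is a single byte while a string key costs its length plus a header byte. How much of that gap survives compression depends on the compressor and the data, which is what running something like this against real events would show.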

@emilsoman
Contributor Author

@iffyuva @ishankhare07 once we're done with showing CPU profiling in the UI, we'll explore this a bit more and see whether it can become a bottleneck in the client.

@iffyuva
Member

iffyuva commented Jul 21, 2015

@emilsoman agreed, not a priority

@stereobooster

stereobooster commented Sep 7, 2017

If compression speed is a concern here, you could use https://github.com/google/snappy
See also http://facebook.github.io/zstd/

@emilsoman
Contributor Author

@stereobooster we have already evaluated these and decided that compression is not a priority until we have a usable profiling feature for the development environment. Because my focus is on other projects, I'm not actively working on any rbkit features atm. PRs are welcome if you're interested in contributing. Thanks!
