Skip to content
This repository has been archived by the owner on Aug 30, 2023. It is now read-only.

Feature Request: Compress Attachments #10

Closed
SamKirkland opened this issue Sep 28, 2020 · 12 comments
Closed

Feature Request: Compress Attachments #10

SamKirkland opened this issue Sep 28, 2020 · 12 comments

Comments

@SamKirkland
Copy link

SamKirkland commented Sep 28, 2020

This solution is great and would let us easily recreate errors, saving us time.

Adding a option to compress the attachment using gzip or similar prior to upload would reduce attachment file size by 80-90% in most cases. A webworker could be used to reduce performance issues if those appear during compression on the UI thread.

In testing average attachment size for a mid-sized html page was 3.6MB.
These files gzipped are only ~530KB - or 14% the original file size. Which would allow 1.8 million "videos" for only $225/month under the current 1,000 GB plan, an order of magnitude cheaper than competitors in the space. In my opinion sentry has a big opportunity here to capture sales funneling to logRocket, asayer, sessionstack, fullstory, etc
Almost of the feature set for a tenth the price!

Attachments would need to be decompressed on the sentry's web player, however this repo seems to be out of date compared to the one used today by sentry.

@alexcroox
Copy link

rrweb uses Pako for compression which is a really heavy library. It might be worth seeing what other lightweight alternatives there are such as https://github.com/101arrowz/fflate

@dcramer
Copy link
Member

dcramer commented Sep 29, 2020

@SamKirkland were you mostly thinking on the storage side? re: cost of storage - that is, no change to the upload? my bigger concern has been the cost of bandwidth for users when submitting replays to Sentry. Thinking about mobile apps and other bandwidth restricted devices..

@alexcroox maybe @Yuyz0112 has thoughts on the compression schema used

@Yuyz0112
Copy link

I've already seen the fflate package and it may be a good alternative to pako.

@SamKirkland
Copy link
Author

SamKirkland commented Sep 29, 2020

@dcramer both storage & upload speed/bandwidth are a concern. Storage costs are the main blocker for us right now.

So I just looked into this a bit more. It's a undocumented feature of both the rrweb recorder and replayer to use pack - that's what @alexcroox and @Yuyz0112 are referring to. I've got a implementation working now, the player sentry.io is using in prod surprisingly worked out of the box with the compressed data!!! You must be using the full min bundle instead of the module version?

Compression converted a 2.8MB file into a 881KB file, not as good as even gzip level 1 which resulted in 450KB. But still a welcome improvement. Compression could be improved by compressing the entire file instead of each chunk individually - I believe this is what the compression to-do item in the readme is referring to.

@Yuyz0112 I'll submit a PR adding documentation for this.

I'm closing this ticket out. Thanks for the help guys!

@SamKirkland
Copy link
Author

Well I've got some bad news. The way rrweb currently compresses on the UI thread noticeably slows down the site. On any frame capture (mouse move, scroll, dom change) the events are compressed. For this particular use-case we only send events once a exception occurs so it doesn't make much sense to compress the data before a exception. Memory usage will be a bit higher without compression but with a short checkout this is a minor issue.

Here are some logs from my local PC running a desktop CPU without throttling. I haven't tested on mobile yet.
image

image

@dcramer
Copy link
Member

dcramer commented Sep 29, 2020

Possible compression (and anything else that may be heavy) could be handed off to a web worker? I'm out of my element at this point.

@SamKirkland
Copy link
Author

Yep, I can make a PR for the webworker compression to gzip.

Can you open source your integration used on sentry.io's side for the player? If we compress the attachment we will need to decompress it on the other side. I can make the necessary changes on that side as well.

Reopening issue - this will be used to track gzip compression of the following in a webworker:

const client = Sentry.getCurrentHub().getClient()!;
const endpoint = self.attachmentUrlFromDsn(
    client.getDsn(),
    event.event_id
);
const formData = new FormData();
formData.append(
    'rrweb',
    new Blob([JSON.stringify({ events: self.events })], {
        type: 'application/json',
    }),
    'rrweb.json'
);

fetch(endpoint, {
    method: 'POST',
    body: formData,
})
.catch((ex) => {
    // we have to catch this otherwise it throws an infinite loop in Sentry
    console.error(ex);
});

@SamKirkland SamKirkland reopened this Sep 29, 2020
@dcramer
Copy link
Member

dcramer commented Sep 29, 2020

It'd be complex to pull the component out of Sentry itself but it's all in our public repo:

https://github.com/getsentry/sentry/blob/master/src/sentry/static/sentry/app/components/events/rrwebIntegration.jsx

https://github.com/getsentry/sentry/blob/master/src/sentry/static/sentry/app/components/events/rrwebReplayer.jsx

We just use rrweb-player under the hoo0d.

@alexcroox
Copy link

alexcroox commented Sep 29, 2020

Could you just compress at checkout?

@SamKirkland
Copy link
Author

@alexcroox that would remove some of the lower impact compression calls. But the high impact compression calls (200+ ms) are checkout calls.

@101arrowz
Copy link

Now that rrweb has moved to fflate, using a web worker is trivially easy. I opted to keep things synchronous to avoid making rrweb-io/rrweb#448 cause a major version bump, but for smoother operation, the packing API should become a callback-based interface. The code could check the size of the event with jsonString.length, then use either zlibSync and unzlibSync for small events (under, say 200kB) or zlib and unzlib for large events, then call the original callback with the returned value. Docs for async operations

@bruno-garcia
Copy link
Member

The new Replay SDK does compress the payloads.

Please note this repository has been superseded and will no longer be updated, check https://github.com/getsentry/sentry-javascript/tree/master/packages/replay#sentry-session-replay instead, we're happy respond to issues, review and merge PRs there.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants