Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RequestParsingError: Failed to decompress data. Not a gzipped file (b'un') #4816

Closed
sentry-io bot opened this issue Jun 21, 2021 · 11 comments
Closed

RequestParsingError: Failed to decompress data. Not a gzipped file (b'un') #4816

sentry-io bot opened this issue Jun 21, 2021 · 11 comments
Assignees

Comments

@sentry-io
Copy link

sentry-io bot commented Jun 21, 2021

This seems related to session recordings being flakey.

See: #4815


Sentry Issue: POSTHOG-2PA

BadGzipFile: Not a gzipped file (b'un')
  File "posthog/utils.py", line 371, in load_data_from_request
    data = gzip.decompress(data)
  File "gzip.py", line 548, in decompress
    return f.read()
  File "gzip.py", line 292, in read
    return self._buffer.read(size)
  File "gzip.py", line 479, in read
    if not self._read_gzip_header():
  File "gzip.py", line 427, in _read_gzip_header
    raise BadGzipFile('Not a gzipped file (%r)' % magic)

RequestParsingError: Failed to decompress data. Not a gzipped file (b'un')
  File "posthog/api/capture.py", line 119, in get_event
    data = load_data_from_request(request)
  File "posthog/utils.py", line 373, in load_data_from_request
    raise RequestParsingError("Failed to decompress data. %s" % (str(error)))
@Twixes Twixes removed the stale label Aug 26, 2021
@pauldambra pauldambra self-assigned this Sep 27, 2021
@pauldambra
Copy link
Member

That error is thrown when the first two bytes of the gzipped file are incorrect

The gzip format begins with a "magic number" https://datatracker.ietf.org/doc/html/rfc1952#page-6

b'\037\213'.decode('ISO-8859–1') -> '\x1f\x8b'

From gzip.py

    def _read_gzip_header(self):
        magic = self._fp.read(2)
        if magic == b'':
            return False

        if magic != b'\037\213':
            raise BadGzipFile('Not a gzipped file (%r)' % magic)

suggests that when this error is being thrown we're receiving corrupt or incomplete data

@pauldambra
Copy link
Member

Ha :) the data is b'undefined' explaining Not a gzipped file (b'un')

pauldambra added a commit that referenced this issue Sep 27, 2021
* switches to a more human-readable error message for #4816

we are intermittently receiving undefined as the request body from the session recorder. This makes the error for that more readable
@pauldambra
Copy link
Member

@mariusandra the next step that occurs to me is to log out from here https://github.com/PostHog/posthog-js/blob/800f4e373f46a0861383f7162c0aaf72d2a439d5/src/compression.js#L20 to Sentry

Ideally, we can determine whether the undefined comes from the rrweb library or from our compression of the data

Not sure if we log from that library historically, don't want to break things for users. Ideally I can force a log/error into Sentry, when it is available, and then return from the method unchanged so we preserve existing behaviour

@pauldambra
Copy link
Member

we coordinate two libraries here https://github.com/PostHog/posthog-js/blob/master/src/compression.js#L21

rrweb generates the snapshot events, fflate compresses the content

one of these could be generating the undefined unexpectedly

@mariusandra
Copy link
Collaborator

Great work tracking this down this far! Very curious to find out where it eventually breaks down... 🤔

pauldambra added a commit to PostHog/posthog-js that referenced this issue Sep 28, 2021
mariusandra pushed a commit to PostHog/posthog-js that referenced this issue Sep 29, 2021
* speculative logging for PostHog/posthog#4816

* use capture metrics to send debug messages home

* adds tests and only sends debug "metrics" when debug is enabled

* only check one flag before logging debug 'metrics'

* update tests now that debug flag is not used in capture metrics:

* add information to README on how to use yalc to develop locally

* catch errors in new debug logging to avoid this check affecting users on error

* remove usage of debug enabled as a parameter to capture metrics

* add instructions to remove yalc linkage
@pauldambra
Copy link
Member

Still seeing this event but not seeing capture metrics emitting debug information :'(

@pauldambra
Copy link
Member

The call chain is effectively

const jsonData = JSON.stringify(data)
// some method calls passing things around
const inputToGzip = fflate.strToU8(jsonData)

if strToU8 receives undefined it throws
ifstrToU8 receives the string "undefined" it converts it to an array of ascii character value integers because it doesn't care about the content

But!

JSON.stringify returns undefined if it receives undefined. So the code would be throwing not passing undefined to the API
If JSON.stringify receives the string 'undefined' it returns the string '"undefined"' (NB wrapped in double quotes)

The API is logging 'undefined' not '"undefined"' as the error value.

Suggesting that the input to strToU8 and strToU8 itself are not the problem

@pauldambra
Copy link
Member

pauldambra commented Sep 30, 2021

The call chain is closer to

const jsonData = JSON.stringify(data)
// some method calls passing things around
gzipSync(fflate.strToU8(jsonData))

if jsonData is undefined then gzipSync does not swallow the exception

pauldambra added a commit to PostHog/posthog-js that referenced this issue Sep 30, 2021
pauldambra added a commit to PostHog/posthog-js that referenced this issue Sep 30, 2021
mariusandra pushed a commit to PostHog/posthog-js that referenced this issue Sep 30, 2021
pauldambra added a commit that referenced this issue Oct 7, 2021
#6287)

* tags request origins in Sentry when loading request data from the body

relates to #4816
@pauldambra
Copy link
Member

pauldambra commented Oct 13, 2021

We now log referrer to Sentry when this error occurs. Since the release of the specualtive fix in PostHog/posthog-js#300 we see fewer instances - 37 in the last 24 hours

In those 37 we've only seen two customers with the undefined bug

They're both on old versions of the library. And consistently send an empty body via sendBeacon on unload

@mariusandra
Copy link
Collaborator

That's amazing! This took a lot of effort and head scratching, but I'm very glad it's fixed! 🎉

Going to close this issue.

@pauldambra
Copy link
Member

For future reference the fix for this went into posthog-js version 1.14.1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants