Connection reset by peer from BufferedConsumer #104

Closed
dkarp0 opened this issue Aug 6, 2021 · 10 comments · Fixed by #123

Comments

@dkarp0

dkarp0 commented Aug 6, 2021

Since upgrading to 4.9.0, we've been getting the following error appearing frequently in sentry:

ConnectionResetError: [Errno 104] Connection reset by peer
  File "urllib3/connectionpool.py", line 699, in urlopen
    httplib_response = self._make_request(
  File "urllib3/connectionpool.py", line 445, in _make_request
    six.raise_from(e, None)
  File "<string>", line 3, in raise_from
    # Permission is hereby granted, free of charge, to any person obtaining a copy
  File "urllib3/connectionpool.py", line 440, in _make_request
    httplib_response = conn.getresponse()
  File "http/client.py", line 1344, in getresponse
    response.begin()
  File "http/client.py", line 307, in begin
    version, status, reason = self._read_status()
  File "http/client.py", line 268, in _read_status
    line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
  File "socket.py", line 669, in readinto
    return self._sock.recv_into(b)
  File "ssl.py", line 1241, in recv_into
    return self.read(nbytes, buffer)
  File "ssl.py", line 1099, in read
    return self._sslobj.read(len, buffer)
ProtocolError: ('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))
  File "requests/adapters.py", line 439, in send
    resp = conn.urlopen(
  File "urllib3/connectionpool.py", line 755, in urlopen
    retries = retries.increment(
  File "urllib3/util/retry.py", line 532, in increment
    raise six.reraise(type(error), error, _stacktrace)
  File "urllib3/packages/six.py", line 769, in reraise
    raise value.with_traceback(tb)
  File "urllib3/connectionpool.py", line 699, in urlopen
    httplib_response = self._make_request(
  File "urllib3/connectionpool.py", line 445, in _make_request
    six.raise_from(e, None)
  File "<string>", line 3, in raise_from
    # Permission is hereby granted, free of charge, to any person obtaining a copy
  File "urllib3/connectionpool.py", line 440, in _make_request
    httplib_response = conn.getresponse()
  File "http/client.py", line 1344, in getresponse
    response.begin()
  File "http/client.py", line 307, in begin
    version, status, reason = self._read_status()
  File "http/client.py", line 268, in _read_status
    line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
  File "socket.py", line 669, in readinto
    return self._sock.recv_into(b)
  File "ssl.py", line 1241, in recv_into
    return self.read(nbytes, buffer)
  File "ssl.py", line 1099, in read
    return self._sslobj.read(len, buffer)
ConnectionError: ('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))
  File "__init__.py", line 615, in _write_request
    response = self._session.post(
  File "requests/sessions.py", line 590, in post
    return self.request('POST', url, data=data, json=json, **kwargs)
  File "requests/sessions.py", line 542, in request
    resp = self.send(prep, **send_kwargs)
  File "requests/sessions.py", line 655, in send
    r = adapter.send(request, **kwargs)
  File "requests/adapters.py", line 498, in send
    raise ConnectionError(err, request=request)
MixpanelException: ('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))
  File "__init__.py", line 731, in _flush_endpoint
    self._consumer.send(endpoint, batch_json, api_key=self._api_key)
  File "__init__.py", line 594, in send
    self._write_request(self._endpoints[endpoint], json_message, api_key, api_secret)
  File "__init__.py", line 623, in _write_request
    six.raise_from(MixpanelException(e), e)
  File "<string>", line 3, in raise_from
    # Permission is hereby granted, free of charge, to any person obtaining a copy
MixpanelException: ('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))
  File "/usr/src/app/./api/users/tracking.py", line 80, in _track
    mixpanel_track(
  File "/usr/src/app/./api/users/tracking.py", line 140, in mixpanel_track
    flush()
  File "/usr/src/app/./api/users/tracking.py", line 128, in flush
    current_app.mp_consumer.flush()
  File "__init__.py", line 722, in flush
    self._flush_endpoint(endpoint)
  File "__init__.py", line 736, in _flush_endpoint
    six.raise_from(mp_e, orig_e)
  File "<string>", line 3, in raise_from
    # Permission is hereby granted, free of charge, to any person obtaining a copy

This only started after upgrading and seems to be related to the switch to requests. It happens when we flush the consumer.

@seizethedave
Contributor

Hello @dkarp0, can you provide the requests version from pip freeze? Not sure what the problem is off the bat, but I'm wondering if it's one of the requests versions with buggy TLS negotiation.

@dkarp0
Author

dkarp0 commented Aug 17, 2021

Hey @seizethedave, requests==2.26.0 which looks like it's the most recent. Which versions have an issue?

I might try switching away from the BufferedConsumer to see if we still have issues. We use it to delay sending backend mixpanel events until after we've returned data to the client, so it'd be nice to keep it, but it's worth a try to narrow it down. It should then be more obvious if the issue is on our end, as I imagine there are fewer people using the BufferedConsumer.

@seizethedave
Contributor

Hm, older ones have issues.
“Connection reset by peer” means someone is sending an RST packet. It is possible that our server/gateway (or something between there and your client code) is terminating a long-expired keepalive connection. Is there any way you can capture the connection age and/or how long the connection was idle, along with the error?

Certainly fewer people use the buffered consumer. I might try reproing by using the buffered consumer with longer and longer periods of inactivity. But you’ll have to bear with me as I’m out of the office for a while.
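
A rough sketch of the kind of repro loop described above, for anyone who wants to try it themselves. Everything here is an assumption for illustration: `probe_idle_periods` is a hypothetical helper, and `send` stands in for whatever call triggers the network request (for example a function that tracks one event and flushes the consumer). It only shows the shape of the experiment and is not part of mixpanel-python.

```python
import time


def probe_idle_periods(send, idle_seconds, sleep=time.sleep):
    """Call send() after each idle period; return (idle, error-or-None) pairs.

    `send` is a hypothetical zero-argument callable that performs one request.
    """
    results = []
    for idle in idle_seconds:
        sleep(idle)                      # leave the connection unused for `idle` seconds
        try:
            send()
            results.append((idle, None))
        except Exception as exc:         # record the failure and keep probing
            results.append((idle, exc))
    return results
```

Calling it with growing periods such as `[60, 300, 900, 1800]` would show whether failures start to appear past some idle threshold, assuming the resets are in fact tied to idle time.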

@dkarp0
Author

dkarp0 commented Aug 23, 2021

@seizethedave Is mixpanel-python exposing the connection details somewhere so I can do that? I haven't dived into how the BufferedConsumer is functioning.

Difficulty is that we only enable mixpanel on our deployed environments so I'm not able to reproduce locally.

@seizethedave
Contributor

I'm back from my leave, thanks for being patient. I take it this is still an issue for you?

> Is mixpanel-python exposing the connection details somewhere so I can do that?

No, there are no connection details exposed. Would probably have to make some changes to the library to introspect the connection age/etc and log those when there's an RST. (Haven't looked into it!)
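
Since the library does not expose the connection, one approximation that needs no library changes is to time the calls from the outside. The sketch below is only an assumption of how that could look: `IdleTimingSender` is a hypothetical wrapper, `send` stands in for any callable that performs the request, and the numbers it logs are the wrapper's age and the gap since the previous send attempt, which are proxies for the real TCP connection age and idle time.

```python
import logging
import time

logger = logging.getLogger("tracking")


class IdleTimingSender:
    """Wraps a send callable and logs timing information when it fails."""

    def __init__(self, send, clock=time.monotonic):
        self._send = send            # hypothetical: any callable that makes the request
        self._clock = clock
        self._created = clock()
        self._last_used = None       # time of the previous send attempt, if any
        self.last_age = None         # seconds since this wrapper was created
        self.last_idle = None        # seconds since the previous attempt (None on first call)

    def __call__(self, *args, **kwargs):
        now = self._clock()
        self.last_age = now - self._created
        self.last_idle = None if self._last_used is None else now - self._last_used
        self._last_used = now
        try:
            return self._send(*args, **kwargs)
        except Exception:
            # Log the timing next to the traceback, then let the error propagate.
            logger.exception(
                "send failed: wrapper_age=%.1fs idle=%s",
                self.last_age,
                "n/a" if self.last_idle is None else "%.1fs" % self.last_idle,
            )
            raise
```

If the logged idle times cluster above some threshold whenever the reset occurs, that would support the expired-keepalive theory; if they do not, the cause is probably elsewhere.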

@dkarp0
Author

dkarp0 commented Nov 16, 2021

@seizethedave I switched to the regular Consumer class and still got the same issue. So I looked into why the switch to requests seemed to cause the issue and found the solution.

We initialised the mixpanel library when initialising our Flask app, but then called track from a background thread. Initialising the mixpanel library now creates a requests Session which creates a TCP connection that gets shared with the background threads and that didn't play nicely.

The fix was to wrap initialising the mixpanel library in a closure, so it happens on the background thread instead of when we initialise the app. It'd be nice if that wasn't necessary, though. Maybe initialising the requests Session could be lazy and only happen when it's first needed.
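
For readers who want the general shape of that workaround, here is a minimal sketch, assuming the client can be built by a zero-argument factory. `LazyPerThread` and `factory` are hypothetical names for illustration, not the code used in our app and not part of mixpanel-python. The client is created on first use, on whichever thread asks for it, and each thread keeps its own instance.

```python
import threading


class LazyPerThread:
    """Builds one client per thread, on first use, via a caller-supplied factory."""

    def __init__(self, factory):
        self._factory = factory          # hypothetical: e.g. a function that builds the client
        self._local = threading.local()  # per-thread storage

    def get(self):
        client = getattr(self._local, "client", None)
        if client is None:
            client = self._factory()     # runs on the calling thread, not at app start-up
            self._local.client = client
        return client
```

The app would create the `LazyPerThread` holder at start-up and have the background thread call `get()` before tracking, so nothing that owns a connection is constructed until that thread actually needs it.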

@seizethedave
Contributor

This afternoon I found some time to rig up some threaded tests and I did get a "connection reset by peer" when tracking intermittently with ~20 threads after about 20 minutes. About how many threads was your application using? @dkarp0

@pb-dod

pb-dod commented Feb 11, 2022

The fix for this might be to avoid declaring the Mixpanel client globally?

4.9.0 started using a session from the requests library, and session will use a connection pool according to this: psf/requests#4784 (comment)

The connections are probably getting reset after they're sticking around in the pool for a while. Avoiding declaring the client globally should avoid keeping the pool around between requests.

@bazubii

bazubii commented Sep 26, 2022

We're still hitting this issue. Not declaring the Mixpanel client globally doesn't really solve the problem if you're using the BufferedConsumer to reduce the number of API calls. Any other option?

@bazubii

bazubii commented Sep 29, 2022

Here is a proposed fix that has been working for us: #116
