
Internal error in sentry_sdk #2732


Open
manuellazzari-cargoone opened this issue Feb 13, 2024 · 14 comments
Labels
Component: Transport, Improvement, Triaged

Comments

@manuellazzari-cargoone

Environment

SaaS (https://sentry.io/)

Steps to Reproduce

I am randomly getting the following error from some of my Python services. The SDK seems to be sending events to sentry.io correctly; I have no clue whether it is skipping some of them. The error seems to occur more under load.

The service is mainly a Flask app on Python 3.11, run by gunicorn in multiple instances on a Kubernetes cluster on Google Cloud. All dependencies are basically up to date, and I'm using sentry-sdk 1.40.2 and urllib3 2.0.7.

In an attempt to fix it, I tried customizing the HttpTransport used by the Sentry SDK as follows, with no luck.

import sentry_sdk
from sentry_sdk.integrations.flask import FlaskIntegration
from sentry_sdk.integrations.redis import RedisIntegration

try:
    logger.info('Initializing Sentry SDK...')
    sentry_sdk.init(
        ...,
        environment=...,
        traces_sample_rate=...,
        integrations=[FlaskIntegration(), RedisIntegration()],
    )

    import socket
    from sentry_sdk import transport
    from sentry_sdk.transport import HttpTransport
    from urllib3.connection import HTTPConnection

    class KeepAliveHttpTransport(HttpTransport):
        def _get_pool_options(self, ca_certs):
            opts = super()._get_pool_options(ca_certs)
            opts['socket_options'] = HTTPConnection.default_socket_options + [
                (socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1),
                (socket.SOL_TCP, socket.TCP_KEEPIDLE, 45),
                (socket.SOL_TCP, socket.TCP_KEEPINTVL, 10),
                (socket.SOL_TCP, socket.TCP_KEEPCNT, 6),
            ]
            return opts


    logger.info('Forcing keep alive for Sentry SDK HTTP transport...')
    transport.HttpTransport = KeepAliveHttpTransport

except Exception as e:
    logger.info(f'Unable to Start Sentry due to error: {e}')
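
(For context, a minimal sketch of an alternative way to wire this up: if I'm reading the snippet right, transport.HttpTransport is only patched after sentry_sdk.init() has already created its transport, so the running client may still be using the stock one. If I'm not mistaken, init() also accepts a transport option, so the subclass can be passed in directly. The socket option values below are just the ones from the snippet above; the DSN is a placeholder.)

import socket

import sentry_sdk
from sentry_sdk.transport import HttpTransport
from urllib3.connection import HTTPConnection

class KeepAliveHttpTransport(HttpTransport):
    # Same keep-alive socket options as in the snippet above.
    def _get_pool_options(self, ca_certs):
        opts = super()._get_pool_options(ca_certs)
        opts['socket_options'] = HTTPConnection.default_socket_options + [
            (socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1),
            (socket.SOL_TCP, socket.TCP_KEEPIDLE, 45),
            (socket.SOL_TCP, socket.TCP_KEEPINTVL, 10),
            (socket.SOL_TCP, socket.TCP_KEEPCNT, 6),
        ]
        return opts

sentry_sdk.init(
    dsn='...',  # placeholder, as in the original snippet
    transport=KeepAliveHttpTransport,  # passed directly instead of monkeypatching after init
)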

Expected Result

No connection errors

Actual Result

Internal error in sentry_sdk
Traceback (most recent call last): File "/usr/local/lib/python3.11/dist-packages/urllib3/connectionpool.py", line 793, in urlopen response = self._make_request(
^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/dist-packages/urllib3/connectionpool.py", line 537, in _make_request
response = conn.getresponse()
^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/dist-packages/urllib3/connection.py", line 466, in getresponse
httplib_response = super().getresponse()
^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/dist-packages/sentry_sdk/integrations/stdlib.py", line 126, in getresponse
return real_getresponse(self, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.11/http/client.py", line 1386, in getresponse
response.begin()
File "/usr/lib/python3.11/http/client.py", line 325, in begin
version, status, reason = self._read_status()
^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.11/http/client.py", line 294, in _read_status
raise RemoteDisconnected("Remote end closed connection without"
http.client.RemoteDisconnected: Remote end closed connection without response
During handling of the above exception, another exception occurred:
Traceback (most recent call last): File "/usr/local/lib/python3.11/dist-packages/sentry_sdk/transport.py", line 535, in send_envelope_wrapper self._send_envelope(envelope) File "/usr/local/lib/python3.11/dist-packages/sentry_sdk/transport.py", line 434, in _send_envelope self._send_request( File "/usr/local/lib/python3.11/dist-packages/sentry_sdk/transport.py", line 245, in _send_request response = self._pool.request(
^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/dist-packages/urllib3/_request_methods.py", line 144, in request
return self.request_encode_body(
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/dist-packages/urllib3/_request_methods.py", line 279, in request_encode_body
return self.urlopen(method, url, **extra_kw)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/dist-packages/urllib3/poolmanager.py", line 444, in urlopen
response = conn.urlopen(method, u.request_uri, **kw)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/dist-packages/urllib3/connectionpool.py", line 847, in urlopen
retries = retries.increment(
^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/dist-packages/urllib3/util/retry.py", line 470, in increment
raise reraise(type(error), error, _stacktrace)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/dist-packages/urllib3/util/util.py", line 38, in reraise
raise value.with_traceback(tb)
File "/usr/local/lib/python3.11/dist-packages/urllib3/connectionpool.py", line 793, in urlopen
response = self._make_request(
^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/dist-packages/urllib3/connectionpool.py", line 537, in _make_request
response = conn.getresponse()
^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/dist-packages/urllib3/connection.py", line 466, in getresponse
httplib_response = super().getresponse()
^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/dist-packages/sentry_sdk/integrations/stdlib.py", line 126, in getresponse
return real_getresponse(self, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.11/http/client.py", line 1386, in getresponse
response.begin()
File "/usr/lib/python3.11/http/client.py", line 325, in begin
version, status, reason = self._read_status()
^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.11/http/client.py", line 294, in _read_status
raise RemoteDisconnected("Remote end closed connection without"
urllib3.exceptions.ProtocolError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))

Product Area

Issues

Link

No response

DSN

https://1e425c4937e14585ab35335aa4810004@o155318.ingest.sentry.io/4503976860385280

Version

No response


@roggenkemper transferred this issue from getsentry/sentry Feb 13, 2024
@vartec

vartec commented Feb 13, 2024

Likely to be caused by #2386

@manuellazzari-cargoone
Author

Likely to be caused by #2386

@vartec how is that connected? My app does not crash, only the Sentry SDK is complaining...

@kieran-sf

kieran-sf commented Feb 14, 2024

I am getting the same issue as above, but I haven't yet tried the custom HttpTransport.
Basic specs are:
sentry-sdk[flask,django,celery]==1.29.0
python3.9

@sentrivana
Contributor

Hey @manuellazzari-cargoone and @kieran-sf! Thanks for reporting this.

My first suggestion would've been exactly what you already tried, @manuellazzari-cargoone -- the custom socket options from #1198 (comment). Are you seeing at least some improvement?

I'm curious whether spacing out the events sent from the SDK makes a difference. Can you try overriding _send_request in your custom transport, roughly like this?

import time

class KeepAliveHttpTransport(HttpTransport):
    def _send_request(self, *args, **kwargs):
        time.sleep(0.01)
        super()._send_request(*args, **kwargs)
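
(In case it helps with wiring things up, a rough sketch of combining this with the keep-alive transport from the original report into a single class -- KeepAliveHttpTransport below refers to the subclass defined in that report, and the 10 ms delay is just the example value from this comment:)

import time

class ThrottledKeepAliveHttpTransport(KeepAliveHttpTransport):
    # Keep-alive socket options come from the parent class; this just adds
    # a small delay between outgoing requests.
    def _send_request(self, *args, **kwargs):
        time.sleep(0.01)
        super()._send_request(*args, **kwargs)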

@manuellazzari-cargoone
Author

@sentrivana thanks for getting back -- when trying the custom socket settings I wasn't able to see any major differences. Now I'm testing the whole package (custom socket settings and custom send request).

@manuellazzari-cargoone
Author

@sentrivana the issue is still there even after adding the custom _send_request. Qualitatively, it seems to appear with more or less the same frequency (see below an extract from our logs from before and after the changes were deployed).

2024-02-16T15:29:33.048565197Z ERROR - sentry_sdk.errors - Internal error in sentry_sdk
2024-02-16T15:30:55.565954424Z ERROR - sentry_sdk.errors - Internal error in sentry_sdk
2024-02-16T15:33:13.537149008Z ERROR - sentry_sdk.errors - Internal error in sentry_sdk
2024-02-16T15:33:17.914818994Z ERROR - sentry_sdk.errors - Internal error in sentry_sdk
2024-02-16T15:34:47.778557487Z ERROR - sentry_sdk.errors - Internal error in sentry_sdk
2024-02-16T15:34:47.819326396Z ERROR - sentry_sdk.errors - Internal error in sentry_sdk
2024-02-16T15:37:33.774480759Z ERROR - sentry_sdk.errors - Internal error in sentry_sdk
2024-02-16T15:37:34.586534890Z ERROR - sentry_sdk.errors - Internal error in sentry_sdk
2024-02-16T15:41:53.741483570Z ERROR - sentry_sdk.errors - Internal error in sentry_sdk
2024-02-16T15:45:11.560933696Z ERROR - sentry_sdk.errors - Internal error in sentry_sdk
2024-02-16T15:51:10.940568800Z ERROR - sentry_sdk.errors - Internal error in sentry_sdk
2024-02-16T15:51:11.016053254Z ERROR - sentry_sdk.errors - Internal error in sentry_sdk
2024-02-16T15:51:20.146324668Z ERROR - sentry_sdk.errors - Internal error in sentry_sdk
2024-02-16T15:51:28.814144310Z ERROR - sentry_sdk.errors - Internal error in sentry_sdk
2024-02-16T15:58:53.136957655Z ERROR - sentry_sdk.errors - Internal error in sentry_sdk
2024-02-16T15:58:54.053616565Z ERROR - sentry_sdk.errors - Internal error in sentry_sdk
2024-02-16T15:59:16.002942751Z ERROR - sentry_sdk.errors - Internal error in sentry_sdk
2024-02-16T16:03:12.954825518Z Initializing Sentry SDK... <-- changes deployed
...
2024-02-16T16:13:44.713686531Z ERROR - sentry_sdk.errors - Internal error in sentry_sdk
2024-02-16T16:22:30.285920011Z ERROR - sentry_sdk.errors - Internal error in sentry_sdk
2024-02-16T16:25:48.997920085Z ERROR - sentry_sdk.errors - Internal error in sentry_sdk
2024-02-16T16:25:49.389493362Z ERROR - sentry_sdk.errors - Internal error in sentry_sdk
2024-02-16T16:26:00.889221302Z ERROR - sentry_sdk.errors - Internal error in sentry_sdk
2024-02-16T16:38:44.314760846Z ERROR - sentry_sdk.errors - Internal error in sentry_sdk

@sentrivana
Contributor

sentrivana commented Feb 20, 2024

@manuellazzari-cargoone Thanks for the follow-up. It looks like the sleep might've had a tiny effect -- at least from the logs there seem to be fewer occurrences over a comparable time span before and after -- but this could obviously have to do with traffic etc., so I don't think it's conclusive.

Are you seeing any network errors for outgoing requests anywhere else in your system? I'm just trying to rule out general network instability. Alternatively, I'm wondering whether there's anything special about the errors/transactions -- maybe they're unusually big, so sending each one takes long and the server drops the connection? Fiddling around with the socket options might make a difference, too.

@manuellazzari-cargoone
Author

Are you seeing any network errors for outgoing requests anywhere else in your system? I'm just trying to rule out general network instability.

I'm observing this issue with just two services, both running with gunicorn. Also, it's all GCP, so I would rule out consistent network instability.

Alternatively, I'm wondering whether there's anything special about the errors/transactions -- maybe they're unusually big, so sending each one takes long and the server drops the connection?

I'm not sure about the size, but I know for sure we have services that deal with a lot more traffic and much bigger payloads. In particular, one of the affected services handles very little traffic and small payloads and is still experiencing the problem.

@sentrivana
Contributor

Thanks! I think in general we need to reconsider our transport logic and possibly add retries for more cases. (Currently we only retry if we're rate limited.) But at the same time, with this particular error there's no telling if the server actually processed the event and just crapped out at the very end, so it's not clear cut.
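
(For illustration, a rough, untested sketch of what opting into client-side retries could look like today from user code, assuming the pool options dict returned by _get_pool_options is passed straight through to urllib3's PoolManager. Note the caveat above: retrying the envelope POST can in principle duplicate events if the server did process the request before dropping the connection.)

from sentry_sdk.transport import HttpTransport
from urllib3.util.retry import Retry

class RetryingHttpTransport(HttpTransport):
    def _get_pool_options(self, ca_certs):
        opts = super()._get_pool_options(ca_certs)
        # Retry connection resets and read errors even for POST requests
        # (urllib3 does not retry non-idempotent methods by default).
        opts['retries'] = Retry(
            total=3,
            connect=3,
            read=3,
            backoff_factor=0.1,
            allowed_methods=None,  # retry regardless of HTTP method
        )
        return opts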

@szokeasaurusrex added the Triaged label Sep 11, 2024
@antonpirker
Member

antonpirker commented Dec 11, 2024

FYI @manuellazzari-cargoone, we now have an experimental Http2Transport that you can find here: https://github.com/getsentry/sentry-python/blob/master/sentry_sdk/transport.py#L736C11-L736C25

Maybe this solves your problem?
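
(For anyone landing here: as far as I recall, the experimental HTTP/2 transport is opted into via the _experiments option; the exact flag name and the extras name below are from memory and worth double-checking against the current docs.)

import sentry_sdk

# Assumes a recent sentry-sdk 2.x with the HTTP/2 extras installed,
# e.g. pip install "sentry-sdk[http2]" (extras name to be verified).
sentry_sdk.init(
    dsn="...",  # placeholder
    _experiments={
        "transport_http2": True,  # flag name as I recall it; verify against the docs
    },
)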

@andres06-hub

Hi,
What happens to me is very similar, except it occurs when I create a “simple event”. I have my Sentry in a k8s cluster.

Error:

  File "/.venv/lib/python3.12/site-packages/urllib3/connectionpool.py", line 843, in urlopen
    retries = retries.increment(
              ^^^^^^^^^^^^^^^^^^
  File "/.venv/lib/python3.12/site-packages/urllib3/util/retry.py", line 519, in increment
    raise MaxRetryError(_pool, url, reason) from reason  # type: ignore[arg-type]
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='localhost', port=64186): Max retries exceeded with url: /api/1/envelope/ (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fa3f3d3cf50>: Failed to establish a new connection: [Errno 111] Connection refused'))
00:28:10 [ERROR] sentry_sdk.errors: Internal error in sentry_sdk

@szokeasaurusrex
Member

@andres06-hub the error you are experiencing looks different. You are hitting a max retry error. Perhaps there is some issue with the network connection to Sentry?

If not, then please open a separate issue with a full explanation of how to reproduce the behavior you are observing, so that we can help you effectively.
