Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sentry-cli monitors run should not fail to run its command during a Sentry outage #2169

Closed
1 of 8 tasks
alexmv opened this issue Sep 30, 2024 · 2 comments · Fixed by #2171
Closed
1 of 8 tasks

sentry-cli monitors run should not fail to run its command during a Sentry outage #2169

alexmv opened this issue Sep 30, 2024 · 2 comments · Fixed by #2171
Assignees

Comments

@alexmv
Copy link

alexmv commented Sep 30, 2024

CLI Version

2.33.0

Operating System and Architecture

  • macOS (arm64)
  • macOS (x86_64)
  • Linux (i686)
  • Linux (x86_64)
  • Linux (armv7)
  • Linux (aarch64)
  • Windows (i686)
  • Windows (x86_64)

Operating System Version

Ubuntu 22.04

Link to reproduction repository

No response

CLI Command

sentry-cli monitors run -e production --schedule '* * * * *' name-of-cron-job -- /path/to/binary

Exact Reproduction Steps

  1. Have the above be an every-minute cron job that your system expects to run.
  2. Sentry experiences an outage, and begins 502'ing responses

Expected Results

The underlying /path/to/binary would still be run, albeit without Sentry logging it.

A failure of the logging infrastructure should not cause it to skip running the command it knows needs to run on a regular basis. That's just adding another failure mode to the system and making your outage into our outage.

If the initial check-in fails, it should continue, and run the command without monitoring.

Actual Results

The Sentry logging request failed, and it aborted, without running the underlying cron job it was wrapping.

Logs

Log with SENTRY_LOG_LEVEL=debug, lightly redacted:

  DEBUG   2024-09-29 20:10:18.206902804 +00:00 sentry-cli version: 2.33.0, platform: "linux", architecture: "x86_64"
  INFO    2024-09-29 20:10:18.206988039 +00:00 sentry-cli was invoked with the following command line: "sentry-cli" "monitors" "run" "-e" "production" "--schedule" "* * * * *" "name-of-cron-job" "--" "/path/to/binary"
  DEBUG   2024-09-29 20:10:18.207637519 +00:00 Sending envelope:
{}
{"type":"check_in","length":211}
{"check_in_id":"...","monitor_slug":"name-of-cron-job","status":"in_progress","environment":"production","monitor_config":{"schedule":{"type":"crontab","value":"* * * * *"}}}

  DEBUG   2024-09-29 20:10:18.207712594 +00:00 request POST https://o48127.ingest.sentry.io/api/.../envelope/
  DEBUG   2024-09-29 20:10:18.207752486 +00:00 retry number 0, max retries: 0
  DEBUG   2024-09-29 20:10:18.231994537 +00:00 > POST /api/.../envelope/ HTTP/1.1
  DEBUG   2024-09-29 20:10:18.232053701 +00:00 > Host: o48127.ingest.sentry.io
  DEBUG   2024-09-29 20:10:18.232082583 +00:00 > Accept: */*
  DEBUG   2024-09-29 20:10:18.232087613 +00:00 > Connection: TE
  DEBUG   2024-09-29 20:10:18.232100174 +00:00 > TE: gzip
  DEBUG   2024-09-29 20:10:18.232113554 +00:00 > User-Agent: sentry-cli/2.33.0
  DEBUG   2024-09-29 20:10:18.232134836 +00:00 > X-Sentry-Auth: Sentry sentry_key=..., sentry_version=7, sentry_timestamp=1727640618.2076352, sentry_client=sentry-cli/x86_64
  DEBUG   2024-09-29 20:10:18.232148417 +00:00 > Content-Length: 248
  DEBUG   2024-09-29 20:10:20.402872925 +00:00 < HTTP/1.1 502 Bad Gateway
  DEBUG   2024-09-29 20:10:20.402944889 +00:00 < Server: nginx
  DEBUG   2024-09-29 20:10:20.402969231 +00:00 < Date: Sun, 29 Sep 2024 20:10:20 GMT
  DEBUG   2024-09-29 20:10:20.402986362 +00:00 < Content-Type: text/html
  DEBUG   2024-09-29 20:10:20.402994912 +00:00 < Content-Length: 150
  DEBUG   2024-09-29 20:10:20.403011133 +00:00 < Strict-Transport-Security: max-age=31536000; includeSubDomains; preload
  DEBUG   2024-09-29 20:10:20.403034675 +00:00 < Via: 1.1 google
  DEBUG   2024-09-29 20:10:20.403057466 +00:00 < Alt-Svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
  DEBUG   2024-09-29 20:10:20.403095759 +00:00 response status: 502
  DEBUG   2024-09-29 20:10:20.403140531 +00:00 body: <html>
<head><title>502 Bad Gateway</title></head>
<body>
<center><h1>502 Bad Gateway</h1></center>
<hr><center>nginx</center>
</body>
</html>

error: API request failed
  caused by: sentry reported an error: bad gateway (http status: 502)
  INFO    2024-09-29 20:10:20.403377586 +00:00 Skipping update nagger update check
@szokeasaurusrex
Copy link
Member

The underlying /path/to/binary would still be run, albeit without Sentry logging it.

A failure of the logging infrastructure should not cause it to skip running the command it knows needs to run on a regular basis. That's just adding another failure mode to the system and making your outage into our outage.

If the initial check-in fails, it should continue, and run the command without monitoring.

@alexmv First of all, thank you for bringing this bug to our attention by opening this issue. I fully agree with you; with the monitors run command, we should always run the command passed to monitors run, even if an error occurs when sending the monitor check-in to Sentry. This is indeed a serious bug, and I will prioritize getting a fix out as soon as possible.

Sorry for any inconvenience caused. I will let you know as soon as we have released a fix.

@szokeasaurusrex szokeasaurusrex self-assigned this Sep 30, 2024
szokeasaurusrex added a commit that referenced this issue Sep 30, 2024
…fails

Fix a bug where a failure in sending the crons checkin before running the program in the `monitors run` command (e.g. due to a Sentry outage) would cause the program to be prevented from running. Now, we instead log the error and continue running the program.

Also, ensure that a failure in sending the final checkin (after the program has finished) does not cause the program to exit with a different exit code. `sentry-cli monitors run` should propagate the exit code of the program it runs.

Fixes #2169
szokeasaurusrex added a commit that referenced this issue Sep 30, 2024
…fails

Fix a bug where a failure in sending the crons checkin before running the program in the `monitors run` command (e.g. due to a Sentry outage) would cause the program to be prevented from running. Now, we instead log the error and continue running the program.

Also, ensure that a failure in sending the final checkin (after the program has finished) does not cause the program to exit with a different exit code. `sentry-cli monitors run` should propagate the exit code of the program it runs.

Fixes #2169
@alexmv
Copy link
Author

alexmv commented Sep 30, 2024

Thank you for the quick fix and release!

alexmv added a commit to alexmv/zulip that referenced this issue Sep 30, 2024
This version causes `sentry-cli monitors run` to not fail if Sentry is
down (getsentry/sentry-cli#2169).
timabbott pushed a commit to zulip/zulip that referenced this issue Sep 30, 2024
This version causes `sentry-cli monitors run` to not fail if Sentry is
down (getsentry/sentry-cli#2169).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

2 participants