
Conversation

@armenzg (Member) commented Nov 3, 2025

This is how it used to be before #76608. Now that we have fixed the circular relationship (#102470), I believe we should give this approach another attempt.

Fixes [SENTRY-5BX2](https://sentry.sentry.io/issues/6993998110/):

For posterity:
```
OperationalError
canceling statement due to user request

SQL: SELECT "sentry_grouphistory"."id" AS "id" FROM "sentry_grouphistory" WHERE "sentry_grouphistory"."group_id" = %s ORDER BY 1 ASC LIMIT 100
```

The group in question has only been seen 4 times, yet it has 905,729 rows of group history. This is caused by the group's priority flipping back and forth between high and medium; we're investigating the root cause of that.
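To make the failure mode concrete, here is a minimal sketch of the kind of chunked, ordered deletion loop that issues the query above (illustrative only; `delete_group_history_in_chunks` and the import path are assumptions, not the actual Sentry deletion task):

```python
# Minimal sketch, not the actual Sentry deletion task. It shows how a chunked,
# ordered delete ends up issuing
#   SELECT id FROM sentry_grouphistory WHERE group_id = %s ORDER BY id ASC LIMIT 100
# repeatedly, which can time out when a single group has ~900k history rows and
# there is no (group_id, id) index to satisfy the ORDER BY cheaply.
from sentry.models.grouphistory import GroupHistory  # assumed import path


def delete_group_history_in_chunks(group_id: int, chunk_size: int = 100) -> None:
    """Hypothetical helper: delete GroupHistory rows for one group in chunks."""
    while True:
        queryset = GroupHistory.objects.filter(group_id=group_id)
        chunk_ids = list(queryset.order_by("id").values_list("id", flat=True)[:chunk_size])
        if not chunk_ids:
            break
        GroupHistory.objects.filter(id__in=chunk_ids).delete()
```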
@armenzg self-assigned this Nov 3, 2025
@github-actions bot added the Scope: Backend label Nov 3, 2025
```python
if options.get("deletions.group-history.use-bulk-deletion"):
    manager.register(models.GroupHistory, BulkModelDeletionTask)
else:
    manager.register(models.GroupHistory, defaults.GroupHistoryDeletionTask)
```
@armenzg (Member, Author) commented:

If it works, we can remove the custom class.
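For context, a rough sketch of how such a rollout option might be declared in Sentry's option registry (the default and flags here are assumptions, not the actual definition):

```python
# Hypothetical registration of the rollout option, for illustration only.
# The real default and flags may differ; this just shows the gating idea:
# once the bulk path is proven, the option and the custom
# GroupHistoryDeletionTask class can both be deleted.
from sentry.options import FLAG_AUTOMATOR_MODIFIABLE, register

register(
    "deletions.group-history.use-bulk-deletion",
    default=False,
    flags=FLAG_AUTOMATOR_MODIFIABLE,
)
```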


codecov bot commented Nov 3, 2025

Codecov Report

❌ Patch coverage is 80.00000% with 1 line in your changes missing coverage. Please review.
✅ All tests successful. No failed tests found.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| src/sentry/deletions/__init__.py | 75.00% | 1 Missing ⚠️ |

Additional details and impacted files

```
@@             Coverage Diff             @@
##           master   #102580      +/-   ##
===========================================
+ Coverage   80.88%    80.89%   +0.01%     
===========================================
  Files        8861      8865       +4     
  Lines      390436    390860     +424     
  Branches    24832     24832              
===========================================
+ Hits       315786    316195     +409     
- Misses      74285     74300      +15     
  Partials      365       365              
```

@markstory (Member) left a comment:

Worth a shot now that the recursion issue is addressed.

@wedamija (Member) left a comment:

If the group has this many rows, there's a good chance that the query you have will still time out, since there is no index on group_id, id. I don't mind if you want to try, but I think the solution here would be to add that index if this doesn't work. You could even just do it temporarily until the deletions finish.
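To illustrate the fallback described above, here is a minimal Django migration sketch that adds the temporary composite index (the app label, dependency, and index name are assumptions, not an actual Sentry migration):

```python
# Hypothetical migration sketch: adds a composite index so the chunked
# "WHERE group_id = %s ORDER BY id LIMIT 100" query can be served from the
# index. It could be dropped again once the backlog of deletions is done.
from django.db import migrations, models


class Migration(migrations.Migration):
    dependencies = [
        ("sentry", "0001_initial"),  # placeholder dependency, not the real one
    ]

    operations = [
        migrations.AddIndex(
            model_name="grouphistory",
            index=models.Index(fields=["group", "id"], name="grouphistory_group_id_idx"),
        ),
    ]
```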

@wedamija (Member) commented Nov 3, 2025

> If the group has this many rows, there's a good chance that the query you have will still time out, since there is no index on group_id, id. I don't mind if you want to try, but I think the solution here would be to add that index if this doesn't work. You could even just do it temporarily until the deletions finish.

Actually, how about this: just remove the ordering from your query? You don't care about the order here; you just want to delete rows.

```python
chunk_ids = list(queryset.order_by("id").values_list("id", flat=True)[:100])
```

to

```python
chunk_ids = list(queryset.values_list("id", flat=True)[:100])
```
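For anyone verifying the change, a throwaway sketch of how to compare the SQL Django generates for the two variants (the import path and the example `group_id` are assumptions):

```python
# Throwaway sketch: print the SQL Django compiles for both variants to confirm
# the ORDER BY disappears when the ordering is removed.
from sentry.models.grouphistory import GroupHistory  # assumed import path

queryset = GroupHistory.objects.filter(group_id=12345)  # example group_id

ordered = queryset.order_by("id").values_list("id", flat=True)[:100]
unordered = queryset.values_list("id", flat=True)[:100]

print(ordered.query)    # ... WHERE "group_id" = 12345 ORDER BY "id" ASC LIMIT 100
print(unordered.query)  # ... WHERE "group_id" = 12345 LIMIT 100
```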

@armenzg closed this Nov 3, 2025
@armenzg reopened this Nov 3, 2025
@armenzg marked this pull request as ready for review November 3, 2025 18:17
@wedamija (Member) commented Nov 3, 2025

> If the group has this many rows, there's a good chance that the query you have will still time out, since there is no index on group_id, id. I don't mind if you want to try, but I think the solution here would be to add that index if this doesn't work. You could even just do it temporarily until the deletions finish.
>
> Actually, how about this: just remove the ordering from your query? You don't care about the order here; you just want to delete rows.
>
> `chunk_ids = list(queryset.order_by("id").values_list("id", flat=True)[:100])`
>
> to
>
> `chunk_ids = list(queryset.values_list("id", flat=True)[:100])`

Sorry, I was looking at the Sentry error that you linked here, and this feedback was to fix the error seen there. AFAICT, the BulkModelDeletionTask already does this, so LGTM.

@armenzg (Member, Author) commented Nov 3, 2025

Related code:

```python
def bulk_delete_objects(
    model: type[Model],
    limit: int = 10000,
    transaction_id: str | None = None,
    logger: logging.Logger | None = None,
    **filters: Any,
) -> bool:
    qs = model.objects.filter(**filters).values_list("id")[:limit]
    delete_query = DeleteQuery(model)
    delete_query.add_q(Q(id__in=qs))
    n = delete_query.get_compiler(router.db_for_write(model)).execute_sql(ROW_COUNT)
    # `n` will be `None` if `qs` is an empty queryset
    has_more = n is not None and n > 0
    if has_more and logger is not None and _leaf_re.search(model.__name__) is None:
        logger.info(
            "object.delete.bulk_executed",
            extra=dict(filters, model=model.__name__, transaction_id=transaction_id),
        )
    return has_more
```
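For illustration, a deletion loop could drive that helper until no rows remain, roughly like this (a sketch only; the import paths and the `purge_group_history` helper are assumptions, not the actual task code):

```python
# Illustrative loop, not the actual Sentry deletion task: call the helper until
# it reports that no more rows matched. Each call deletes up to `limit` rows via
# a single DELETE ... WHERE id IN (SELECT id ... LIMIT %s) statement, with no
# ORDER BY, so it does not depend on a (group_id, id) index.
from sentry.deletions.base import bulk_delete_objects  # assumed import path
from sentry.models.grouphistory import GroupHistory  # assumed import path


def purge_group_history(group_id: int) -> None:
    """Hypothetical helper: repeatedly bulk-delete GroupHistory rows for a group."""
    while bulk_delete_objects(GroupHistory, limit=10000, group_id=group_id):
        pass
```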

@armenzg enabled auto-merge (squash) November 3, 2025 18:19
@armenzg merged commit ac997e9 into master Nov 3, 2025
127 checks passed
@armenzg deleted the fix/operational_error/cleanup/armenzg branch November 3, 2025 18:39
armenzg added a commit that referenced this pull request Nov 4, 2025
This is part of the required steps to remove a column from Sentry:
https://develop.sentry.dev/backend/application-domains/database-migrations/#deleting-columns.

This is a follow-up to #102474.

This fixes [SENTRY-5BX8](https://sentry.sentry.io/issues/6994851757/) and [SENTRY-5BXW](https://sentry.sentry.io/issues/6995699817/) which are happening since we switched to the bulk deletion model (#102580).
armenzg added a commit that referenced this pull request Nov 4, 2025
shashjar pushed a commit that referenced this pull request Nov 4, 2025
shashjar pushed a commit that referenced this pull request Nov 4, 2025
armenzg added a commit that referenced this pull request Nov 4, 2025
The bulk deletion added in #102580 with #102672 is working well.
armenzg added a commit that referenced this pull request Nov 5, 2025
armenzg added a commit that referenced this pull request Nov 5, 2025
priscilawebdev pushed a commit that referenced this pull request Nov 6, 2025
priscilawebdev pushed a commit that referenced this pull request Nov 6, 2025
priscilawebdev pushed a commit that referenced this pull request Nov 6, 2025
