Postgres alpine image introduces unique constraint index errors #3167
For users that are attempting to upgrade from the alpine image to the latest 24.6.0 or above, all indexes involving columns of type text, varchar, char, and citext should be reindexed before the instance is put into production (source). For users that are attempting to fix duplicate rows of data, the process may be pretty manual 🥲. It will likely involve removing the duplicate rows of data, along with models with foreign key relations. |
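For reference, the affected indexes can be listed with a catalog query along these lines. This is only a sketch (not part of the original comment); it assumes the Sentry tables live in the public schema.

```sql
-- Sketch: list indexes that include text/varchar/char/citext columns,
-- i.e. the indexes that should be rebuilt after switching base images.
SELECT DISTINCT t.relname AS table_name, i.relname AS index_name
FROM pg_index ix
JOIN pg_class i ON i.oid = ix.indexrelid
JOIN pg_class t ON t.oid = ix.indrelid
JOIN pg_namespace n ON n.oid = t.relnamespace
JOIN pg_attribute a ON a.attrelid = t.oid AND a.attnum = ANY(ix.indkey)
JOIN pg_type ty ON ty.oid = a.atttypid
WHERE n.nspname = 'public'
  AND ty.typname IN ('text', 'varchar', 'bpchar', 'citext')
ORDER BY table_name, index_name;
```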
I had the same problems and fixed them in the database with the following commands!
In case there are more duplicates, the reindex will fail and you will have to delete those duplicates in a similar way. |
@hubertdeng123 Aren't you going to provide an official version of the SQL upgrade script? |
@candux Thanks for the input! @MingNiu We're trying to figure out the best way to go about this. Different folks will want to address the potential duplicate environments/releases/etc. in their self-hosted instance in different ways. Merging data in postgres is extremely dangerous, and it's not something we think we should provide as a general solution. Some people may want to keep the latest data, while others would like to keep only the data from before the corruption occurred. |
@hubertdeng123 Ok. If I only want to keep data such as projects, members, configurations, etc., and clearing the issue records is acceptable, can you provide the official upgrade script? We are not familiar with the table structure, and it is too difficult for us to fix the data manually. |
We will not be providing an official upgrade script at this time. What kind of duplicates are you seeing? That will help determine what rows to delete. |
We're also affected by this issue: environments are now duplicated and workers cannot process tasks because of:
See:
|
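The error output referenced above was not captured in this thread. As an illustration only, duplicate environments can usually be spotted with a grouping query like the one below; the table and column names (sentry_environment, organization_id, name) are assumptions based on Sentry's schema rather than something taken from the comment.

```sql
-- Sketch: find environments that collide on (organization_id, name),
-- which will violate the unique constraint once the index is rebuilt.
SELECT organization_id, name, COUNT(*) AS copies, array_agg(id) AS ids
FROM sentry_environment
GROUP BY organization_id, name
HAVING COUNT(*) > 1
ORDER BY organization_id, name;
```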
Since it looks like most folks are experiencing issues with environments/releases, I wanted to post a manual guide to attempt to fix this data corruption.

For users looking to delete duplicate data resulting from data corruption on 24.4.2, 24.5.0, and 24.5.1: we'd highly recommend removing the duplicates and then upgrading to 24.6.0+ as soon as possible, to avoid using the postgres alpine image, which may cause pain points when upgrading in the future. Reindexing a database that contains corrupt data will result in errors about duplicates, so we'll need to delete the duplicates before attempting to fix the indexes.
Cleaning up environment:
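The commands for this step were not captured above. As a rough sketch only (table and column names are assumptions, and this is not the author's script): one common pattern is to keep the row with the lowest id for each duplicate pair and delete the rest. Rows in related tables that reference the deleted ids may need to be repointed or removed first.

```sql
-- Sketch: for each duplicate (organization_id, name) pair, keep the row
-- with the lowest id and delete the newer copies.
DELETE FROM sentry_environment e
USING sentry_environment keep
WHERE keep.organization_id = e.organization_id
  AND keep.name = e.name
  AND keep.id < e.id;
```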
Cleaning up releases (credit to @candux):
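Again, the original commands are not reproduced here. A comparable sketch for releases, assuming the unique constraint is on (organization_id, version), is below; note that foreign keys such as first_release_id on other tables (discussed later in this thread) have to be repointed to the surviving row before the delete can succeed.

```sql
-- Sketch: keep the oldest release per (organization_id, version) and
-- delete the newer duplicates. Repoint any FK references to the kept row first.
DELETE FROM sentry_release r
USING sentry_release keep
WHERE keep.organization_id = r.organization_id
  AND keep.version = r.version
  AND keep.id < r.id;
```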
For those users with clean data upgrading from versions 24.4.2, 24.5.0, or 24.5.1 to 24.6.0+: we'll need to reindex the entire database to ensure data corruption will not occur in the future. According to the postgres docs, reindex is safe to be used in all cases. Steps to reindex:
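The exact steps were not captured above. A minimal sketch, assuming the default self-hosted setup where both the database and the user are named postgres (adjust to your environment):

```sql
-- Run inside the postgres container, e.g. via:
--   docker compose exec postgres psql -U postgres
-- The database name below is an assumption; change it if yours differs.
REINDEX DATABASE postgres;
```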
|
@hubertdeng123 Thank you for your instructions. I would like to clarify one point: when executing the
Is this related to this problem? What can be done about it? Here is what this table looks like:
|
We installed a fresh sentry
|
@feshchenkod It looks like you forgot to update the versions in your .env.custom file. |
Got it, I overlooked that part. Thanks a lot for your help! |
@JiffsMaverick It looks like you have duplicate entries in sentry_commitauthor. |
@hubertdeng123 These records are not complete duplicates; only two fields, external_id and organization_id, are duplicated. For example:
Here is the code responsible for this index:

    class Meta:
        app_label = "sentry"
        db_table = "sentry_commitauthor"
        unique_together = (("organization_id", "email"), ("organization_id", "external_id"))

Is deleting records in this table safe? How would you recommend choosing which record to delete? Or can I delete anything and it will restore itself later? |
Hmmm, if that is the case, could you try changing the external_id? If these are truly not duplicates, maybe what happened is that the broken indexes resulted in the external_id not being incremented properly.
Deleting records is never truly safe; it depends on whether the data in question is something you are okay with losing. Which record to delete would be up to you, but let's first try changing the external_id. |
@hubertdeng123 It seems I gave a not very good example which misled you. The Here is a better example:
|
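The example table above was not captured in this thread. As an illustration only (not taken from the comment), rows in sentry_commitauthor that collide on (organization_id, external_id) can be listed like this, including the email column to help judge which row to keep:

```sql
-- Sketch: list commit authors that collide on (organization_id, external_id).
SELECT organization_id, external_id, COUNT(*) AS copies,
       array_agg(id) AS ids, array_agg(email) AS emails
FROM sentry_commitauthor
GROUP BY organization_id, external_id
HAVING COUNT(*) > 1;
```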
I got ChatGPT to write me a deduplication SQL template and I modified it to include updating all FK relations to release_id (and first_release_id), but I'm still seeing "sentry.models.release.Release.MultipleObjectsReturned: get() returned more than one Release -- it returned 2!" sentry_fix_duplicates.txt |
In addition, we had to run:

    DELETE FROM sentry_activity
    WHERE group_id IN (
        SELECT id FROM sentry_groupedmessage
        WHERE first_release_id IN (
            SELECT id FROM public.sentry_release ou
            WHERE (SELECT count(*) FROM public.sentry_release inr
                   WHERE inr.organization_id = ou.organization_id
                     AND inr.version = ou.version) > 1
        )
    );
|
@hubertdeng123 Seems like the queries are not finding all the duplicates. It only deleted two, but we have way more. To see the duplicates:

    SELECT a.*
    FROM sentry_grouprelease a
    JOIN (SELECT group_id, release_id, environment, COUNT(*)
          FROM sentry_grouprelease
          GROUP BY group_id, release_id, environment
          HAVING COUNT(*) > 1) b
      ON a.group_id = b.group_id
     AND a.release_id = b.release_id
     AND a.environment = b.environment
    ORDER BY a.group_id;
|
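Building on the detection query above, a deletion sketch that keeps the lowest-id row per (group_id, release_id, environment) group might look like the following. This is an assumption-based example rather than an official fix, so back up the table before running anything like it.

```sql
-- Sketch: delete the newer duplicates, keeping the row with the lowest id
-- for each (group_id, release_id, environment) combination.
-- Note: rows with a NULL environment are not matched by this equality join.
DELETE FROM sentry_grouprelease g
USING sentry_grouprelease keep
WHERE keep.group_id = g.group_id
  AND keep.release_id = g.release_id
  AND keep.environment = g.environment
  AND keep.id < g.id;
```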
Between self-hosted releases 24.4.2 and 24.5.1, we've noticed that users have been experiencing issues with duplicate environments/releases. This is likely due to a mistake on our side in introducing the alpine version of postgres here. This has since been reverted to the debian image, but there are likely quite a few of you experiencing issues.
We are creating this issue to document user issues and potential solutions. From here on out, we will only be using the debian image of postgres. We have since deleted self-hosted versions 24.4.2, 24.5.0, and 24.5.1.
#3161
#3166