
Postgres alpine image introduces unique constraint index errors #3167

Open · hubertdeng123 opened this issue Jun 25, 2024 · 19 comments

hubertdeng123 (Member) commented Jun 25, 2024

Between self-hosted releases 24.4.2 and 24.5.1, we've noticed that users have been experiencing issues with duplicate environments/releases. This is likely due to a mistake on our side: we introduced the alpine version of postgres here. This has since been reverted back to the debian image, but there are likely quite a few of you experiencing issues.

We are creating this issue to document user issues and potential solutions. From here on out, we will only be using the debian image of postgres. We have since deleted self-hosted releases 24.4.2, 24.5.0, and 24.5.1.

#3161
#3166

@hubertdeng123 (Member, Author)

For users attempting to upgrade from the alpine image to 24.6.0 or above, all indexes involving columns of type text, varchar, char, and citext should be reindexed before the instance is put into production (source).

For users attempting to fix duplicate rows of data, the process may be pretty manual 🥲. It will likely involve removing the duplicate rows, along with rows in models that hold foreign key relations to them.
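
As a rough sketch (not an official script, and assuming the Sentry tables live in the default public schema), the indexes in question can be listed from the catalogs like this:

SELECT DISTINCT i.indrelid::regclass   AS table_name,
                i.indexrelid::regclass AS index_name
FROM pg_index i
JOIN pg_class c ON c.oid = i.indrelid
JOIN pg_attribute a ON a.attrelid = i.indrelid
                   AND a.attnum = ANY (i.indkey)   -- columns referenced by the index; expression entries are skipped
WHERE c.relnamespace = 'public'::regnamespace
  AND (a.atttypid IN ('text'::regtype, 'varchar'::regtype, 'bpchar'::regtype)  -- bpchar = char(n)
       OR format_type(a.atttypid, a.atttypmod) = 'citext')
ORDER BY 1, 2;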

candux commented Jun 26, 2024

I had the same problems and fixed them in the database with the following commands!
THIS is dangerous and will delete data! Use at your own risk! Make a backup beforehand.

ALTER TABLE ONLY public.sentry_commitfilechange
    DROP CONSTRAINT sentry_commitfilechange_commit_id_4c6f7ec25af34227_uniq;
DELETE from public.sentry_commitfilechange ou where (SELECT count(*) from public.sentry_commitfilechange inr where inr.commit_id = ou.commit_id and inr.filename = ou.filename) > 1;
ALTER TABLE ONLY public.sentry_commitfilechange
    ADD CONSTRAINT sentry_commitfilechange_commit_id_4c6f7ec25af34227_uniq UNIQUE (commit_id, filename);
    

DROP INDEX sentry_release_organization_id_f0a7ec9ba96de76;
DELETE from sentry_release_project where release_id in (SELECT id from public.sentry_release ou where (SELECT count(*) from public.sentry_release inr where inr.organization_id = ou.organization_id and inr.version = ou.version) > 1);

DELETE from sentry_releaseprojectenvironment where release_id in (SELECT id from public.sentry_release ou where (SELECT count(*) from public.sentry_release inr where inr.organization_id = ou.organization_id and inr.version = ou.version) > 1);

DELETE from sentry_grouphash where group_id in (SELECT id from sentry_groupedmessage where first_release_id in (SELECT id from public.sentry_release ou where (SELECT count(*) from public.sentry_release inr where inr.organization_id = ou.organization_id and inr.version = ou.version) > 1));

DELETE from sentry_groupedmessage where first_release_id in (SELECT id from public.sentry_release ou where (SELECT count(*) from public.sentry_release inr where inr.organization_id = ou.organization_id and inr.version = ou.version) > 1);

DELETE from public.sentry_release ou where (SELECT count(*) from public.sentry_release inr where inr.organization_id = ou.organization_id and inr.version = ou.version) > 1;

CREATE UNIQUE INDEX sentry_release_organization_id_f0a7ec9ba96de76 ON public.sentry_release USING btree (organization_id, version);

REINDEX DATABASE postgres;
REINDEX SYSTEM postgres;

If there are more duplicates, the reindex will fail and you will have to delete those duplicates in a similar way.
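
If a later REINDEX fails on an index you don't recognize ("could not create unique index <name>"), the owning table and key columns can be looked up by name; here shown with one of the index names above as the example:

SELECT i.indrelid::regclass          AS table_name,
       pg_get_indexdef(i.indexrelid) AS index_def
FROM pg_index i
WHERE i.indexrelid = 'sentry_commitfilechange_commit_id_4c6f7ec25af34227_uniq'::regclass;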

MingNiu commented Jun 27, 2024

@hubertdeng123 Aren't you going to provide an official version of the SQL upgrade script?

@hubertdeng123 (Member, Author)

@candux Thanks for the input!

@MingNiu We're trying to figure out the best way to go about this. Different folks will have different ways they'd like to address the potential duplicate environments/releases/etc. in their self-hosted instance. Merging data in postgres is extremely dangerous, and it's not something we think should be done by us as a general solution. Some people may want to keep the latest data, while others would prefer to keep only the data from before the corruption occurred.
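
For illustration only (not an endorsed script): a common pattern is to keep exactly one row per duplicate group and delete the rest, assuming ids increase with creation time. sentry_environment(organization_id, name) is used as the example key here; any rows in other tables that still reference the deleted ids would need to be removed or re-pointed, as in the scripts elsewhere in this thread.

-- Keeps the lowest id (oldest row); flip the comparison to a.id < b.id to keep the newest instead.
DELETE FROM sentry_environment a
USING sentry_environment b
WHERE a.organization_id = b.organization_id
  AND a.name = b.name
  AND a.id > b.id;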

MingNiu commented Jun 28, 2024

@hubertdeng123 ok.

If I only want to keep data such as projects, members, and configurations, and clearing the issue records is acceptable, can you provide an official upgrade script?

We are not familiar with the table structure and it is too difficult for us to manually fix the data.

@hubertdeng123 (Member, Author)

We will not be providing an official upgrade script at this time. What kind of duplicates are you seeing? That will help determine what rows to delete.
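
As a sketch, a quick check for the two cases reported most often in this thread (duplicate environments and releases) looks something like this; it should return zero rows on a healthy database:

SELECT 'sentry_environment' AS tbl, organization_id, name AS dup_key, COUNT(*)
FROM sentry_environment
GROUP BY organization_id, name
HAVING COUNT(*) > 1
UNION ALL
SELECT 'sentry_release', organization_id, version, COUNT(*)
FROM sentry_release
GROUP BY organization_id, version
HAVING COUNT(*) > 1;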

@shkarface

We're also affected by this issue; environments are now duplicated and workers cannot process tasks because of:

worker-1  | Traceback (most recent call last):
worker-1  |   File "/.venv/lib/python3.11/site-packages/celery/app/trace.py", line 477, in trace_task
worker-1  |     R = retval = fun(*args, **kwargs)
worker-1  |                  ^^^^^^^^^^^^^^^^^^^^
worker-1  |   File "/.venv/lib/python3.11/site-packages/sentry_sdk/utils.py", line 1711, in runner
worker-1  |     return sentry_patched_function(*args, **kwargs)
worker-1  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
worker-1  |   File "/.venv/lib/python3.11/site-packages/sentry_sdk/integrations/celery/__init__.py", line 380, in _inner
worker-1  |     reraise(*exc_info)
worker-1  |   File "/.venv/lib/python3.11/site-packages/sentry_sdk/utils.py", line 1640, in reraise
worker-1  |     raise value
worker-1  |   File "/.venv/lib/python3.11/site-packages/sentry_sdk/integrations/celery/__init__.py", line 375, in _inner
worker-1  |     return f(*args, **kwargs)
worker-1  |            ^^^^^^^^^^^^^^^^^^
worker-1  |   File "/.venv/lib/python3.11/site-packages/celery/app/trace.py", line 760, in __protected_call__
worker-1  |     return self.run(*args, **kwargs)
worker-1  |            ^^^^^^^^^^^^^^^^^^^^^^^^^
worker-1  |   File "/usr/src/sentry/src/sentry/silo/base.py", line 146, in override
worker-1  |     return original_method(*args, **kwargs)
worker-1  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
worker-1  |   File "/usr/src/sentry/src/sentry/tasks/base.py", line 138, in _wrapped
worker-1  |     result = func(*args, **kwargs)
worker-1  |              ^^^^^^^^^^^^^^^^^^^^^
worker-1  |   File "/usr/src/sentry/src/sentry/tasks/store.py", line 741, in save_event_transaction
worker-1  |     _do_save_event(cache_key, data, start_time, event_id, project_id, **kwargs)
worker-1  |   File "/usr/src/sentry/src/sentry/tasks/store.py", line 617, in _do_save_event
worker-1  |     manager.save(
worker-1  |   File "/.venv/lib/python3.11/site-packages/sentry_sdk/tracing_utils.py", line 669, in func_with_tracing
worker-1  |     return func(*args, **kwargs)
worker-1  |            ^^^^^^^^^^^^^^^^^^^^^
worker-1  |   File "/usr/src/sentry/src/sentry/event_manager.py", line 471, in save
worker-1  |     jobs = save_transaction_events([job], projects)
worker-1  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
worker-1  |   File "/.venv/lib/python3.11/site-packages/sentry_sdk/tracing_utils.py", line 669, in func_with_tracing
worker-1  |     return func(*args, **kwargs)
worker-1  |            ^^^^^^^^^^^^^^^^^^^^^
worker-1  |   File "/usr/src/sentry/src/sentry/event_manager.py", line 3034, in save_transaction_events
worker-1  |     _get_or_create_environment_many(jobs, projects)
worker-1  |   File "/.venv/lib/python3.11/site-packages/sentry_sdk/tracing_utils.py", line 669, in func_with_tracing
worker-1  |     return func(*args, **kwargs)
worker-1  |            ^^^^^^^^^^^^^^^^^^^^^
worker-1  |   File "/usr/src/sentry/src/sentry/event_manager.py", line 941, in _get_or_create_environment_many
worker-1  |     job["environment"] = Environment.get_or_create(
worker-1  |                          ^^^^^^^^^^^^^^^^^^^^^^^^^^
worker-1  |   File "/usr/src/sentry/src/sentry/models/environment.py", line 98, in get_or_create
worker-1  |     env = cls.objects.get_or_create(name=name, organization_id=project.organization_id)[
worker-1  |           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
worker-1  |   File "/usr/src/sentry/src/sentry/silo/base.py", line 146, in override
worker-1  |     return original_method(*args, **kwargs)
worker-1  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
worker-1  |   File "/.venv/lib/python3.11/site-packages/django/db/models/manager.py", line 87, in manager_method
worker-1  |     return getattr(self.get_queryset(), name)(*args, **kwargs)
worker-1  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
worker-1  |   File "/usr/src/sentry/src/sentry/silo/base.py", line 146, in override
worker-1  |     return original_method(*args, **kwargs)
worker-1  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
worker-1  |   File "/.venv/lib/python3.11/site-packages/django/db/models/query.py", line 948, in get_or_create
worker-1  |     return self.get(**kwargs), False
worker-1  |            ^^^^^^^^^^^^^^^^^^
worker-1  |   File "/.venv/lib/python3.11/site-packages/django/db/models/query.py", line 652, in get
worker-1  |     raise self.model.MultipleObjectsReturned(
worker-1  | sentry.models.environment.Environment.MultipleObjectsReturned: get() returned more than one Environment -- it returned 2!
worker-1  | 22:41:45 [ERROR] celery.app.trace: Task sentry.tasks.store.save_event_transaction[e8aaa026-f203-4119-bf1a-917a74e70462] raised unexpected: MultipleObjectsReturned('get() returned more than one Environment -- it returned 2!') (data={'hostname': 'celery@ee7e0490a110', 'id': 'e8aaa026-f203-4119-bf1a-917a74e70462', 'name': 'sentry.tasks.store.save_event_transaction', 'exc': "MultipleObjectsReturned('get() returned more than one Environment -- it returned 2!')", 'traceback': 'Traceback (most recent call last):\n  File "/.venv/lib/python3.11/site-packages/celery/app/trace.py", line 477, in trace_task\n    R = retval = fun(*args, **kwargs)\n                 ^^^^^^^^^^^^^^^^^^^^\n  File "/.venv/lib/python3.11/site-packages/sentry_sdk/utils.py", line 1711, in runner\n    return sentry_patched_function(*args, **kwargs)\n           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n  File "/.venv/lib/python3.11/site-packages/sentry_sdk/integrations/celery/__init__.py", line 380, in _inner\n    reraise(*exc_info)\n  File "/.venv/lib/python3.11/site-packages/sentry_sdk/utils.py", line 1640, in reraise\n    raise value\n  File "/.venv/lib/python3.11/site-packages/sentry_sdk/integrations/celery/__init__.py", line 375, in _inner\n    return f(*args, **kwargs)\n           ^^^^^^^^^^^^^^^^^^\n  File "/.venv/lib/python3.11/site-packages/celery/app/trace.py", line 760, in __protected_call__\n    return self.run(*args, **kwargs)\n           ^^^^^^^^^^^^^^^^^^^^^^^^^\n  File "/usr/src/sentry/src/sentry/silo/base.py", line 146, in override\n    return original_method(*args, **kwargs)\n           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n  File "/usr/src/sentry/src/sentry/tasks/base.py", line 138, in _wrapped\n    result = func(*args, **kwargs)\n             ^^^^^^^^^^^^^^^^^^^^^\n  File "/usr/src/sentry/src/sentry/tasks/store.py", line 741, in save_event_transaction\n    _do_save_event(cache_key, data, start_time, event_id, project_id, **kwargs)\n  File "/usr/src/sentry/src/sentry/tasks/store.py", line 617, in _do_save_event\n    manager.save(\n  File "/.venv/lib/python3.11/site-packages/sentry_sdk/tracing_utils.py", line 669, in func_with_tracing\n    return func(*args, **kwargs)\n           ^^^^^^^^^^^^^^^^^^^^^\n  File "/usr/src/sentry/src/sentry/event_manager.py", line 471, in save\n    jobs = save_transaction_events([job], projects)\n           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n  File "/.venv/lib/python3.11/site-packages/sentry_sdk/tracing_utils.py", line 669, in func_with_tracing\n    return func(*args, **kwargs)\n           ^^^^^^^^^^^^^^^^^^^^^\n  File "/usr/src/sentry/src/sentry/event_manager.py", line 3034, in save_transaction_events\n    _get_or_create_environment_many(jobs, projects)\n  File "/.venv/lib/python3.11/site-packages/sentry_sdk/tracing_utils.py", line 669, in func_with_tracing\n    return func(*args, **kwargs)\n           ^^^^^^^^^^^^^^^^^^^^^\n  File "/usr/src/sentry/src/sentry/event_manager.py", line 941, in _get_or_create_environment_many\n    job["environment"] = Environment.get_or_create(\n                         ^^^^^^^^^^^^^^^^^^^^^^^^^^\n  File "/usr/src/sentry/src/sentry/models/environment.py", line 98, in get_or_create\n    env = cls.objects.get_or_create(name=name, organization_id=project.organization_id)[\n          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n  File "/usr/src/sentry/src/sentry/silo/base.py", line 146, in override\n    return original_method(*args, **kwargs)\n           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n  File 
"/.venv/lib/python3.11/site-packages/django/db/models/manager.py", line 87, in manager_method\n    return getattr(self.get_queryset(), name)(*args, **kwargs)\n           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n  File "/usr/src/sentry/src/sentry/silo/base.py", line 146, in override\n    return original_method(*args, **kwargs)\n           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n  File "/.venv/lib/python3.11/site-packages/django/db/models/query.py", line 948, in get_or_create\n    return self.get(**kwargs), False\n           ^^^^^^^^^^^^^^^^^^\n  File "/.venv/lib/python3.11/site-packages/django/db/models/query.py", line 652, in get\n    raise self.model.MultipleObjectsReturned(\nsentry.models.environment.Environment.MultipleObjectsReturned: get() returned more than one Environment -- it returned 2!\n', 'args': '()', 'kwargs': "{'cache_key': 'e:7dacbb33dd984f41ba98253510335ef1:60', 'data': None, 'start_time': 1720132852.0, 'event_id': '7dacbb33dd984f41ba98253510335ef1', 'project_id': 60, '__start_time': 1720132852.387123}", 'description': 'raised unexpected', 'internal': False})

See:

worker-1  | sentry.models.environment.Environment.MultipleObjectsReturned: get() returned more than one Environment -- it returned 2!

hubertdeng123 (Member, Author) commented Jul 11, 2024

Since it looks like most folks are experiencing issues with environments/releases, I wanted to post a manual guide to attempt to fix this data corruption.
This is dangerous and can result in data loss. Please use at your own discretion!

For users looking to delete duplicate data as a result of data corruption on 24.4.2, 24.5.0, and 24.5.1:

We'd highly recommend removing the duplicates, then upgrading to 24.6.0+ as soon as possible to avoid the postgres alpine image, which may cause pain points when upgrading in the future. Reindexing a database that contains corrupt data will fail with duplicate-key errors, so we'll need to delete the duplicates before attempting to fix the indexes.

  1. docker compose up -d postgres
  2. docker compose exec postgres bash
  3. psql -U postgres
  4. Run the commands below

Cleaning up environment:

DROP INDEX sentry_environment_organization_id_name_95a37dc7_uniq;
DELETE from sentry_deploy WHERE environment_id in (SELECT id from public.sentry_environment ou where (SELECT count(*) from public.sentry_environment inr where inr.organization_id = ou.organization_id and inr.name = ou.name) > 1);
DELETE from sentry_latestrelease WHERE environment_id in (SELECT id from public.sentry_environment ou where (SELECT count(*) from public.sentry_environment inr where inr.organization_id = ou.organization_id and inr.name = ou.name) > 1);
DELETE from sentry_rule WHERE environment_id in (SELECT id from public.sentry_environment ou where (SELECT count(*) from public.sentry_environment inr where inr.organization_id = ou.organization_id and inr.name = ou.name) > 1);
DELETE from sentry_userreport WHERE environment_id in (SELECT id from public.sentry_environment ou where (SELECT count(*) from public.sentry_environment inr where inr.organization_id = ou.organization_id and inr.name = ou.name) > 1);

DELETE from sentry_environmentproject WHERE environment_id in (SELECT id from public.sentry_environment ou where (SELECT count(*) from public.sentry_environment inr where inr.organization_id = ou.organization_id and inr.name = ou.name) > 1);
DELETE from sentry_releaseprojectenvironment WHERE environment_id in (SELECT id from public.sentry_environment ou where (SELECT count(*) from public.sentry_environment inr where inr.organization_id = ou.organization_id and inr.name = ou.name) > 1);
DELETE from sentry_environment WHERE id in (SELECT id from public.sentry_environment ou where (SELECT count(*) from public.sentry_environment inr where inr.organization_id = ou.organization_id and inr.name = ou.name) > 1);
CREATE UNIQUE INDEX sentry_environment_organization_id_name_95a37dc7_uniq ON public.sentry_environment USING btree (organization_id, name);

Cleaning up releases (credit to @candux):

DROP INDEX sentry_release_organization_id_version_65da5843_uniq;
DELETE from sentry_release_project where release_id in (SELECT id from public.sentry_release ou where (SELECT count(*) from public.sentry_release inr where inr.organization_id = ou.organization_id and inr.version = ou.version) > 1);

DELETE from sentry_releaseprojectenvironment where release_id in (SELECT id from public.sentry_release ou where (SELECT count(*) from public.sentry_release inr where inr.organization_id = ou.organization_id and inr.version = ou.version) > 1);

DELETE from sentry_grouphash where group_id in (SELECT id from sentry_groupedmessage where first_release_id in (SELECT id from public.sentry_release ou where (SELECT count(*) from public.sentry_release inr where inr.organization_id = ou.organization_id and inr.version = ou.version) > 1));

DELETE from sentry_groupedmessage where first_release_id in (SELECT id from public.sentry_release ou where (SELECT count(*) from public.sentry_release inr where inr.organization_id = ou.organization_id and inr.version = ou.version) > 1);

DELETE from public.sentry_release ou where (SELECT count(*) from public.sentry_release inr where inr.organization_id = ou.organization_id and inr.version = ou.version) > 1;

CREATE UNIQUE INDEX sentry_release_organization_id_version_65da5843_uniq ON public.sentry_release USING btree (organization_id, version);

For users with clean data upgrading from 24.4.2, 24.5.0, or 24.5.1 to 24.6.0+:

We'll need to reindex the entire database to ensure data corruption will not occur in the future. According to the postgres docs, REINDEX is safe to use in all cases.

Steps to reindex:

  1. git checkout 24.6.0 (or a later release tag)
  2. ./install.sh
  3. docker compose up -d postgres
  4. docker compose exec postgres bash
  5. psql -U postgres
  6. REINDEX DATABASE postgres;
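
One optional variation, not part of the steps above: on PostgreSQL 12 or newer, REINDEX can run with CONCURRENTLY, which takes weaker locks so ingestion can continue, at the cost of being slower (it also skips the system catalogs, which a plain REINDEX SYSTEM would have to cover separately):

REINDEX DATABASE CONCURRENTLY postgres;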

JiffsMaverick commented Jul 22, 2024

@hubertdeng123 Thank you for your instructions. I would like to clarify one point: when executing the REINDEX DATABASE postgres command, I get an error:

postgres=# REINDEX DATABASE postgres;
ERROR:  could not create unique index "sentry_commitauthor_organization_id_3cdc85e9f09bf3f3_uniq"
DETAIL:  Key (organization_id, external_id)=(1, github:user3) is duplicated.

Is this related to this problem? What can be done about it? Here is what this table looks like:

postgres=# SELECT organization_id, external_id, COUNT(*)
FROM sentry_commitauthor
GROUP BY organization_id, external_id
HAVING COUNT(*) > 1;
 organization_id |     external_id      | count
-----------------+----------------------+-------
               1 | github:user1         |     2
               1 | github:user2         |     2
               1 | github:user3         |     2
               1 | github:user4         |     2
               1 |                      |    12
(5 rows)
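
One way to inspect the full rows behind those duplicated (organization_id, external_id) pairs before deciding what to delete or rename, as a sketch:

SELECT *
FROM sentry_commitauthor a
WHERE EXISTS (
    SELECT 1
    FROM sentry_commitauthor b
    WHERE b.organization_id = a.organization_id
      AND b.external_id IS NOT DISTINCT FROM a.external_id  -- also groups NULL/empty external_id rows together
      AND b.id <> a.id
)
ORDER BY organization_id, external_id, id;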

@feshchenkod

We installed a fresh sentry 24.5.1 a few weeks ago and are now experiencing difficulties updating to version 24.6.0 or 24.7.0. Any guide on resolving this error?

Running hooks in /etc/ca-certificates/update.d...
done.
Usage: sentry upgrade [OPTIONS]
Try 'sentry upgrade --help' for help.

Error: No such option: --create-kafka-topics
Error in install/set-up-and-migrate-database.sh:30.
'$dcr web upgrade --create-kafka-topics' exited with status 2
-> ./install.sh:main:38
--> install/set-up-and-migrate-database.sh:source:30

Cleaning up...

@JiffsMaverick

@feshchenkod It looks like you forgot to update the versions in your .env.custom file.

@feshchenkod

> @feshchenkod It looks like you forgot to update the versions in your .env.custom file.

Got it, I overlooked that part. Thanks a lot for your help!

@hubertdeng123 (Member, Author)

@JiffsMaverick It looks like you have duplicate entries in the sentry_commitauthor table. You'll need to clean those up before trying to reindex again.

JiffsMaverick commented Jul 24, 2024

@hubertdeng123 These records are not complete duplicates; only the two fields external_id and organization_id are duplicated. For example:

 id | organization_id |           name           |                    email                     |           external_id
----+-----------------+--------------------------+----------------------------------------------+---------------------------------
  1 |               1 | User1                    | user1@company.com                            | github:gituser1
 12 |               1 | User1                    | altuser1@notcompany.com                      | github:gituser1
  3 |               1 | User2                    | user2@company.com                            | github:gituser2
 18 |               1 | User2-alt                | altuser2@notcompany.com                      | github:gituser2

Here is the code responsible for this index:

    class Meta:
        app_label = "sentry"
        db_table = "sentry_commitauthor"
        unique_together = (("organization_id", "email"), ("organization_id", "external_id"))

Is deleting records in this table safe? How would you recommend choosing which record to delete? Or can I delete anything and it will restore itself later?

hubertdeng123 (Member, Author) commented Jul 26, 2024

Hmmm, if that is the case, could you try changing the external_id? If these are truly not duplicates, maybe what happened was that the broken indexes resulted in the external_id not being incremented properly.

Is deleting records in this table safe? How would you recommend choosing which record to delete? Or can I delete anything and it will restore itself later?

Deleting records is never truly safe; it depends on whether the data in question is something you are OK with losing. Which record to delete would be up to you, but let's first try changing the external_id.
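
Purely as an illustration of that suggestion (the ':alt' suffix is arbitrary, and id 12 is just the second User1 row from the example above): renaming one of the colliding rows lets the unique index be rebuilt without deleting anything.

UPDATE sentry_commitauthor
SET external_id = external_id || ':alt'   -- any value that no longer collides will do
WHERE id = 12;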


@JiffsMaverick

@hubertdeng123 It seems I gave a poor example that misled you. The external_id field should not increment. In this table there can indeed be records with the same organization_id and the same external_id but different emails. For example, this can happen when a person has two emails linked to their GitHub profile, one work and one personal. It looks like Sentry needs to change how the index on this table is constructed to account for the non-uniqueness of external_id.

Here is a better example:

 id | organization_id |           name       |            email           |      external_id
----+-----------------+----------------------+----------------------------+----------------------
  1 |               1 | Michael Smith        | michael@company.com        | github:michaelsmith
  2 |               1 | Michael Smith        | me@coolmichael.com         | github:michaelsmith
  3 |               1 | John Johnson         | john@company.com           | github:johnjohnson
  4 |               1 | John Johnson         | mail@jjblog.com            | github:johnjohnson

brunis commented Sep 30, 2024

I got ChatGPT to write me a deduplication SQL template and modified it to also update all FK relations to release_id (and first_release_id), but I'm still seeing "sentry.models.release.Release.MultipleObjectsReturned: get() returned more than one Release -- it returned 2!"
I'm assuming it's joining on some other table that causes the duplicates, but I don't know which one.
Where could I locate the SQL query that fires from:
File "/.venv/lib/python3.11/site-packages/django/db/models/query.py", line 652
I'm attaching the SQL that I used and the error logs.
Which table are the event details stored in, by the way?

sentry_fix_duplicates.txt
MultipleObjectsReturned1.txt
MultipleObjectsReturned2.txt
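
Since the unique key being violated is (organization_id, version) on sentry_release, the remaining offending rows can be listed, ids included so their foreign-key references can be traced, with a query along these lines:

SELECT r.id, r.organization_id, r.version
FROM sentry_release r
WHERE EXISTS (
    SELECT 1
    FROM sentry_release d
    WHERE d.organization_id = r.organization_id
      AND d.version = r.version
      AND d.id <> r.id
)
ORDER BY r.organization_id, r.version, r.id;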

@max-wittig (Contributor)

In addition, we had to run:

DELETE FROM sentry_activity WHERE group_id IN (SELECT id from sentry_groupedmessage where first_release_id in (SELECT id from public.sentry_release ou where (SELECT count(*) from public.sentry_release inr where inr.organization_id = ou.organization_id and inr.version = ou.version) > 1));

max-wittig (Contributor) commented Oct 10, 2024

@hubertdeng123 It seems like the queries are not finding all the duplicates. They only deleted two, but we have way more:

[Image: screenshot showing the remaining duplicates]

To see the duplicates:

SELECT a.*
FROM sentry_grouprelease a
JOIN (
    SELECT group_id, release_id, environment, COUNT(*)
    FROM sentry_grouprelease
    GROUP BY group_id, release_id, environment
    HAVING COUNT(*) > 1
) b
  ON a.group_id = b.group_id
 AND a.release_id = b.release_id
 AND a.environment = b.environment
ORDER BY a.group_id;
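
A possible follow-up, untested and in the same spirit as the scripts above (assuming sentry_grouprelease has the usual id primary key): keep the lowest-id row of each (group_id, release_id, environment) group, delete the rest, then reindex.

DELETE FROM sentry_grouprelease a
USING sentry_grouprelease b
WHERE a.group_id = b.group_id
  AND a.release_id = b.release_id
  AND a.environment = b.environment
  AND a.id > b.id;   -- keeps the oldest row of each group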
