Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Work around PosgreSQL index issue #653

Merged
merged 1 commit into from
Mar 29, 2022

Conversation

TG1999
Copy link
Contributor

@TG1999 TG1999 commented Mar 25, 2022

Update model to use an index on a checksum of the large summary text field.
Otherwise there can be times when this exceeds PostgreSQL limits.
See #632 (comment)
The failing test before this patch #659
Reference: #650
Signed-off-by: Tushar Goel tushar.goel.dav@gmail.com

@pombredanne pombredanne changed the title Model changes #650 Work around PosgreSQL index issue Mar 25, 2022
Copy link
Member

@pombredanne pombredanne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should work, but IMHO we need first a minimal test case that reproduces the error.

@TG1999 TG1999 force-pushed the models/migrations branch 2 times, most recently from b5b1ab4 to cbec211 Compare March 25, 2022 17:32
Copy link
Member

@pombredanne pombredanne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking great!
You may be missing one migration though and this should be a data migration that computes the md5 for the actual data?

@TG1999
Copy link
Contributor Author

TG1999 commented Mar 25, 2022

When I run ./manage.py makemigrations command again, it does nothing, md5 hash computation part is not getting stored as a migration file.

@ziadhany
Copy link
Collaborator

ziadhany commented Mar 26, 2022

Why do we have three fields in the model (summary_md5, package affected_md5, references_md5) separately. IMHO we can have one field and hash all the fields inside it. This will help us not to store useless data in the database
instead using unique_together ,we can use unique=True .
ex:

md5_field = models.CharField(
                    max_length = 32,  
                    unique = True
                    )

def save(self, *args, **kwargs):
        self.md5_field = hashlib.md5(self.summary.encode("utf-8") + self.affected_packages.encode("utf-8") + ... 
        super(Advisory, self).save(*args, **kwargs)

@TG1999 TG1999 force-pushed the models/migrations branch 3 times, most recently from 83506ae to 0d4419d Compare March 28, 2022 19:58
Copy link
Member

@pombredanne pombredanne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!
See a few comments for you consideration!

vulnerabilities/models.py Outdated Show resolved Hide resolved
vulnerabilities/migrations/0005_auto_20220328_1929.py Outdated Show resolved Hide resolved
vulnerabilities/models.py Outdated Show resolved Hide resolved
vulnerabilities/tests/test_postgres_workaround.py Outdated Show resolved Hide resolved
@TG1999 TG1999 force-pushed the models/migrations branch 2 times, most recently from 55cde35 to de19f98 Compare March 29, 2022 09:54
Work around PostgreSQL index issue

Signed-off-by: Tushar Goel <tushar.goel.dav@gmail.com>
@TG1999 TG1999 force-pushed the models/migrations branch from de19f98 to e6f12d5 Compare March 29, 2022 11:33
@TG1999 TG1999 merged commit a59b5e4 into aboutcode-org:main Mar 29, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants