feat: pre risky migration flaky test changes #524

Merged
merged 6 commits into main on Jul 2, 2024

Conversation

joseph-sentry (Contributor):

This PR makes the changes that are possible before the risky migrations in the supporting shared PR land:

  • update shared version
  • update test results parser version
  • add time-machine as a dependency
  • add Flake and ReducedError sqlalchemy models
  • modify TestResultsNotificationPayload to contain a set of flaky test ids instead of a dict[str, TestResultsNotificationFlake] (see the sketch after this list)
  • change flaky test results comment format
  • change flake detection in test results finisher to gather the Flake objects for a given repo and compare their test ids to the test ids of the failures relevant to the test results comment
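
A minimal sketch of that payload change, assuming a dataclass-style definition; every field other than flaky_tests is an illustrative placeholder, not the PR's actual shape:

```python
from dataclasses import dataclass

@dataclass
class TestResultsNotificationPayload:
    failed: int    # placeholder field, assumed
    passed: int    # placeholder field, assumed
    skipped: int   # placeholder field, assumed
    # Before: flaky_tests: dict[str, TestResultsNotificationFlake]
    # After: the comment only needs the ids to mark which failures are flaky.
    flaky_tests: set[str]
```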

codecov bot commented Jun 26, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 97.51%. Comparing base (c3bcddf) to head (6b2507c).

✅ All tests successful. No failed tests found.

Additional details and impacted files

@@           Coverage Diff           @@
##             main     #524   +/-   ##
=======================================
  Coverage   97.50%   97.51%           
=======================================
  Files         449      449           
  Lines       35739    35731    -8     
=======================================
- Hits        34848    34843    -5     
+ Misses        891      888    -3     
Flag Coverage Δ
integration 97.49% <100.00%> (+<0.01%) ⬆️
latest-uploader-overall 97.49% <100.00%> (+<0.01%) ⬆️
unit 97.49% <100.00%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown.

Components Coverage Δ
NonTestCode 94.60% <100.00%> (+0.01%) ⬆️
OutsideTasks 97.74% <100.00%> (-0.01%) ⬇️
Files Coverage Δ
database/models/reports.py 99.45% <100.00%> (+0.05%) ⬆️
services/test_results.py 91.71% <100.00%> (-0.18%) ⬇️
services/tests/test_test_results.py 100.00% <100.00%> (ø)
tasks/test_results_finisher.py 97.69% <100.00%> (+1.69%) ⬆️
tasks/tests/unit/test_test_results_finisher.py 100.00% <100.00%> (ø)
...sks/tests/unit/test_test_results_processor_task.py 100.00% <ø> (ø)

... and 1 file with indirect coverage changes

This change has been scanned for critical changes.


message = Column(types.Text)


class Flake(CodecovBaseModel, MixinBaseClass):
Contributor:

Is the idea to then replace these w/ their django counterparts after you run the risky migration?

Contributor (author):

No, these will remain; these changes just don't depend on the risky migrations to run.

Contributor:

Mmm is there a reason we don't want to use the django models instead?

Contributor (author):

It's possible the Flake model is not needed, but the ReducedError model is, because we will be writing the reduced_error_id in the test results processor in future changes.
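
For orientation, a rough sketch of the two models under discussion. This is a guess, not the PR's actual definitions: the table names, the start_date column, the foreign key, and the plain declarative base (standing in for CodecovBaseModel and MixinBaseClass) are all assumptions; only repoid, testid, reduced_error_id, end_date, and message are attested in this thread.

```python
import sqlalchemy as sa
from sqlalchemy.orm import declarative_base

Base = declarative_base()  # stand-in for CodecovBaseModel/MixinBaseClass

class ReducedError(Base):
    __tablename__ = "reports_reducederror"  # assumed name
    id = sa.Column(sa.Integer, primary_key=True)
    message = sa.Column(sa.Text)

class Flake(Base):
    __tablename__ = "reports_flake"  # assumed name
    id = sa.Column(sa.Integer, primary_key=True)
    repoid = sa.Column(sa.Integer)
    testid = sa.Column(sa.Text)
    # To be written by the test results processor in future changes.
    reduced_error_id = sa.Column(
        sa.Integer, sa.ForeignKey("reports_reducederror.id"), nullable=True
    )
    start_date = sa.Column(sa.DateTime)               # assumed
    end_date = sa.Column(sa.DateTime, nullable=True)  # NULL while still flaky
```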

@adrian-codecov (Contributor), Jun 26, 2024:

After talking, we'll defer the use of Django models, as it poses some difficulties with testing at the moment. Happy to approve once you have addressed @michelletran-codecov's feedback 👌

@michelletran-codecov (Contributor) left a comment:

Generally LGTM, modulo the comment below and the added SQLAlchemy models. The current query to get flakes feels pretty isolated, and I'm wondering if there's a way for us to use the Django models instead. If there are no writes in this task, transaction contention is unlikely. The query is pretty isolated today, but I'm guessing we'll need to tie it to TestInstance at some point? If that's the only place where we'd interact with the existing SQLAlchemy models, would it be doable by referencing the ids?

Of course, keeping two different database models and connections for this task also adds unnecessary complexity, so I'm not opposed to just adding the SQLAlchemy models.

reason=reason,
),
)
def get_flaky_tests(self, db_session, commit_yaml, repoid, commit, failures):
Contributor:

Can we add some type annotations?
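
A hedged sketch of what the annotated signature might look like; the parameter and return types here are inferred from how the values are used in this thread, not taken from the PR:

```python
from sqlalchemy.orm import Session

def get_flaky_tests(
    self,
    db_session: Session,
    commit_yaml: dict,  # assumption; the codebase may use a dedicated yaml type
    repoid: int,
    commit: "Commit",   # assumption; forward reference to the commit model
    failures: list,     # assumption; the failures relevant to the comment
) -> set[str]:          # ids of failing tests currently considered flaky
    ...
```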

@joseph-sentry (Contributor, author):

I'm having a lot of trouble getting the tests to work with both the Django ORM and SQLAlchemy. I'm trying to get them both to connect to the same DB, but there are complications: Django wants to create its own test database, and SQLAlchemy tries to create one as well. I don't think the complexity is worth the benefits.

@joseph-sentry force-pushed the joseph/flakes-pre-risky branch 3 times, most recently from 846827a to 1326fb1, on June 28, 2024 at 15:25
@michelletran-codecov (Contributor) left a comment:

A few more comments about query performance.

Flake.repoid == repoid,
Flake.end_date.is_(None),
)
.all()
Contributor:

Hmm... so usually, to make the query more predictable, I would suggest adding a limit to the returned results (this will help us with IO and memory usage in the app). I see that you use this mainly to 1. count and 2. retrieve flakes from failed tests.

What do you think about splitting those into 2 separate queries? One to do the count (the queryset count method, https://docs.djangoproject.com/en/5.0/ref/models/querysets/#django.db.models.query.QuerySet.count, issues a select count(*) ...), and another to query flaky tests from the failed test ids. This will involve 2 DB calls rather than one, but each should be relatively fast, and it means less data processing in the app itself.
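
A sketch of that split, written against this PR's SQLAlchemy Flake model rather than the Django queryset API; failure_test_ids is an assumed variable holding the ids of the failed tests:

```python
from sqlalchemy import func

# 1. A bare count of active flakes; no Flake rows are materialized in the app.
flaky_count = (
    db_session.query(func.count(Flake.testid))
    .filter(Flake.repoid == repoid, Flake.end_date.is_(None))
    .scalar()
)

# 2. Fetch only the test ids of active flakes matching this run's failures.
flaky_test_ids = {
    testid
    for (testid,) in db_session.query(Flake.testid)
    .filter(
        Flake.repoid == repoid,
        Flake.end_date.is_(None),
        Flake.testid.in_(failure_test_ids),  # assumed list of failed test ids
    )
    .all()
}
```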

Contributor (author):

I agree that we could limit and offset this query to reduce memory usage, but maybe it'd be better to filter on test ids in the query, like I mentioned in my comment below, and also to select only the test id, which is all we're looking for here. Is there another reason, other than memory usage, that we would limit this query?

Contributor:

> only select the test id from the query which is all we're looking for here.

I'm good with providing an id list. We will probably also want to ensure that the list isn't too long (i.e. we can cap it on the application side). It's probably fine to query all the tests for now, but we will want to keep an eye on the performance of this query (it correlates with the number of failed tests it's trying to retrieve).

> Is there another reason other than memory usage that we would limit this query?

Yes. Having a bound on the number of items returned will also make processing the results more predictable. For example, we don't have to worry (as much) about long-running tasks, because it's only ever going to process, say, 30 items.

Comment on lines 309 to 334
Flake.repoid == repoid,
Flake.end_date.is_(None),
Contributor:

I see that the index is added as a compound of ["repository", "test", "reduced_error", "date"]. This means that for this query it's only going to use the index for "repository" (because Postgres processes index columns from left to right).

I believe we can also add an index targeting null fields specifically to make it more efficient, but I don't have as much experience with this. :p If we want to make this query efficient, we might want to explore a compound index on (repository, end_date = null) or something.
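
A sketch of such a partial index via SQLAlchemy's Postgres dialect support; the index name and the single-column choice are assumptions:

```python
import sqlalchemy as sa

# Partial index restricted to active flakes (end_date IS NULL), so the
# finisher's "active flakes for this repo" lookup can be served by the index.
sa.Index(
    "ix_flake_repoid_active",  # assumed name
    Flake.repoid,
    postgresql_where=Flake.end_date.is_(None),
)
```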

Contributor (author):

Would adding a Flake.testid.in_(list_of_test_ids) filter make this more efficient, then? I plan to eventually add the reduced_error_id to the query as well, so in the end we will be making full use of this index.

Contributor:

> I plan to eventually add the reduced_error_id to the query as well, so in the end we will be making full use of this index.

Ah OK! This is fine then. 👍

@michelletran-codecov (Contributor) left a comment:

LGTM!

@joseph-sentry joseph-sentry added this pull request to the merge queue Jul 2, 2024
Merged via the queue into main with commit b689d22 Jul 2, 2024
25 of 30 checks passed
@joseph-sentry joseph-sentry deleted the joseph/flakes-pre-risky branch July 2, 2024 18:48
3 participants