Add test results finisher #234

joseph-sentry · 2024-01-16T17:41:18Z

Depends on: #197

Summary

This PR adds functionality for the test results PR comment feature, and contains the necessary improvements to the test result processor task to achieve this.

The first change to the processor is that it's no longer writing to the db anymore. This is because the processor tasks run concurrently, and concurrent writes to the DB were making it difficult to ensure that there multiple Test objects weren't being created at the same time. The writes to the DB to persist Test and TestInstance objects were moved to the finisher.

The second change to the processor is that it now returns a processed list of "Testrun dicts", which contains some metadata (env and run_number) and a list of "testruns" which the finisher will use to decide what Test and TestInstance objects to create.

The TestResultsFinisher task is created in this PR. This task is responsible for gathering the results of the processing step, adding new Test and TestInstance objects, updating existing TestInstance objects that need to be updated, and finally either writing the test results PR comment or triggering the notify task to write the coverage PR comment.

The upload task was updated to use a chord to run the processor tasks concurrently and to pass their results to the finisher task that will run when all the processor tasks are completed.

The notify task now checks if there are any failed tests that exist on a report attached to the commit it is trying to notify for, if there exist failed tests, it will no longer notify.

Edge cases

CI Re-run

This is the edge case where a user might not make any changes to the code, but re-runs the CI. This may happen if there is a flaky test that sometimes fails. We must handle this because when this happens we want to update the outcomes of the tests that ran previously. The run_number metadata field that the processor passes to the finisher is used to handle this edge case. The run_number corresponds to the build_code in the upload. This helps us distinguish between different runs of the same workflow. The run_number tells the finisher that if a TestInstance exists on the current commit and maps to the same Test as the one we are currently examining, we should look at the run_number to determine which one should be used. The notify task being called if there are no failures in the test result finisher, is also part of the solution to this edge case, because of the case where during the re run, the coverage notifier runs before the test result finisher

Same test, different env

This is a case where a user may be running the same test in the same CI run multiple times under a different environment eg. running a tests under multiple python versions. We need to be able to distinguish between these tests. We do this by adding the env field to the Test object which comprises a hash of the flag_names and the job_code of upload where the Test was first created. The uniqueness constraint on Test objects includes the testsuite, name, repoid and env fields which means there can only be one Test with the same name, testsuite and env for each repo.
Subsequent uploads that contain the same test being run with the same env will map to the existing Test object in the DB.

Possible improvements

Checkpoint logger integration
Addition of metrics on top of the checkpoint logger
Streaming parsing improvements mentioned here: feat: add models and task for failed test ingestion #197 (review)
Adding a roll up of the test result data for a given CommitReport (similar to ReportLevelTotals)
Moving TestInstance data to GCS in sqlite files

codecov-staging · 2024-01-16T17:46:03Z

Codecov Report

Attention: 9 lines in your changes are missing coverage. Please review.

@@            Coverage Diff             @@
##             main     #234      +/-   ##
==========================================
- Coverage   98.10%   98.10%   -0.01%     
==========================================
  Files         375      377       +2     
  Lines       31013    31428     +415     
==========================================
+ Hits        30425    30832     +407     
- Misses        588      596       +8

Flag	Coverage Δ
integration	`98.10% <97.88%> (-0.01%)`	⬇️
latest-uploader-overall	`98.10% <97.88%> (-0.01%)`	⬇️
unit	`98.10% <97.88%> (-0.01%)`	⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Components	Coverage Δ
NonTestCode	`96.15% <92.56%> (-0.03%)`	⬇️
OutsideTasks	`97.91% <90.62%> (-0.02%)`	⬇️

Files	Coverage Δ
services/lock_manager.py	`91.48% <100.00%> (+2.35%)`	⬆️
tasks/tests/unit/test_notify_task.py	`100.00% <100.00%> (ø)`
tasks/tests/unit/test_test_results_finisher.py	`100.00% <100.00%> (ø)`
tasks/tests/unit/test_upload_task.py	`99.51% <100.00%> (+<0.01%)`	⬆️
tasks/upload.py	`97.97% <100.00%> (+<0.01%)`	⬆️
tasks/notify.py	`97.95% <90.00%> (-0.46%)`	⬇️
tasks/test_results_finisher.py	`95.45% <95.45%> (ø)`
services/test_results.py	`86.53% <90.47%> (+5.14%)`	⬆️

codecov-qa · 2024-01-16T17:46:22Z

Codecov Report

Attention: 9 lines in your changes are missing coverage. Please review.

Comparison is base (c8afa70) 98.10% compared to head (5d625ed) 98.10%.

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #234      +/-   ##
==========================================
- Coverage   98.10%   98.10%   -0.01%     
==========================================
  Files         375      377       +2     
  Lines       31013    31428     +415     
==========================================
+ Hits        30425    30832     +407     
- Misses        588      596       +8

Flag	Coverage Δ
integration	`98.10% <97.88%> (-0.01%)`	⬇️
latest-uploader-overall	`98.10% <97.88%> (-0.01%)`	⬇️
unit	`98.10% <97.88%> (-0.01%)`	⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Components	Coverage Δ
NonTestCode	`96.15% <92.56%> (-0.03%)`	⬇️
OutsideTasks	`97.91% <90.62%> (-0.02%)`	⬇️

Files	Coverage Δ
services/lock_manager.py	`91.48% <100.00%> (+2.35%)`	⬆️
tasks/tests/unit/test_notify_task.py	`100.00% <100.00%> (ø)`
tasks/tests/unit/test_test_results_finisher.py	`100.00% <100.00%> (ø)`
tasks/tests/unit/test_upload_task.py	`99.51% <100.00%> (+<0.01%)`	⬆️
tasks/upload.py	`97.97% <100.00%> (+<0.01%)`	⬆️
tasks/notify.py	`97.95% <90.00%> (-0.46%)`	⬇️
tasks/test_results_finisher.py	`95.45% <95.45%> (ø)`
services/test_results.py	`86.53% <90.47%> (+5.14%)`	⬆️

codecov-public-qa · 2024-01-16T17:46:29Z

Codecov Report

Merging #234 (5d625ed) into main (c8afa70) will decrease coverage by 0.01%.
The diff coverage is 97.88%.

@@            Coverage Diff             @@
##             main     #234      +/-   ##
==========================================
- Coverage   98.10%   98.10%   -0.01%     
==========================================
  Files         375      377       +2     
  Lines       31013    31428     +415     
==========================================
+ Hits        30425    30832     +407     
- Misses        588      596       +8

Flag	Coverage Δ
integration	`98.10% <97.88%> (-0.01%)`	⬇️
latest-uploader-overall	`98.10% <97.88%> (-0.01%)`	⬇️
unit	`98.10% <97.88%> (-0.01%)`	⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Components	Coverage Δ
NonTestCode	`96.15% <92.56%> (-0.03%)`	⬇️
OutsideTasks	`97.91% <90.62%> (-0.02%)`	⬇️

Files	Coverage Δ
services/lock_manager.py	`91.48% <100.00%> (+2.35%)`	⬆️
tasks/tests/unit/test_notify_task.py	`100.00% <100.00%> (ø)`
tasks/tests/unit/test_test_results_finisher.py	`100.00% <100.00%> (ø)`
tasks/tests/unit/test_upload_task.py	`99.51% <100.00%> (+<0.01%)`	⬆️
tasks/upload.py	`97.97% <100.00%> (+<0.01%)`	⬆️
tasks/notify.py	`97.95% <90.00%> (-0.46%)`	⬇️
tasks/test_results_finisher.py	`95.45% <95.45%> (ø)`
services/test_results.py	`86.53% <90.47%> (+5.14%)`	⬆️

codecov · 2024-01-16T17:47:17Z

Codecov Report

Attention: 9 lines in your changes are missing coverage. Please review.

Comparison is base (c8afa70) 98.07% compared to head (5d625ed) 98.07%.

Changes have been made to critical files, which contain lines commonly executed in production. Learn more

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #234      +/-   ##
==========================================
- Coverage   98.07%   98.07%   -0.01%     
==========================================
  Files         406      408       +2     
  Lines       31714    32129     +415     
==========================================
+ Hits        31105    31512     +407     
- Misses        609      617       +8

Flag	Coverage Δ
integration	`98.10% <97.88%> (-0.01%)`	⬇️
latest-uploader-overall	`98.10% <97.88%> (-0.01%)`	⬇️
onlysomelabels	`98.07% <97.88%> (-0.01%)`	⬇️
unit	`98.10% <97.88%> (-0.01%)`	⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Components	Coverage Δ
NonTestCode	`96.06% <92.56%> (-0.03%)`	⬇️
OutsideTasks	`97.91% <90.62%> (-0.02%)`	⬇️

Files	Coverage Δ
services/lock_manager.py	`91.48% <100.00%> (+2.35%)`	⬆️
tasks/tests/unit/test_notify_task.py	`100.00% <100.00%> (ø)`
tasks/tests/unit/test_test_results_finisher.py	`100.00% <100.00%> (ø)`
tasks/tests/unit/test_upload_task.py	`99.51% <100.00%> (+<0.01%)`	⬆️
tasks/upload.py Critical	`97.98% <100.00%> (+<0.01%)`	⬆️
tasks/notify.py Critical	`97.95% <90.00%> (-0.46%)`	⬇️
tasks/test_results_finisher.py	`95.45% <95.45%> (ø)`
services/test_results.py	`86.53% <90.47%> (+5.14%)`	⬆️

Related Entrypoints
run/app.tasks.notify.Notify

matt-codecov · 2024-01-16T19:16:26Z

joseph/failed-test-ingestion...joseph/add-test-results-finisher-to-upload this PR includes the test result processor PR, here is a URL to compare the two branches separately

matt-codecov

started a group DM about a design tweak for performance reasons. need to sync on that before accept/request changes decision

matt-codecov · 2024-01-16T19:43:21Z

services/test_results.py

+        len_of_table_value = len(table_value)
+        for i in range((len_of_table_value) // 70):
+            table_value = (
+                table_value[: (i + 1) * 70] + "<br>" + table_value[(i + 1) * 70 :]


you mentioned that the "<br>" takes up 4 characters which messes with the line lengths. you can avoid that with something like:

line_size = 70 lines = [table_value[i:i+line_size] for i in range(0, len(table_value), line_size)] return "<br>".join(lines)

matt-codecov · 2024-01-17T04:57:02Z

services/test_results.py

+        return current_report_row
+
+    # support flags in test results
+    def create_report_upload(


can we put the flags stuff in the base class and not have to override this here?

matt-codecov · 2024-01-17T05:14:51Z

tasks/test_results_finisher.py

+            with lock_manager.locked(
+                LockType.NOTIFICATION,
+                retry_num=self.request.retries,
+            ):


does this lock need to surround the whole finisher task or just the part where we work on the PR comment?

matt-codecov · 2024-01-17T05:16:46Z

tasks/test_results_finisher.py

+        )
+        assert commit, "commit not found"
+
+        notify = True


nothing ever sets this to false, it's only read in one place where we make the return dict. you could just inline True there

matt-codecov · 2024-01-17T05:29:47Z

database/models/reports.py

+    upload_id = Column(types.Integer, ForeignKey("reports_upload.id"))
+    upload = relationship("Upload", backref=backref("testinstances"))
+    failure_message = Column(types.Text)
+    active = Column(types.Boolean)


can we order by created time or join the uploads table and order by upload time to get the newest? that way we just need a single atomic INSERT to insert a new test instance, we don't need to also run an UPDATE to set the previous one to false that we have to make sure runs in the same transaction to avoid races

matt-codecov · 2024-01-17T06:26:03Z

database/models/reports.py

+    repository = relationship("Repository", backref=backref("tests"))
+    name = Column(types.String(256), nullable=False)
+    testsuite = Column(types.String(256), nullable=False)
+    env = Column(types.String(256), nullable=False)


i just left a comment on #197 (review) about changing the id field of this table to be a hash instead. in my comment over there i said it should be a hash of the test suite + test name, but i suppose it should also include the inputs to env as well since env is a similar concept

matt-codecov · 2024-01-17T06:34:05Z

tasks/notify.py

+
+        # check if there were any test failures
+
+        num_of_failed_test_instances = (
+            db_session.query(TestInstance)
+            .join(Upload)
+            .join(CommitReport)
+            .filter(
+                CommitReport.commit_id == commit.id_,
+                TestInstance.outcome == int(Outcome.Failure),
+                TestInstance.active is True,
+            )
+            .count()
+        )
+
+        if num_of_failed_test_instances > 0:
+            return {
+                "notify_attempted": False,
+                "notifications": None,
+                "reason": "test_failures",
+            }
+


should this be in should_send_notifications()?

worker/tasks/notify.py

Line 409 in 3276511

def should_send_notifications(self, current_yaml, commit, ci_passed, report):

matt-codecov

thanks for all the iteration on these changes, i think we will long-term be happy we took the time. just nits left, some old comments + a couple new ones

tasks/notify.py

matt-codecov · 2024-01-31T23:18:51Z

tasks/notify.py

+        )
+
+        if any(
+            [test.outcome == int(Outcome.Failure) for test in latest_test_instances]


i think test.outcome is a string now, is the int cast correct here?

matt-codecov · 2024-01-31T23:24:08Z

tasks/test_results_finisher.py

+            .join(Upload)
+            .join(CommitReport)
+            .join(Commit)
+            .filter(Commit.id_ == commit.id_)
+            .order_by(TestInstance.test_id)
+            .order_by(desc(Upload.created_at))
+            .distinct(TestInstance.test_id)
+            .all()


consider moving this query to a helper function on TestInstance or something since you use it in more than one place so that the logic definitely stays the same

matt-codecov · 2024-01-31T23:25:11Z

tasks/test_results_finisher.py

+            )
+            return {"notify_attempted": False, "notify_succeeded": False}
+
+        success = None


why is this here if it's overwritten two lines down?

matt-codecov · 2024-01-31T23:26:34Z

tasks/test_results_finisher.py

+
+    def check_if_no_failures(self, testrun_list):
+        return all(
+            [instance.outcome != int(Outcome.Failure) for instance in testrun_list]


same Q re instance.outcome being a string and int() maybe not being correct

This class creates the PR comment message based on the test instances passed to it and posts/edits the comment on the PR. Signed-off-by: joseph-sentry <joseph.sawaya@sentry.io>

The notify task will exit early now, if the latest test instances relevant to the notify's commit contain failures Signed-off-by: joseph-sentry <joseph.sawaya@sentry.io>

This task will check if failures exist in the latest round of test instances for this commit. If they do it will run the test results notifier, else it will trigger the notify task to run. This task takes the same lock as the notify task so that they don't try writing to the PR comment at the same time. Signed-off-by: joseph-sentry <joseph.sawaya@sentry.io>

The test result finisher runs after the test results processors are done running. This is achieved using a celery chord. Signed-off-by: joseph-sentry <joseph.sawaya@sentry.io>

Signed-off-by: joseph-sentry <joseph.sawaya@sentry.io>

this commit changes the int conversions of Outcome to str conversions. moves the db query for latest test instances for a given commit to a separate function and cleans up some unused vars Signed-off-by: joseph-sentry <joseph.sawaya@sentry.io>

joseph-sentry force-pushed the joseph/add-test-results-finisher-to-upload branch from 6facf1f to 3563af6 Compare January 16, 2024 19:01

matt-codecov reviewed Jan 17, 2024

View reviewed changes

joseph-sentry force-pushed the joseph/add-test-results-finisher-to-upload branch from b026451 to 31dba51 Compare January 17, 2024 22:11

joseph-sentry temporarily deployed to staging January 23, 2024 22:29 — with GitHub Actions Inactive

joseph-sentry force-pushed the joseph/add-test-results-finisher-to-upload branch from 8742bbe to a390c59 Compare January 24, 2024 14:20

joseph-sentry temporarily deployed to staging January 24, 2024 14:25 — with GitHub Actions Inactive

joseph-sentry force-pushed the joseph/add-test-results-finisher-to-upload branch 2 times, most recently from 731ea75 to 7d9aa9a Compare January 25, 2024 19:51

matt-codecov reviewed Jan 31, 2024

View reviewed changes

joseph-sentry added 8 commits February 1, 2024 15:17

feat: create TestResultsNotifier class

77cfce2

This class creates the PR comment message based on the test instances passed to it and posts/edits the comment on the PR. Signed-off-by: joseph-sentry <joseph.sawaya@sentry.io>

feat(notify): don't notify if failed test instances exist

835e5fe

The notify task will exit early now, if the latest test instances relevant to the notify's commit contain failures Signed-off-by: joseph-sentry <joseph.sawaya@sentry.io>

feat: run the test results finisher after the processors

9b2399a

The test result finisher runs after the test results processors are done running. This is achieved using a celery chord. Signed-off-by: joseph-sentry <joseph.sawaya@sentry.io>

test: rename env to flags_hash in test_results_finisher

4c42209

Signed-off-by: joseph-sentry <joseph.sawaya@sentry.io>

fix: address feedback; inline True, remove notify var

c8b6b9a

Signed-off-by: joseph-sentry <joseph.sawaya@sentry.io>

fix: incorporate insert_breaks feedback to simplify function

c5d4d3e

Signed-off-by: joseph-sentry <joseph.sawaya@sentry.io>

fix: address feedback

5d625ed

this commit changes the int conversions of Outcome to str conversions. moves the db query for latest test instances for a given commit to a separate function and cleans up some unused vars Signed-off-by: joseph-sentry <joseph.sawaya@sentry.io>

joseph-sentry force-pushed the joseph/add-test-results-finisher-to-upload branch from efa626e to 5d625ed Compare February 1, 2024 20:19

matt-codecov approved these changes Feb 1, 2024

View reviewed changes

joseph-sentry merged commit 0b7ef43 into main Feb 1, 2024
15 of 31 checks passed

joseph-sentry deleted the joseph/add-test-results-finisher-to-upload branch February 1, 2024 20:43

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add test results finisher #234

Add test results finisher #234

joseph-sentry commented Jan 16, 2024 •

edited

Loading

codecov-staging bot commented Jan 16, 2024 •

edited

Loading

codecov-qa bot commented Jan 16, 2024 •

edited

Loading

codecov-public-qa bot commented Jan 16, 2024 •

edited

Loading

codecov bot commented Jan 16, 2024 •

edited

Loading

matt-codecov commented Jan 16, 2024

matt-codecov left a comment

matt-codecov Jan 16, 2024

matt-codecov Jan 17, 2024

matt-codecov Jan 17, 2024

matt-codecov Jan 17, 2024

matt-codecov Jan 17, 2024

matt-codecov Jan 17, 2024

matt-codecov Jan 17, 2024

matt-codecov left a comment

matt-codecov Jan 31, 2024

matt-codecov Jan 31, 2024

matt-codecov Jan 31, 2024

matt-codecov Jan 31, 2024

Add test results finisher #234

Add test results finisher #234

Conversation

joseph-sentry commented Jan 16, 2024 • edited Loading

Summary

Edge cases

CI Re-run

Same test, different env

Possible improvements

codecov-staging bot commented Jan 16, 2024 • edited Loading

Codecov Report

codecov-qa bot commented Jan 16, 2024 • edited Loading

Codecov Report

codecov-public-qa bot commented Jan 16, 2024 • edited Loading

Codecov Report

codecov bot commented Jan 16, 2024 • edited Loading

Codecov Report

matt-codecov commented Jan 16, 2024

matt-codecov left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

matt-codecov left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

joseph-sentry commented Jan 16, 2024 •

edited

Loading

codecov-staging bot commented Jan 16, 2024 •

edited

Loading

codecov-qa bot commented Jan 16, 2024 •

edited

Loading

codecov-public-qa bot commented Jan 16, 2024 •

edited

Loading

codecov bot commented Jan 16, 2024 •

edited

Loading