
feat: new ta tasks #976

Merged
merged 3 commits into from
Jan 15, 2025
Conversation

joseph-sentry
Contributor

This PR creates new TA processor and finisher tasks and uses them, behind a feature flag, in the upload task for a smooth rollout.
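The rollout pattern described above can be sketched as follows. This is a minimal illustration, not the PR's actual code: the `Flag` class, `schedule_test_results`, and the pipeline strings are hypothetical stand-ins; only the `NEW_TA_TASKS.check_value` name mirrors what appears later in the test output.

```python
class Flag:
    """Hypothetical stand-in for a feature-flag client."""

    def __init__(self, enabled: bool):
        self.enabled = enabled

    def check_value(self, identifier, default=False):
        # A real rollout check would consult the flag service per identifier,
        # allowing a gradual percentage-based rollout.
        return self.enabled


NEW_TA_TASKS = Flag(enabled=True)


def schedule_test_results(repoid: int) -> str:
    # The upload task dispatches to the new or legacy chain based on the flag,
    # so the new tasks can be rolled out (and rolled back) without a deploy.
    if NEW_TA_TASKS.check_value(identifier=repoid, default=False):
        return "ta_processor -> ta_finisher"  # new pipeline
    return "test_results_processor -> test_results_finisher"  # legacy pipeline


print(schedule_test_results(42))
```

Flipping `NEW_TA_TASKS.enabled` back to `False` restores the legacy chain without touching the dispatch code.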

@joseph-sentry joseph-sentry requested a review from a team December 19, 2024 21:39

sentry-io bot commented Dec 19, 2024

🔍 Existing Issues For Review

Your pull request is modifying functions with the following pre-existing issues:

📄 File: tasks/upload.py

Function: _schedule_test_results_processing_task
Unhandled issue: TypeError: unsupported operand type(s) for …

Did you find this useful? React with a 👍 or 👎


This PR includes changes to shared. Please review them here: https://github.com/codecov/shared/compare/2674ae99811767e63151590906691aed4c5ce1f9...

@codecov-staging

codecov-staging bot commented Dec 19, 2024

Codecov Report

Attention: Patch coverage is 94.89292% with 31 lines in your changes missing coverage. Please review.

✅ All tests successful. No failed tests found.

Files with missing lines        | Patch % | Missing lines
tasks/ta_finisher.py            | 83.13%  | 28 ⚠️
tasks/test_results_processor.py | 92.85%  | 2 ⚠️
tasks/ta_processor.py           | 98.64%  | 1 ⚠️

📢 Thoughts on this report? Let us know!


codecov bot commented Dec 19, 2024

Codecov Report

Attention: Patch coverage is 94.89292% with 31 lines in your changes missing coverage. Please review.

Project coverage is 97.74%. Comparing base (ac302e7) to head (86c8b7d).
Report is 2 commits behind head on main.

✅ All tests successful. No failed tests found.

Files with missing lines        | Patch % | Missing lines
tasks/ta_finisher.py            | 83.13%  | 28 ⚠️
tasks/test_results_processor.py | 92.85%  | 2 ⚠️
tasks/ta_processor.py           | 98.64%  | 1 ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #976      +/-   ##
==========================================
- Coverage   97.79%   97.74%   -0.05%     
==========================================
  Files         447      451       +4     
  Lines       36175    36653     +478     
==========================================
+ Hits        35376    35828     +452     
- Misses        799      825      +26     
Flag        | Coverage Δ
integration | 42.57% <67.05%> (+0.42%) ⬆️
unit        | 90.19% <57.82%> (-0.28%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

⚠️ Impact Analysis from Codecov is deprecated and will be sunset on Jan 31 2025. See more

@codecov-qa

codecov-qa bot commented Dec 19, 2024

❌ 1 Tests Failed:

Tests completed | Failed | Passed | Skipped
1778            | 1      | 1777   | 4
View the top 1 failed tests by shortest run time
tasks/tests/unit/test_upload_task.py::TestUploadTaskIntegration::test_upload_task_call_new_ta_tasks
Stack Traces | 0.156s run time
self = <worker.tasks.tests.unit.test_upload_task.TestUploadTaskIntegration object at 0x7f41e4a81fd0>
mocker = <pytest_mock.plugin.MockFixture object at 0x7f41ddb33290>
mock_configuration = <shared.config.ConfigHelper object at 0x7f41dd5b9d30>
dbsession = <sqlalchemy.orm.session.Session object at 0x7f41dda43110>
codecov_vcr = <vcr.cassette.Cassette object at 0x7f41ddb30e90>
mock_storage = <shared.storage.memory.MemoryStorageService object at 0x7f41dda2ef30>
mock_redis = <worker.tasks.tests.unit.test_upload_task.FakeRedis object at 0x7f41dd2306b0>
celery_app = <Celery celery.tests at 0x7f41dd258cd0>

    def test_upload_task_call_new_ta_tasks(
        self,
        mocker,
        mock_configuration,
        dbsession,
        codecov_vcr,
        mock_storage,
        mock_redis,
        celery_app,
    ):
        chord = mocker.patch("tasks.upload.chord")
        _ = mocker.patch("tasks.upload.NEW_TA_TASKS.check_value", return_value=True)
        storage_path = ".../C3C4715CA57C910D11D5EB899FC86A7E/4c4e4654ac25037ae869caeb3619d485970b6304/a84d445c-9c1e-434f-8275-f18f1f320f81.txt"
        redis_queue = [{"url": storage_path, "build_code": "some_random_build"}]
        jsonified_redis_queue = [json.dumps(x) for x in redis_queue]
        mocker.patch.object(UploadTask, "app", celery_app)
    
        mock_repo_provider_service = AsyncMock()
        mock_repo_provider_service.get_commit.return_value = {
            "author": {
                "id": "123",
                "username": "456",
                "email": "789",
                "name": "101",
            },
            "message": "hello world",
            "parents": [],
            "timestamp": str(datetime.now()),
        }
        mock_repo_provider_service.get_ancestors_tree.return_value = {"parents": []}
        mock_repo_provider_service.get_pull_request.return_value = {
            "head": {"branch": "main"},
            "base": {},
        }
        mock_repo_provider_service.list_top_level_files.return_value = [
            {"name": "codecov.yml", "path": "codecov.yml"}
        ]
        mock_repo_provider_service.get_source.return_value = {
            "content": """
            codecov:
                max_report_age: 1y ago
            """
        }
    
        mocker.patch(
            "tasks.upload.get_repo_provider_service",
            return_value=mock_repo_provider_service,
        )
        mocker.patch("tasks.upload.hasattr", return_value=False)
        commit = CommitFactory.create(
            message="",
            commitid="abf6d4df662c47e32460020ab14abf9303581429",
            repository__owner__oauth_token="GHTZB+Mi+.../ubudnSKTJYb/fgN4hRJVJYSIErtidEsCLDJBb8DZzkbXqLujHAnv28aKShXddE/OffwRuwKug==",
            repository__owner__username="ThiagoCodecov",
            repository__owner__service="github",
            repository__yaml={"codecov": {"max_report_age": "1y ago"}},
            repository__name="example-python",
            pullid=1,
            # Setting the time to _before_ patch centric default YAMLs start date of 2024-04-30
            repository__owner__createstamp=datetime(2023, 1, 1, tzinfo=timezone.utc),
            branch="main",
        )
        dbsession.add(commit)
        dbsession.flush()
        dbsession.refresh(commit)
    
        mock_redis.lists[f"uploads/{commit.repoid}/{commit.commitid}/test_results"] = (
            jsonified_redis_queue
        )
    
>       UploadTask().run_impl(
            dbsession,
            commit.repoid,
            commit.commitid,
            report_type="test_results",
        )

.../tests/unit/test_upload_task.py:512: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
tasks/upload.py:353: in run_impl
    return self.run_impl_within_lock(
.../local/lib/python3.13.../site-packages/sentry_sdk/tracing_utils.py:673: in func_with_tracing
    return func(*args, **kwargs)
tasks/upload.py:535: in run_impl_within_lock
    self._bulk_insert_coverage_measurements(measurements=measurements)
tasks/upload.py:570: in _bulk_insert_coverage_measurements
    bulk_insert_coverage_measurements(measurements=measurements)
.../local/lib/python3.13.../shared/upload/utils.py:47: in bulk_insert_coverage_measurements
    with transaction.atomic():
.../local/lib/python3.13.../django/db/transaction.py:198: in __enter__
    if not connection.get_autocommit():
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <DatabaseWrapper vendor='postgresql' alias='default'>

    def get_autocommit(self):
        """Get the autocommit state."""
>       self.ensure_connection()
E       RuntimeError: Database access not allowed, use the "django_db" mark, or the "db" or "transactional_db" fixtures to enable it.

.../local/lib/python3.13.../backends/base/base.py:464: RuntimeError

To view more test analytics, go to the Test Analytics Dashboard
📢 Thoughts on this report? Let us know!

codecov-public-qa bot commented Dec 19, 2024

❌ 1 Tests Failed: same failure report as in the codecov-qa comment above.


github-actions bot commented Dec 19, 2024

✅ All tests successful. No failed tests were found.

📣 Thoughts on this report? Let Codecov know! | Powered by Codecov

tasks/upload.py Outdated
Comment on lines 660 to 662
arguments_list=list(chunk),
)
for chunk in itertools.batched(argument_list, CHUNK_SIZE)
Contributor

Not sure if we still want to run these in batches, or rather one upload per task?

Contributor Author

one upload per task seems reasonable now that we aren't writing to the db in the processor
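The batching under discussion can be sketched as below. This is a simplified illustration, not the PR's code: `make_batches` and the sample data are hypothetical, and a small stand-in for `itertools.batched` (which requires Python 3.12+) is included so the sketch is self-contained. Setting `chunk_size=1` is exactly the "one upload per task" alternative the reviewer suggests.

```python
CHUNK_SIZE = 3  # hypothetical batch size; the real value is defined in tasks/upload.py


def batched(iterable, n):
    # Minimal stand-in for itertools.batched (available in Python 3.12+):
    # yields tuples of up to n items, the last one possibly shorter.
    items = list(iterable)
    for i in range(0, len(items), n):
        yield tuple(items[i : i + n])


def make_batches(argument_list, chunk_size=CHUNK_SIZE):
    # The upload task builds one processor-task signature per batch;
    # with chunk_size=1 this degenerates to one upload per task.
    return [list(chunk) for chunk in batched(argument_list, chunk_size)]


uploads = [f"upload-{i}" for i in range(7)]
print([len(b) for b in make_batches(uploads)])  # → [3, 3, 1]
```

With the processors no longer writing to the database (as noted above), per-upload tasks trade a few extra task dispatches for simpler failure isolation.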


def test_test_analytics(dbsession, mocker, celery_app):
url = "literally/whatever"
storage_service = get_appropriate_storage_service(None)
Contributor

maybe you want to use the mock storage provider for this?

Comment on lines 110 to 116
mocker.patch.object(TAProcessorTask, "app", celery_app)
mocker.patch.object(TAFinisherTask, "app", celery_app)

hello = celery_app.register_task(ProcessFlakesTask())
_ = celery_app.tasks[hello.name]
goodbye = celery_app.register_task(CacheTestRollupsTask())
_ = celery_app.tasks[goodbye.name]
Contributor

I have never seen this pattern, what does it do?

Contributor Author

Without this, when the finisher tried to call those tasks, they weren't registered in the mocked Celery app, so what I was trying to do here was add them to the mocked Celery app.

I replaced this with some code that is hopefully clearer.
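The lookup failure described above comes down to Celery resolving tasks by name through each app instance's own registry. The sketch below uses a plain-Python stand-in for that registry rather than Celery itself; `TaskRegistry` and the task name are hypothetical, but the register-then-look-up-by-name shape mirrors what `celery_app.register_task(...)` does in the test.

```python
class TaskRegistry:
    """Minimal stand-in for a Celery app's task registry (app.tasks):
    tasks are resolved by name, so a task never registered with *this*
    app instance simply isn't there when a caller looks it up."""

    def __init__(self):
        self.tasks = {}

    def register_task(self, task):
        self.tasks[task.name] = task
        return task


class ProcessFlakesTask:
    name = "app.tasks.process_flakes"  # hypothetical task name

    def run(self, repoid):
        return f"flakes processed for repo {repoid}"


app = TaskRegistry()  # stands in for the mocked celery_app fixture

# Before registration the finisher's lookup would fail with a KeyError:
assert "app.tasks.process_flakes" not in app.tasks

registered = app.register_task(ProcessFlakesTask())
print(app.tasks[registered.name].run(42))  # → flakes processed for repo 42
```

This is why the test had to register `ProcessFlakesTask` and `CacheTestRollupsTask` on the mocked app: the finisher resolves them by name at call time.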

user-agent:
- Default
method: GET
uri: https://api.github.com/repos/ThiagoCodecov/example-python/commits/abf6d4df662c47e32460020ab14abf9303581429
Contributor

can you rather mock away whatever call does this request, instead of relying on vcr?

Comment on lines 254 to 256
for upload in uploads:
repo_flag_ids = get_repo_flag_ids(db_session, repoid, upload.flag_names)
if upload.state == "processed":
Contributor

this loop has a couple of problems:

  • you are querying all uploads from the DB, but only ever run the code on processed ones
  • you only append to tests_to_write and friends, but never clear those across uploads
  • save_tests and friends run for all the uploads; together with never clearing tests_to_write above, this means you insert the same tests over and over again, depending on how many total uploads you have
  • you unconditionally set state = "finished" for all the uploads, including ones that already have that state
  • the intermediate msgpack file is never cleared.
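A loop restructured along the lines of those review points might look like this. It is a simplified sketch, not the PR's code: the upload dicts and state strings are stand-ins for the real models, and the accumulator handling (scoped per iteration, processed-only filtering, single state transition) is the point being illustrated.

```python
def finish_uploads(uploads):
    """Process only uploads in the 'processed' state, keep accumulators
    scoped so nothing carries over between uploads, and transition state
    exactly once (processed -> finished)."""
    written = []
    for upload in uploads:
        if upload["state"] != "processed":
            continue  # skip instead of fetching-then-ignoring in the loop body
        tests_to_write = upload["tests"]  # fresh each iteration: no cross-upload leakage
        written.extend(tests_to_write)
        upload["state"] = "finished"  # only set on uploads that were processed
    return written


uploads = [
    {"state": "processed", "tests": ["t1", "t2"]},
    {"state": "errored", "tests": ["t3"]},
    {"state": "processed", "tests": ["t4"]},
]
print(finish_uploads(uploads))  # → ['t1', 't2', 't4']
```

In the real task the processed-only filter would ideally live in the DB query itself, and the intermediate msgpack file would be deleted once its contents are persisted.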

@joseph-sentry
Contributor Author

@Swatinem sorry i got confused and rebased and force pushed but i really just added 5 new commits on top of the existing ones and didn't modify any of the existing ones


This PR includes changes to shared. Please review them here: https://github.com/codecov/shared/compare/2674ae99811767e63151590906691aed4c5ce1f9...

Comment on lines +25 to 26
report__commit__repository__repoid=repo_id,
report__commit__commitid=commit_id,
Contributor

I believe commit_id already uniquely identifies the report, so no need for an additional repository join.

tasks/tests/unit/test_upload_task.py (resolved)
tasks/tests/unit/test_ta_processor_task.py (resolved)
services/ta_finishing.py (outdated, resolved)
services/ta_finishing.py (outdated, resolved)

github-actions bot commented Jan 9, 2025

This PR includes changes to shared. Please review them here: https://github.com/codecov/shared/compare/609e56d2aa30b26d44cddaba0e1ebd79ba954ac9...

@@ -1,6 +0,0 @@
import os
Contributor

I’m 👍🏻 on removing these if they won’t ever be used again, but it’s probably best to do that in a separate PR.


This PR includes changes to shared. Please review them here: https://github.com/codecov/shared/compare/609e56d2aa30b26d44cddaba0e1ebd79ba954ac9...

this commit essentially does 3 things:
- creates the new ta_processor and ta_finisher tasks
  - the difference between these tasks and the old ones is that these
    ones use upload states differently
  - these ones also use the TA storage module to persist data to BQ
- updates the version of the test results parser being used
  - we've gone from parsing individual JUnit XML files to parsing the
    entire raw upload at once
- creates the ta_storage module
  - the ta_storage module serves as an abstraction for persisting data
    to both PG and BQ
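The ta_storage abstraction the commit message describes can be sketched as a driver interface with one implementation per backend. This is a hedged illustration only: the `TADriver` interface, the driver class names, and the in-memory `rows` lists are hypothetical stand-ins for the real Postgres and BigQuery writers.

```python
from abc import ABC, abstractmethod


class TADriver(ABC):
    """Hypothetical backend interface: one implementation per datastore."""

    @abstractmethod
    def write_testruns(self, testruns: list[dict]) -> None: ...


class PGDriver(TADriver):
    def __init__(self):
        self.rows = []

    def write_testruns(self, testruns):
        self.rows.extend(testruns)  # stands in for a Postgres bulk insert


class BQDriver(TADriver):
    def __init__(self):
        self.rows = []

    def write_testruns(self, testruns):
        self.rows.extend(testruns)  # stands in for a BigQuery streaming insert


def persist(testruns, drivers):
    # The processor fans one parsed batch out to every configured backend,
    # so callers never care which datastores are behind the abstraction.
    for driver in drivers:
        driver.write_testruns(testruns)


pg, bq = PGDriver(), BQDriver()
persist([{"name": "test_a", "outcome": "pass"}], [pg, bq])
print(len(pg.rows), len(bq.rows))  # → 1 1
```

Keeping the backends behind one interface is what lets the new tasks dual-write to PG and BQ during the migration and later drop one backend without touching task code.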

This PR includes changes to shared. Please review them here: https://github.com/codecov/shared/compare/de4b37bc5a736317c6e7c93f9c58e9ae07f8c96b...

@joseph-sentry joseph-sentry added this pull request to the merge queue Jan 15, 2025
Merged via the queue into main with commit 2c7bd18 Jan 15, 2025
18 of 27 checks passed
@joseph-sentry joseph-sentry deleted the joseph/new-ta-tasks branch January 15, 2025 20:33
2 participants