Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: create cache rollup cron task #857

Merged
merged 12 commits into from
Nov 19, 2024
Merged

Conversation

joseph-sentry
Copy link
Contributor

this is a task that runs nightly and queues up cache test rollups tasks
based on the content of the last_rollup_date table.

basically if a rollup exists that was valid yesterday, we want to
recalculate the aggregates for the next day. Eventually the branch's
rollups become so out of date that we don't care about recalculating
the aggregates anymore

depends on: #821

this commit does 2 things:
- creates the cache test rollups task which runs the test rollups
  aggregation SQL and updates the redis cache with the results in
  pyarrow ipc format for that specific branch and interval. For the
  default branch of the repo then that cache entry won't be expired
  else it will be expired after 3 days.
- triggers this task in the test results finisher
- adds tests for the above
this commit switches the cache test rollups task to using polars instead
of pyarrow. It seems polars uses pyarrow internally anyways but we want
to use polars in the api for additional filtering and ordering of the
results we calculate and store in the cache task, so for parity with the
api we use polars here as well.
we want to have an async task that will get the cached rollup aggregates
from GCS and put them in redis for an hour.

the reason this is async is because we don't want to spend too much time
uploading to redis in the API call
this is a task that runs nightly and queues up cache test rollups tasks
based on the content of the last_rollup_date table.

basically if a rollup exists that was valid yesterday, we want to
recalculate the aggregates for the next day. Eventually the branch's
rollups become so out of date that we don't care about recalculating
the aggregates anymore
Copy link

sentry-io bot commented Nov 6, 2024

🔍 Existing Issues For Review

Your pull request is modifying functions with the following pre-existing issues:

📄 File: tasks/test_results_finisher.py

Function Unhandled Issue
process_impl_within_lock AttributeError: 'TestResultsFinisherTask' object has no attribute 'schedule_new_user_activated_task' ...
Event Count: 1

Did you find this useful? React with a 👍 or 👎

Copy link

github-actions bot commented Nov 6, 2024

This PR includes changes to shared. Please review them here: codecov/shared@83b3376...bf285c7

Copy link

codecov bot commented Nov 6, 2024

Codecov Report

Attention: Patch coverage is 98.61111% with 1 line in your changes missing coverage. Please review.

Project coverage is 98.01%. Comparing base (03c73da) to head (d5f9c17).
Report is 2 commits behind head on main.

✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
celery_config.py 66.66% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #857      +/-   ##
==========================================
- Coverage   98.06%   98.01%   -0.05%     
==========================================
  Files         444      446       +2     
  Lines       35474    35608     +134     
==========================================
+ Hits        34786    34902     +116     
- Misses        688      706      +18     
Flag Coverage Δ
integration 41.99% <43.05%> (+0.05%) ⬆️
unit 90.88% <98.61%> (+0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@codecov-staging
Copy link

codecov-staging bot commented Nov 6, 2024

Codecov Report

Attention: Patch coverage is 98.61111% with 1 line in your changes missing coverage. Please review.

✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
celery_config.py 66.66% 1 Missing ⚠️

📢 Thoughts on this report? Let us know!

@codecov-qa
Copy link

codecov-qa bot commented Nov 6, 2024

❌ 12 Tests Failed:

Tests completed Failed Passed Skipped
12 12 0 0
View the top 3 failed tests by shortest run time
 tasks.tests.integration.test_ghm_sync_plans
Stack Traces | 0s run time
No failure message available
 tasks.tests.integration.test_status_set_pending_task
Stack Traces | 0s run time
No failure message available
 tasks.tests.integration.test_timeseries_backfill
Stack Traces | 0s run time
No failure message available

To view more test analytics, go to the Test Analytics Dashboard
Got feedback? Let us know on Github

Copy link

Test Failures Detected: Due to failing tests, we cannot provide coverage reports at this time.

❌ Failed Test Results:

Completed 12 tests with 12 failed, 0 passed and 0 skipped.

View the full list of failed tests

pytest

  • Class name:
    Test name: services.tests.test_repository_service

    No failure message available
  • Class name:
    Test name: tasks.tests.integration.test_ai_pr_review

    No failure message available
  • Class name:
    Test name: tasks.tests.integration.test_ghm_sync_plans

    No failure message available
  • Class name:
    Test name: tasks.tests.integration.test_http_request_task

    No failure message available
  • Class name:
    Test name: tasks.tests.integration.test_notify_task

    No failure message available
  • Class name:
    Test name: tasks.tests.integration.test_send_email_task

    No failure message available
  • Class name:
    Test name: tasks.tests.integration.test_status_set_error_task

    No failure message available
  • Class name:
    Test name: tasks.tests.integration.test_status_set_pending_task

    No failure message available
  • Class name:
    Test name: tasks.tests.integration.test_sync_pull

    No failure message available
  • Class name:
    Test name: tasks.tests.integration.test_timeseries_backfill

    No failure message available
  • Class name:
    Test name: tasks.tests.integration.test_upload_e2e

    No failure message available
  • Class name:
    Test name: tasks.tests.unit

    No failure message available

@joseph-sentry joseph-sentry marked this pull request as ready for review November 18, 2024 20:51
@joseph-sentry joseph-sentry requested a review from a team November 18, 2024 20:51
Copy link

✅ All tests successful. No failed tests were found.

📣 Thoughts on this report? Let Codecov know! | Powered by Codecov

@@ -163,6 +166,16 @@ def run_impl_within_lock(self, repoid, branch):
)

serialized_table = df.write_ipc(None)

storage_service.write_file("codecov", storage_key, serialized_table)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you already call write_file down below, so I’m pretty sure this is misplaced. also you hardcode the codecov bucket which is wrong.

Comment on lines 27 to 30
self.app.tasks[cache_test_rollups_task_name].create_signature(
args=None,
kwargs=dict(repoid=repo.repoid, branch=branch, update_date=False),
).apply_async()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
self.app.tasks[cache_test_rollups_task_name].create_signature(
args=None,
kwargs=dict(repoid=repo.repoid, branch=branch, update_date=False),
).apply_async()
self.app.tasks[cache_test_rollups_task_name].s(repoid=repo.repoid, branch=branch, update_date=False).apply_async()

TBH, its really weird how celery names this function just s, but its the version of the function where you give the arguments just as you would, without explicitly spelling out the kwargs.

use celery s function instead of create_signature and remove unnecessary
write_file call
@joseph-sentry joseph-sentry added this pull request to the merge queue Nov 19, 2024
Merged via the queue into main with commit a3ca1d9 Nov 19, 2024
20 of 27 checks passed
@joseph-sentry joseph-sentry deleted the joseph/nightly-cache-cron branch November 19, 2024 22:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants