Control scheduled harvest jobs to run at night time #5009

Open
1 task
FuhuXia opened this issue Dec 9, 2024 · 1 comment

Comments

@FuhuXia
Member

FuhuXia commented Dec 9, 2024

User Story

To minimize impact on the Catalog frontend, the data.gov team wants scheduled harvest jobs to run at night, between 2 and 3 am ET.

Scheduled jobs are supposed to run at night, but because of the 15-minute job run interval, each harvest job starts a few minutes later than its previous run, and nightly jobs eventually drift into business hours. This causes catalog frontend performance and user experience problems, such as Solr downtime.

As of now, most jobs run around 9 am.

Acceptance Criteria


  • A NewRelic query for space_name:prod app_name:catalog-gather shows that peak activity happens at 2-3 am ET
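
One way to make this criterion demoable is to bucket job start times by hour in Eastern Time and check where the peak falls. The sketch below assumes the start timestamps have already been exported from NewRelic as timezone-aware UTC datetimes; the export step is not shown, and `peak_hour_et` and the sample data are hypothetical, not part of any existing tooling.

```python
from collections import Counter
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

EASTERN = ZoneInfo("America/New_York")


def peak_hour_et(start_times_utc):
    """Return the hour of day (0-23, Eastern Time) with the most job starts."""
    hours = Counter(t.astimezone(EASTERN).hour for t in start_times_utc)
    hour, _count = hours.most_common(1)[0]
    return hour


# Made-up sample (December, so ET is UTC-5): three starts at 2-3 am ET, one at 9 am ET.
sample = [
    datetime(2024, 12, 10, 7, 5, tzinfo=timezone.utc),   # 02:05 ET
    datetime(2024, 12, 10, 7, 20, tzinfo=timezone.utc),  # 02:20 ET
    datetime(2024, 12, 10, 7, 50, tzinfo=timezone.utc),  # 02:50 ET
    datetime(2024, 12, 10, 14, 0, tzinfo=timezone.utc),  # 09:00 ET
]

# The acceptance criterion holds when the peak hour is 2 (the 2-3 am ET bucket).
print(peak_hour_et(sample) == 2)
```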

Background

Research shows that a weekly job's start time shifted from 9 am to 5 pm (8 hours) over a one-year period. Daily jobs are expected to shift faster.
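
As a rough back-of-the-envelope check of the numbers above, the sketch below converts the observed 8-hour yearly shift into an average drift per run, then extrapolates to a daily job. The extrapolation assumes a daily job drifts by the same amount per run as the weekly one, which has not been measured.

```python
# Observed for a weekly job: start time moved from 9 am to 5 pm in one year.
SHIFT_HOURS_PER_YEAR = 8
WEEKLY_RUNS_PER_YEAR = 52
DAILY_RUNS_PER_YEAR = 365

# Average drift per run, in minutes.
drift_min_per_run = SHIFT_HOURS_PER_YEAR * 60 / WEEKLY_RUNS_PER_YEAR

# Assumption: a daily job drifts by the same amount on every run.
daily_shift_hours_per_year = drift_min_per_run * DAILY_RUNS_PER_YEAR / 60

# Days for a daily job to slide from 2 am to 9 am (7 hours) under that assumption.
days_2am_to_9am = (9 - 2) * 60 / drift_min_per_run

print(round(drift_min_per_run, 1))           # 9.2
print(round(daily_shift_hours_per_year, 1))  # 56.2
print(round(days_2am_to_9am, 1))             # 45.5
```

Under that assumption, a daily job would wrap around the clock more than twice a year, so the per-run drift is worth measuring directly before choosing how often to correct it.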

Security Considerations (required)

[Any security concerns that might be implicated in the change. "None" is OK, just be explicit here!]

Sketch

If we correct the job start times manually, we will need to repeat the step regularly, say every 6 months.

@btylerburton
Contributor

Determine which property in the CKAN database holds the time of the next harvest, then reset all of them to 02:00 EST via a SQL command.
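
Since the property holding the next harvest time has not been identified yet, no SQL is shown here. What can be sketched is the target value: the next 02:00 expressed in UTC, which is usually what a database timestamp needs. The helper `next_2am_eastern` is hypothetical, and it assumes "02:00" means America/New_York wall-clock time (which follows daylight saving) rather than a fixed UTC-5 offset.

```python
from datetime import datetime, time, timedelta, timezone
from zoneinfo import ZoneInfo

EASTERN = ZoneInfo("America/New_York")


def next_2am_eastern(now_utc):
    """Return the next 02:00 New York wall-clock time as a UTC datetime."""
    now_et = now_utc.astimezone(EASTERN)
    candidate = datetime.combine(now_et.date(), time(2, 0), tzinfo=EASTERN)
    if candidate <= now_et:
        # 02:00 has already passed today in ET, so use tomorrow.
        next_day = now_et.date() + timedelta(days=1)
        candidate = datetime.combine(next_day, time(2, 0), tzinfo=EASTERN)
    return candidate.astimezone(timezone.utc)


# Example: 10:00 ET on Dec 9, 2024 (15:00 UTC) gives 02:00 ET on Dec 10 (07:00 UTC).
now = datetime(2024, 12, 9, 15, 0, tzinfo=timezone.utc)
print(next_2am_eastern(now).isoformat())  # 2024-12-10T07:00:00+00:00
```

The resulting timestamp could then be used as the value in the reset statement once the right column is known.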
