Add actionable logging to patching runs #2009

melange396 · 2024-07-30T16:57:25Z

Patch runs are very similar to regular indicator runs, but they serve different purposes and are not run on a schedule. We should include information in our logs to signal when these runs are happening. This additional info can then be incorporated into monitoring and alerting systems to distinguish normal activity from patching activity, which will let us see when aberrations are due to patching.

The format of the logging additions is yet to be determined (New, additional log messages? New parameters on existing log messages? Both? Something else???), but it should be done in a way that is easily integrable into elastic and such.
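As a rough illustration of the "new parameters on existing log messages" option, here is a minimal sketch using only the Python standard library. The field name `run_type`, its values, and the helper `make_logger` are hypothetical, chosen for this example only; the actual format is still undecided. Emitting one JSON object per line is one way to keep the field easy to index in elastic.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line."""

    def format(self, record):
        payload = {
            "level": record.levelname,
            "event": record.getMessage(),
            # "run_type" is a hypothetical field name, not an existing convention.
            "run_type": getattr(record, "run_type", "scheduled"),
        }
        return json.dumps(payload)


def make_logger(name, stream, is_patch):
    """Build a logger that tags every message with the kind of run."""
    logger = logging.getLogger(name)
    logger.handlers.clear()
    logger.propagate = False
    logger.setLevel(logging.INFO)
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    # LoggerAdapter attaches the extra field to every message it emits.
    run_type = "patch" if is_patch else "scheduled"
    return logging.LoggerAdapter(logger, {"run_type": run_type})


buffer = io.StringIO()
log = make_logger("indicator_demo", buffer, is_patch=True)
log.info("starting indicator run")
print(buffer.getvalue().strip())
```

With a field like this on every message, a dashboard or alert could filter on it directly instead of matching on message text.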

melange396 · 2024-07-31T21:13:45Z

The "acquisition" step of patching runs can potentially be detected with the log message:
logger.info(event='processing csv files from issue'...
found at https://github.com/cmu-delphi/delphi-epidata/blob/8746ff2ef7a936bb93628bc1358471d7c6c4f5f8/src/acquisition/covidcast/csv_importer.py#L128

This works because patching runs need to put CSV files in a specific directory structure to specify the "issue" date for import (instead of the default "today"), and that is where that log message is emitted. This will likely only work until the following ticket is addressed, after which all indicators will supply acquisition with specific "issue" dates:

Indicator runners should output files with issue date #1907
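The detection idea above could be sketched as a filter over log lines. This assumes the acquisition log is written as one JSON object per line with an `event` key; that layout and the other field names in the sample lines are assumptions for illustration, not confirmed details of the real logs.

```python
import json

# Event text taken from the log call in csv_importer.py referenced above.
PATCH_EVENT = "processing csv files from issue"


def is_patch_acquisition(line):
    """Return True if a log line looks like the issue-specific import message."""
    try:
        record = json.loads(line)
    except json.JSONDecodeError:
        return False
    return isinstance(record, dict) and record.get("event") == PATCH_EVENT


# Made-up sample lines; every field other than "event" is hypothetical.
sample_lines = [
    '{"event": "processing csv files from issue", "issue": "20240701"}',
    '{"event": "some other acquisition message"}',
    "not a json line",
]

patch_lines = [line for line in sample_lines if is_patch_acquisition(line)]
print(len(patch_lines))
```

As noted above, this heuristic would stop distinguishing patch runs once #1907 is addressed, since every run would then emit the same message.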

minhkhul · 2024-08-13T22:42:59Z

Here's the plan to add just the patch acquisition log to elastic:
Currently, our normal indicator acquisition jobs log to /var/log/epidata/csv_upload_{acq_ind_name}.log
That log content then gets picked up by filebeat as configured here and made available on elastic through this ingest pipeline.
Right now, patch acquisition logs to:
/var/log/filebeat-pickup/epidata.acquisition.covidcast.csv_to_database_batch-issue-upload-$(date -u +"%Y-%m-%dT%H_%M_%SZ").log
Therefore, all that has to be done to add the patch acquisition log to elastic is to change patch acquisition to log to /var/log/epidata/csv_upload_patch.log in the Acquisition cronicle job, and rely on the current processes to pick up the logs as usual.

To test this (and potentially other later stuff), I'm gonna set up patch acquisition log pickup to elastic on staging:

Uncomment this.
Adjust current dashboards that care only about prod data to ignore staging.
Then check how things go on staging with some fake patch data and these jobs.

melange396 · 2024-08-13T23:04:08Z

if you want to be sure to keep things out of other dashboards for testing purposes, instead of just uncommenting the pipeline in the staging filebeat config, change its name to filebeat-epidata-pipeline-staging and create a matching ingest pipeline with a new target_field, like "epidata_data__test"

minhkhul · 2024-09-05T23:31:02Z

Switching patch logging to be output to /var/log/epidata/batch_issue_upload.log instead of to /var/log/filebeat-pickup/epidata.acquisition.covidcast.csv_to_database_batch-issue-upload-$(date -u +"%Y-%m-%dT%H_%M_%SZ").log. This is so patch acquisition logs could be processed under the same pipeline as normal acquisition logs on elastic, which makes it easier for patch acquisition info to be seen on dashboards.
Tested the change on staging and the logs showed up as expected on elastic, so applying this to prod.
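To make the difference between the two destinations concrete, here is a small sketch that reproduces the old timestamped filename pattern next to the new fixed path. The helper name `old_log_path` and the example timestamps are made up for illustration; the paths and the date format come from the comments above.

```python
from datetime import datetime, timezone

OLD_DIR = "/var/log/filebeat-pickup"
OLD_PREFIX = "epidata.acquisition.covidcast.csv_to_database_batch-issue-upload-"
NEW_PATH = "/var/log/epidata/batch_issue_upload.log"


def old_log_path(now):
    """Mirror the shell pattern $(date -u +"%Y-%m-%dT%H_%M_%SZ") used in the old filename."""
    stamp = now.astimezone(timezone.utc).strftime("%Y-%m-%dT%H_%M_%SZ")
    return f"{OLD_DIR}/{OLD_PREFIX}{stamp}.log"


# Two hypothetical run times: the old scheme creates a new file per run.
first = old_log_path(datetime(2024, 9, 5, 23, 31, 2, tzinfo=timezone.utc))
second = old_log_path(datetime(2024, 9, 6, 1, 0, 0, tzinfo=timezone.utc))

print(first)
print(second)
print(NEW_PATH)  # the new scheme writes every run to this one file
```

The old scheme produced a different file for every run, while the new one sends all patch acquisition runs to a single file in the same directory as the normal acquisition logs.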

next steps: address #1907

melange396 added the data quality, enhancement, future-solution, and devops labels Jul 30, 2024

minhkhul self-assigned this Aug 12, 2024
