From 01581ac054317d8ea38aa4f637eaf8a629de8d9c Mon Sep 17 00:00:00 2001
From: Graham Lee
Date: Mon, 28 Feb 2022 15:40:08 +0000
Subject: [PATCH] Document the new script #1564

---
 ingestion/monitoring/README.md | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/ingestion/monitoring/README.md b/ingestion/monitoring/README.md
index 7e4f5d395..14d34caf6 100644
--- a/ingestion/monitoring/README.md
+++ b/ingestion/monitoring/README.md
@@ -1,3 +1,20 @@
+# Error monitoring
+
+The `errorLogsToSlack.py` script reads log messages from a given CloudWatch log stream
+and posts any errors to Slack. It has three inputs, all passed via the environment:
+
+ - `SLACK_WEBHOOK` is the webhook URL used to post messages to Slack.
+ - `INGESTION_LOG_GROUP` is the CloudWatch log group name.
+ - `INGESTION_LOG_STREAM` is the CloudWatch log stream name.
+
+Typically, all three would be set up by EventBridge in AWS when the script is run in Batch.
+
+## To set up for a new instance
+
+1. See https://api.slack.com/messaging/webhooks for details on creating a Slack app and enabling webhooks.
+2. Change the Slack user IDs in the script to ones that represent users in your workspace (the people who should be notified of ingestion errors).
+3. Deploy the script to Batch.
+
 # Data monitoring
 
 Data monitoring scripts, currently there's a script to alert daily about