Issue 1366 - Slack integrations [ Docker Events ] #1472

PedroFonsecaDEV · 2020-12-01T08:23:42Z

Issue This PR Addresses

This is the beginning of the implementation related to #1366.

Type of Change


Bugfix: Change which fixes an issue
New Feature: Change which adds functionality
Documentation Update: Change which improves documentation
UI: Change which improves UI

Description

I created a new workspace to test it, and it's working. This is just a test, a draft.

This feature will send a message to our Slack channel whenever our servers go down and when they come back up, so we can have a live check on our servers (prod and dev). It doesn't rely on user commands; it will always be running.
This one covers just production.

We will have Slack commands, but for now, I believe this feature is more important.

Checklist

Quality: This PR builds and passes our npm test and works locally
Tests: This PR includes thorough tests or an explanation of why it does not
Screenshots: This PR includes screenshots or GIFs of the changes made or an explanation of why it does not (if applicable)
Documentation: This PR includes updated/added documentation to user exposed functionality or configuration variables are added/changed or an explanation of why it does not (if applicable)

humphd

Nice work, this is going to be very cool.

I agree that this polling technique (doing a request every 5s to determine server health) is not ideal. What about a backwards approach? Two ideas:

First, we have code in Telescope to manage shutdowns, both graceful and error-based, see https://github.com/Seneca-CDOT/telescope/blob/736eb97c754047fc8b159d11a0216106fcb86965/src/backend/lib/shutdown.js. We could modify this code to POST to the autodeployment server with details about why we are shutting down. Then this code could be converted to a pure webhook, along the same lines as we do now from GitHub.

Second, I wonder if we could do this directly from pm2/docker-container. I'm not sure of the details, but I suspect that we can hook into events in our process manager and do something similar when the Telescope app goes down.
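
A minimal sketch of the first idea, under stated assumptions: the `/shutdown` route on the autodeployment server, the port, and the payload field names are all hypothetical, and `post` is an injected function so the HTTP client (node-fetch, got, or something else) can be chosen later. It is not taken from Telescope's shutdown.js.

```javascript
// Sketch only: the URL and payload fields are assumptions, not existing Telescope APIs.
const AUTODEPLOYMENT_URL = 'http://localhost:4000/shutdown'; // hypothetical route

// Build the JSON body describing why the process is going down.
function buildShutdownPayload(reason, error) {
  return {
    reason, // e.g. 'SIGTERM' or 'uncaughtException'
    message: error ? String(error.message || error) : null,
    timestamp: new Date().toISOString(),
  };
}

// `post(url, body)` is injected and must return a Promise.
// Failures are swallowed so that reporting never blocks the shutdown itself.
async function notifyShutdown(post, reason, error) {
  try {
    await post(AUTODEPLOYMENT_URL, buildShutdownPayload(reason, error));
    return true;
  } catch (err) {
    return false;
  }
}
```

The shutdown code would call `notifyShutdown` just before exiting, and the autodeployment server would turn the received payload into a Slack message.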

@manekenpix will have opinions as well, which I'd like to hear.

humphd · 2020-12-01T14:52:53Z

tools/autodeployment/package.json

@@ -4,6 +4,7 @@
  "description": "A tool for automatic deployment",
  "main": "server.js",
  "dependencies": {
+    "axios": "0.21.0",


axios isn't very well maintained. Let's use one of node-fetch or got

humphd · 2020-12-01T14:53:37Z

tools/autodeployment/env.example

@@ -16,3 +16,6 @@ UNSPLASH_CLIENT_ID=

 # Path to certificates
 PATH_TO_CERTS=
+
+#slack message webhook


Space after # and let's give a more complete comment about what this is. A URL? What format? Give an example, link to docs one might need, etc.

humphd · 2020-12-01T14:56:36Z

tools/autodeployment/slack_app/index.js

@@ -0,0 +1,54 @@
+require('dotenv').config();


We only need to do this once per app, and ideally as early as possible. We already have it in https://github.com/Seneca-CDOT/telescope/blob/master/tools/autodeployment/server.js#L1, so you can assume the env is populated already and just use it.

humphd · 2020-12-01T14:59:00Z

tools/autodeployment/slack_app/index.js

+    let prodResponse = {};
+
+    try {
+      prodResponse = await axios.get('https://telescope.cdot.systems/posts', {


We have a health check API endpoint for this, see https://telescope.cdot.systems/health.
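
A minimal sketch of a check against the health endpoint instead of `/posts`. The helper name `isHealthy` is hypothetical, and `get` is an injected HTTP function assumed to resolve to an object with a numeric `status` (as a fetch-style client does).

```javascript
// Sketch only: `get` is injected so no particular HTTP library is assumed.
const HEALTH_URL = 'https://telescope.cdot.systems/health';

async function isHealthy(get, url = HEALTH_URL) {
  try {
    const res = await get(url);
    // Any 2xx status counts as "up".
    return res.status >= 200 && res.status < 300;
  } catch (err) {
    // Network errors (DNS failure, refused connection, timeout) count as "down".
    return false;
  }
}
```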

manekenpix · 2020-12-01T23:24:28Z

tools/autodeployment/server.js

@@ -7,6 +7,7 @@ const mergeStream = require('merge-stream');
 const fs = require('fs');

 const { buildStart, buildStop, handleStatus } = require('./info');
+const { prodStatus } = require('./slack_app');


We use the same code for prod and dev, so you can name this status or telescopeStatus.

manekenpix · 2020-12-01T23:28:42Z

tools/autodeployment/slack_app/index.js

+const prodStatus = () => {
+  let alarm = false;
+
+  setInterval(async () => {


If we end up keeping this, I think it's better to use setInterval Async, like we do in the wiki-feed-parser.
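
The concern with a plain `setInterval` and an async callback is that a slow run can overlap with the next one. The sketch below is a hand-written illustration of the non-overlapping idea using a recursive `setTimeout`; it is not the API of any package used elsewhere in Telescope, and `startLoop` is a hypothetical name.

```javascript
// Run `task` repeatedly, waiting `delayMs` after each run finishes before starting the next.
function startLoop(task, delayMs) {
  let stopped = false;
  let timer = null;

  const run = async () => {
    try {
      await task();
    } catch (err) {
      // A failed run should not end the loop; real code would log this.
    }
    if (!stopped) timer = setTimeout(run, delayMs);
  };

  run();

  // The returned function stops the loop; a run already in progress is allowed to finish.
  return function stop() {
    stopped = true;
    clearTimeout(timer);
  };
}
```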

manekenpix · 2020-12-01T23:29:23Z

tools/autodeployment/slack_app/index.js

+        const downDate = new Date().toTimeString();
+        axios.post(SLACK_MESSAGE_HOOK, {
+          response_type: 'in_channel',
+          text: `\n:rotating_light: PRODUCTION IS DOWN. :rotating_light:\n`,


Here, instead of hardcoding Prod or Dev, you can use DEPLOY_TYPE's value to know which server is running this code.
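
A minimal sketch of that suggestion, assuming `DEPLOY_TYPE` holds a short name such as `production` or `staging` (the exact values are an assumption). The function name `buildDownMessage` is hypothetical.

```javascript
// Assumption: DEPLOY_TYPE is a short string such as 'production' or 'staging'.
function buildDownMessage(deployType = process.env.DEPLOY_TYPE) {
  const name = (deployType || 'unknown server').toUpperCase();
  return {
    response_type: 'in_channel',
    text: `\n:rotating_light: ${name} IS DOWN. :rotating_light:\n`,
  };
}
```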

manekenpix · 2020-12-01T23:40:28Z

I like the first approach suggested by @humphd, but I'd still keep the automated check (not every 5 secs, but maybe every 5 or 10 min) so we get an alert in those weird cases in which, after a successful build, the front-end works but the back-end crashes at launch.
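
A periodic check only needs to alert when the status changes, otherwise Slack would get a message on every tick while the server stays down. The sketch below shows that transition logic; `check` and `send` are injected, hypothetical functions, and the interval constant simply mirrors the "every 5 or 10 min" suggestion.

```javascript
// Decide whether a status change should produce a Slack message.
// Returns 'down', 'up', or null when nothing changed.
function nextAlert(wasUp, isUp) {
  if (wasUp && !isUp) return 'down';
  if (!wasUp && isUp) return 'up';
  return null;
}

// Assumption: five minutes, taken from the suggestion above.
const CHECK_INTERVAL_MS = 5 * 60 * 1000;

// `check` resolves to a boolean, `send` delivers the alert; both are injected.
function createMonitor(check, send) {
  let wasUp = true; // assume healthy until the first failed check
  return async function tick() {
    const isUp = await check();
    const alert = nextAlert(wasUp, isUp);
    wasUp = isUp;
    if (alert) await send(alert);
    return alert;
  };
}
```

The returned `tick` function would then be scheduled every `CHECK_INTERVAL_MS` by whichever timer mechanism the server ends up using.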

PedroFonsecaDEV · 2020-12-02T01:12:25Z

@humphd What about checking whether our backend is up from the host OS (as root)? Something like a cron job checking the backend container. What do you think?

humphd · 2020-12-02T02:11:43Z

Apparently we can get events from docker-compose itself:

PedroFonsecaDEV · 2020-12-02T02:19:32Z

@humphd okay cool. I will try pm2/docker.
Thanks.

humphd · 2020-12-02T14:18:42Z

Another thought I had this morning: dev could monitor prod, and vice versa, like siblings checking in on each other periodically. This would be in addition to the internal checks, basically seeing if the backend is still online.

manekenpix · 2020-12-03T01:04:37Z

A bit of research on docker events: a bash script that processes docker events and filters the ones including start or stop.
I think it's a good start for testing. This approach can be used to send messages to Slack depending on emitted events.

#!/bin/bash
# Print only the docker events whose text contains "start" or "stop".
docker events | while read -r line; do
  if [[ ${line} = *"start"* || ${line} = *"stop"* ]]; then
    echo " Event: ${line}"
  fi
done
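
Since the autodeployment server is a Node app, the same filter can be sketched in JavaScript as a pure function over the text that `docker events` prints. The function names are hypothetical. Like the bash version, this is a substring match, so a line containing "restart" also passes.

```javascript
// Same test as the bash loop: does this line mention "start" or "stop"?
function isStartOrStop(line) {
  return line.includes('start') || line.includes('stop');
}

// Take raw `docker events` output and return the formatted matching lines.
function filterEvents(output) {
  return output
    .split('\n')
    .filter((line) => line.trim() !== '' && isStartOrStop(line))
    .map((line) => ` Event: ${line}`);
}
```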

PedroFonsecaDEV · 2020-12-03T04:36:24Z

@humphd @manekenpix
I like both ideas. I found this PM2 tool:
https://pm2.keymetrics.io/docs/usage/docker-pm2-nodejs/

PedroFonsecaDEV · 2020-12-03T10:20:07Z

I followed all the advice, and the partial result is good.
I will probably land a PR later today (Thursday).

Thanks.
@humphd @manekenpix

PedroFonsecaDEV · 2020-12-04T08:01:31Z

@humphd @manekenpix

I've been talking with Josue for a few days; he explained a lot to me about how Telescope builds on the CDOT machines and about Telescope's architecture, and he also helped me with many things. We discussed many options for this new feature, and I decided to use an event-driven approach to listen to our Docker system.
This PR is totally different from the last one. Now we use the auto-deployment server to listen to docker events using bash scripts. Here you will find a sketch of my proposal. All the error messages are just for demonstration purposes.
Once we agree, I will continue to work on the feature.
Thanks

humphd · 2020-12-07T17:29:11Z

tools/autodeployment/env.example

+
+# Slack "Incoming WebHook" to send messages to #telescope channel.
+# For more information: https://api.slack.com/messaging/webhooks
+SLACK_SEND_MESSAGE=


Let's add an example of what this should look like in addition to the docs.
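
As an illustration of the expected shape, Slack's incoming-webhook documentation shows URLs of the form used in the placeholder below. The sketch also adds a hypothetical `looksLikeSlackWebhook` sanity check that the server could run at startup; it only checks the general shape and is an assumption, not an official validation rule.

```javascript
// Placeholder in the shape shown in Slack's docs; this is not a real hook.
const EXAMPLE_HOOK = 'https://hooks.slack.com/services/T00000000/B00000000/XXXXXXXXXXXXXXXXXXXXXXXX';

// Rough shape check for the SLACK_SEND_MESSAGE value.
function looksLikeSlackWebhook(value) {
  try {
    const url = new URL(value);
    return (
      url.protocol === 'https:' &&
      url.hostname === 'hooks.slack.com' &&
      url.pathname.startsWith('/services/')
    );
  } catch (err) {
    // Not a URL at all (including undefined or an empty string).
    return false;
  }
}
```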

humphd · 2020-12-07T17:32:54Z

tools/autodeployment/src/docker_events/index.js

+
+const dockerEvents = new Events.EventEmitter();
+
+const message = {


const message = {
  down: {
    response_type: 'in_channel',
    text: `\n:rotating_light: PRODUCTION IS DOWN. :rotating_light:\n`,
    // ...etc
  },
  up: {
    // ...values for up message
  },
};

Now you can use message.up and message.down below.

humphd · 2020-12-07T17:33:53Z

tools/autodeployment/src/docker_events/index.js

+};
+
+const waitingDocker = () => {
+  shell.chmod('700', './waitDocker.sh');


This seems odd/dangerous to do in a script. Can we not do it as part of installing the autodeployment server once?

humphd · 2020-12-07T17:35:09Z

tools/autodeployment/src/docker_events/failCheck.sh

@@ -0,0 +1,8 @@
+#!/bin/bash
+


Let's add a comment block to the top of this file explaining what you're doing and why.

humphd · 2020-12-07T17:35:59Z

tools/autodeployment/src/docker_events/waitDocker.sh

@@ -0,0 +1,8 @@
+#!/bin/bash
+


Same thing here, let's document what this is.

PedroFonsecaDEV · 2020-12-08T22:09:56Z

@humphd
Okay, working on it.

PedroFonsecaDEV · 2020-12-12T02:10:14Z

@manekenpix @humphd

I will break #1472 into different PRs.

Here we have just the docker event listener, and it is not ready yet. It still needs improvement. I'm still working on the following "bugs":

Docker doesn't emit an event when we close it (Docker -> Quit Docker Desktop), and docker events runs only while the Docker app is running. So if Docker crashes or closes for some reason, docker events will quit without any error code and without sending us an event. (Maybe have PM2 restart Docker if it has crashed or is not responding?)

About PM2, I believe that we need to configure it to keep the auto-deployment server running (restarting the server if needed).
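
One way to notice the silent exit described above is to treat any exit of the `docker events` child process as a signal in itself, whatever the exit code. A minimal sketch, assuming the listener is a child process that emits `close` (as a `ChildProcess` from `child_process.spawn` does); `watchListener` and the `onExit` callback are hypothetical.

```javascript
// `child` is anything that emits 'close' like a ChildProcess.
// `onExit` could, for example, post to Slack and respawn the listener.
function watchListener(child, onExit) {
  child.once('close', (code, signal) => {
    // `docker events` normally streams forever, so any exit means we stopped
    // listening, even when the exit code is 0.
    onExit({ code, signal, at: new Date().toISOString() });
  });
  return child;
}
```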

There will be a second PR landing ~~in the next hours~~ tomorrow (Saturday) covering the "siblings check", as David suggested:

humphd commented 10 days ago
Another thought I had this morning: dev could monitor prod, and vice versa, like siblings checking in on each other periodically. This would be in addition to the internal checks, basically seeing if the backend is still online.

Thanks.

PedroFonsecaDEV · 2020-12-12T02:15:24Z

tools/autodeployment/src/docker_events/index.js

+  });
+  event.stdout.on('data', (data) => {
+    const containerEvent = JSON.parse(data);
+    if (containerEvent.Actor.Attributes.name.includes('telescope')) {


@manekenpix, this line filters containers. Because we could have more than one project running Docker on the machine, let's think about how it would work on the CDOT machines.
This line works on my local machine because all of the Docker container names start with the word "telescope", but I think this is different on the CDOT machines.
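
Two points in the snippet above can be sketched under stated assumptions. First, a `data` chunk from stdout may contain several JSON lines or end in the middle of one, so calling `JSON.parse` on the raw chunk can throw; buffering until a newline avoids that. Second, instead of matching on the container name, the filter could use the `com.docker.compose.project` label that docker-compose sets on its containers. Whether that label shows up in `Actor.Attributes` on the CDOT machines, and what the project is called there, would need to be verified; `'telescope'` is a hypothetical value.

```javascript
// Assumes one JSON object per line, e.g. from: docker events --format '{{json .}}'
// Buffers partial lines so JSON.parse only ever sees complete ones.
function createLineParser(onEvent) {
  let buffer = '';
  return function handleChunk(chunk) {
    buffer += chunk.toString();
    const lines = buffer.split('\n');
    buffer = lines.pop(); // keep the trailing partial line for the next chunk
    lines
      .filter((line) => line.trim() !== '')
      .forEach((line) => onEvent(JSON.parse(line)));
  };
}

// Assumption: containers are started by docker-compose, which sets this label.
function belongsToProject(event, project = 'telescope') {
  const attrs = (event.Actor && event.Actor.Attributes) || {};
  return attrs['com.docker.compose.project'] === project;
}
```

The parser would be attached with something like `event.stdout.on('data', createLineParser(handler))`, where `handler` receives one parsed event at a time.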

PedroFonsecaDEV · 2020-12-12T02:19:13Z

tools/autodeployment/src/docker_events/index.js

+    return;
+
+  if (eventsDown.includes(Action)) {
+    containersUp.has(name) ? containersUp.delete(name) : null;


@humphd
How can I improve my coding style here?
I'm trying to avoid callback hell, and I had to disable the linter (/* eslint-disable no-unused-expressions */) to write these lines, where I use a ternary operator with null.
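
One possible answer, assuming `containersUp` is a `Set` (as the `.has`/`.delete` calls suggest): `Set.prototype.delete` is already safe to call for a value that is not present, so the ternary can be replaced by a plain call and the eslint-disable comment is no longer needed. The container name below is hypothetical.

```javascript
// Set.prototype.delete returns false and changes nothing when the value is missing,
// so no `has` check or ternary is required.
const containersUp = new Set(['telescope_backend']); // hypothetical container name

function markDown(set, name) {
  return set.delete(name); // true if the name was present, false otherwise
}
```

If the variable were a `Map` instead, the same holds for `Map.prototype.delete`.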

PedroFonsecaDEV · 2020-12-12T02:52:18Z

It currently sends a notification for every single container in the docker-compose setup.
The message is only there to demonstrate how it works. We need to find a better message.

humphd · 2021-01-26T15:51:17Z

@PedroFonsecaDEV from triage today, we are thinking that this is something to return to after the next.js port. Is that your feeling? Or should we work on this in the 1.6 time frame? Happy to go either way.

PedroFonsecaDEV · 2021-01-26T19:10:47Z

@humphd I agree with you. We should focus on porting to Next.

izhuravlev · 2021-02-23T15:15:18Z

Thank you very much to everyone contributing to this PR; your work is much appreciated.

We will be coming back to this PR later, so we are closing it for now.

PedroFonsecaDEV requested a review from manekenpix December 1, 2020 08:23

vercel bot deployed to Preview December 1, 2020 08:23 View deployment

PedroFonsecaDEV requested a review from humphd December 1, 2020 08:23

PedroFonsecaDEV self-assigned this Dec 1, 2020

PedroFonsecaDEV added the labels "type: enhancement" (New feature or request) and "type: nice to have" (Feature that'd be nice to have, but not a priority) Dec 1, 2020

humphd requested changes Dec 1, 2020


manekenpix requested changes Dec 1, 2020


PedroFonsecaDEV force-pushed the issue-1366 branch from 1a4696d to efc01d9 Compare December 4, 2020 07:34

vercel bot deployed to Preview December 4, 2020 07:35 View deployment

PedroFonsecaDEV requested review from humphd and manekenpix December 4, 2020 08:01

humphd requested changes Dec 7, 2020


PedroFonsecaDEV added 7 commits December 11, 2020 19:17

Updated: env.example

dd95140

Added: failCheck.sh and Updated: package.json

5456f52

Added: waitDocker.sh

bc6a67a

New dockerEvent module

25167fa

message object

bee603f

chmod removed

6a7fffe

Comments on .sh

ba275eb

PedroFonsecaDEV added 4 commits December 11, 2020 19:17

env.example doc upgraded

8cf88b2

checking server (prod or dev)

591d4de

updated async functions

505e065

docker-listener

39d5652

PedroFonsecaDEV force-pushed the issue-1366 branch from efc01d9 to 39d5652 Compare December 12, 2020 01:43

vercel bot deployed to Preview December 12, 2020 01:43 View deployment

PedroFonsecaDEV requested a review from humphd December 12, 2020 02:10

PedroFonsecaDEV commented Dec 12, 2020


PedroFonsecaDEV changed the title ~~Issue 1366 - Investigate Slack integrations for failed or successful builds~~ Issue 1366 - Slack integrations [ Docker Events ] Dec 13, 2020

PedroFonsecaDEV mentioned this pull request Dec 13, 2020

Issue 1366 - Slack integrations [ Siblings Check ] #1530

Closed

8 tasks

PedroFonsecaDEV added the labels "area: tools" and "area: deployment" (Production or Staging deployment) Dec 13, 2020

PedroFonsecaDEV linked an issue Dec 13, 2020 that may be closed by this pull request

Investigate Slack integrations for failed or successful builds #1366

Open

PedroFonsecaDEV removed a link to an issue Dec 13, 2020

Investigate Slack integrations for failed or successful builds #1366

Open

PedroFonsecaDEV linked an issue Dec 13, 2020 that may be closed by this pull request

Investigate Slack integrations for failed or successful builds #1366

Open

humphd requested a review from Metropass January 19, 2021 15:50

PedroFonsecaDEV added the "Blocked" (Can't do this, until something else is done) label Jan 26, 2021

izhuravlev closed this Feb 23, 2021
