Skip to content

Conversation

@cameel
Copy link
Collaborator

@cameel cameel commented Oct 21, 2021

Depends on #12173 (draft until it's merged). Merged.

In the past we had some situations where the nightlies were failing and no one noticed because only two jobs post notifications to our Gitter channel. This PR fixes the problem by adding making all jobs used by the nightly workflow post notifications on failure. It also disables success notifications for t_ems_test_ext_colony (I think a failure notification is enough here not to spam the channel) but leaves the success notification for that one fuzzer job on so that we can see that notifications still work.

Unfortunately CircleCI does not have any feature that would let us enable/disable the notification job based on workflow. We have to either duplicate jobs or add parameters to every job. The only reasonable solution I found is to always include the notification but then in the notification step check if we're running a scheduled run or a PR check.

@cameel cameel requested a review from bshastry October 21, 2021 17:41
@cameel cameel self-assigned this Oct 21, 2021
@cameel cameel force-pushed the circleci-failure-notifications-for-all-nightly-jobs branch 3 times, most recently from c854887 to fd1f5ef Compare October 22, 2021 18:51
@cameel cameel force-pushed the circleci-base-dicts branch from 22b897c to e3d79ea Compare October 22, 2021 19:28
@cameel cameel force-pushed the circleci-failure-notifications-for-all-nightly-jobs branch from fd1f5ef to 842362a Compare October 22, 2021 19:28
@cameel cameel force-pushed the circleci-base-dicts branch from e3d79ea to ab3b17f Compare October 25, 2021 11:55
@cameel cameel force-pushed the circleci-failure-notifications-for-all-nightly-jobs branch 2 times, most recently from 7d67d21 to e4d304c Compare October 26, 2021 09:06
@cameel cameel force-pushed the circleci-base-dicts branch from 9dbc57f to f8853c9 Compare October 28, 2021 09:32
@cameel cameel force-pushed the circleci-failure-notifications-for-all-nightly-jobs branch from e4d304c to 90ac3d2 Compare October 28, 2021 09:32
@cameel cameel force-pushed the circleci-base-dicts branch 3 times, most recently from 246f161 to c76a8a7 Compare November 3, 2021 11:55
Base automatically changed from circleci-base-dicts to develop November 3, 2021 18:00
@cameel cameel force-pushed the circleci-failure-notifications-for-all-nightly-jobs branch from 90ac3d2 to e8885fa Compare November 4, 2021 17:29
@cameel cameel changed the title [CI] Gitter notifications on failure for all nightly jobs [CI] Gitter notifications on failure for all nightly and non-PR jobs Nov 4, 2021
@cameel
Copy link
Collaborator Author

cameel commented Nov 4, 2021

#12173 has been merged so this is now reviewable.

Also, I realized that the notifications will go not only for nightlies but also for jobs that run directly on develop and breaking. I think it's actually a good thing so I added the notify task to all jobs. I hope it won't be too spammy.

@cameel cameel marked this pull request as ready for review November 4, 2021 17:32
@cameel cameel force-pushed the circleci-failure-notifications-for-all-nightly-jobs branch from e8885fa to 205e05b Compare November 8, 2021 16:59
@bshastry
Copy link
Contributor

bshastry commented Nov 8, 2021

Also, I realized that the notifications will go not only for nightlies but also for jobs that run directly on develop and breaking. I think it's actually a good thing so I added the notify task to all jobs. I hope it won't be too spammy.

The newly added notifications for jobs on develop and breaking are triggered on failure only right? So, this means that apart from getting notified about nightly success/failure on gitter, we would also be notified on jobs that fail on non PR CIs. What would these be? External tests?

@bshastry
Copy link
Contributor

bshastry commented Nov 8, 2021

The changes look technically good to me, I am just not sure if the notification load would be too much.

@cameel
Copy link
Collaborator Author

cameel commented Nov 8, 2021

Yes only on failure.

Non-PRs would for example be release runs (when a new tag is pushed) or merging a PR into develop and breaking. The latter might be a bit spammy if we get breakage on develop and that's not uncommon. On the other hand it would be a pretty good motivator to fix it and it's a bit wasteful to have CI run on develop but then ignore these runs. Because we ignore them, we often notice breakage on develop only indirectly, from side-effects on PRs so this would help us spot it.

Recently I discovered that you can get e-mail notifications for the whole project on CircleCI but you have to explicitly "follow" it. It would be an alternative way to get notified about these runs but I think not many people on the team do it so gitter notifications give breakage much more visibility.

Finally, there's a technical difficulty in discerning failures in scheduled runs from ones on non-PRs. I would probably have to hard-code workflow names in the command, which seems very easy to break. Doing a notification in both cases was just easier to implement.

@cameel
Copy link
Collaborator Author

cameel commented Nov 8, 2021

I am just not sure if the notification load would be too much.

Well, I'd argue that breakage outside of PRs is meant to be an exception rather than the rule so it shouldn't be a big problem. The biggest downside I see is that if it fails, we're likely to get notifications from multiple jobs. It would be better to get it just once per workflow but there does not seem to be any easy way to achieve this.

Copy link
Collaborator

@ekpyron ekpyron left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can try it - if it gets annoying we can still revert the behavior again!

@cameel
Copy link
Collaborator Author

cameel commented Nov 9, 2021

Let's try then :)

@cameel cameel merged commit cde5533 into develop Nov 9, 2021
@cameel cameel deleted the circleci-failure-notifications-for-all-nightly-jobs branch November 9, 2021 15:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants