How to communicate CI failures? #2

Open
benjamingr opened this issue Jun 6, 2018 · 9 comments

@benjamingr
Member

How should collaborators report unrelated CI failures?

  • In this repo? (with this?) Do the instructions in the README replace the "older process"?
  • In a nodejs/node issue? (I've seen this before)
  • In a ping to nodejs/build in the issue (I've also seen this before)?

In the instructions, does "notifying the team" mean an @ ping?

@joyeecheung
Member

joyeecheung commented Jun 6, 2018

I prefer to migrate the existing flaky test issues here, just so we can reduce the noise in the core issue tracker because the biggest problem with flakes is noise anyway. Also the automation stuff should probably reply to issues and send PRs here and it just makes more sense if those are all done in one repo. It also makes it easier to search for stuff.

I think we should create subteams for nodejs/build. nodejs/build is not for fixing flaky tests, also we sometimes ping them for fixing build files but I don't think (everyone in) that team is for that either. Hence I suggested nodejs/build-infra and nodejs/build-files. The difference is: if someone from the build team would need SSH access, account access, or to send PRs to the build repo to fix the problem, ping nodejs/build-infra; if someone would need to open a PR against the build files in core to fix it, ping nodejs/build-files.

And yes, I think notification should be a ping. If it's urgent, like for infra failures, we can hop on the #nodejs-build IRC as well.
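
The proposed split between the two subteams can be sketched as a small routing function. This is only an illustration of the rule described in this comment: the `Fix` categories and the `team_to_ping` name are hypothetical, and the two team handles are the ones proposed here, not teams known to exist.

```python
from enum import Enum
from typing import Optional


class Fix(Enum):
    """What someone would need to do to resolve the failure (hypothetical categories)."""
    SSH_OR_ACCOUNT_ACCESS = "SSH or account access to CI infrastructure"
    PR_TO_BUILD_REPO = "pull request against the build repo"
    PR_TO_CORE_BUILD_FILES = "pull request against build files in core"
    FLAKY_TEST = "fix for a flaky test"


def team_to_ping(fix: Fix) -> Optional[str]:
    """Return the proposed subteam handle, or None if neither subteam applies."""
    if fix in (Fix.SSH_OR_ACCOUNT_ACCESS, Fix.PR_TO_BUILD_REPO):
        return "@nodejs/build-infra"
    if fix is Fix.PR_TO_CORE_BUILD_FILES:
        return "@nodejs/build-files"
    # Flaky tests are explicitly not the build team's responsibility.
    return None
```

Returning `None` for flaky tests reflects the point that neither subteam is meant to fix them; where those reports should go is the open question of this thread.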

@richardlau
Member

Maybe we can restart https://github.com/nodejs/testing / nodejs/testing? nodejs/build should only be for infrastructure (at the moment I don't see nodejs/build-infra being anything other than an alias for nodejs/build).

I don't think nodejs/build-files belongs under the Build WG.

@benjamingr
Member Author

Ping @nodejs/testing

@maclover7
Contributor

> How should collaborators report unrelated CI failures?

I had been working on a separate UI for reporting Jenkins CI statuses at http://node-builder.herokuapp.com/, with the idea that at some point we could point out what failures were flaky tests/infrastructure issues/actual failing tests. IMHO, the current Jenkins UI makes it tough to easily and quickly identify sources of failures (right now you have to cross check nodejs/node, nodejs/build, and possibly other repos too). This might be something to look into with the commit queue work.

> I think we should create subteams for nodejs/build. nodejs/build is not for fixing flaky tests, also we sometimes ping them for fixing build files but I don't think (everyone in) that team is for that either.

This would be great, to help reduce pings of nodejs/build. It should be straightforward to set up with GitHub's subteams feature.

@mhdawson
Member

mhdawson commented Jun 6, 2018

Just my 2 cents: I'd be a bit worried about removing the "noise" from the main repo. I think we want more focus/help on getting these kinds of issues resolved. Making them less visible/troublesome may result in less focus/outside help other than those already interested in the problem.

@mmarchini

> Making them less visible/troublesome may result in less focus/outside help other than those already interested in the problem.

I don't think there will be a difference because those who are not interested in the problem are already ignoring the noise generated by flaky test issues. But IMO having a single place to easily view the state of flakiness + infra issues is more important than reducing noise.

@joyeecheung
Member

> I don't think nodejs/build-files belongs under the Build WG.

@richardlau I agree, I've always felt weird when we ping the build WG for changes to the Makefile, etc. If people are OK with it, I can create build-infra under build and put the infra admins there, and create a separate build-files team under core, just like the other subsystem teams.

@joyeecheung
Member

joyeecheung commented Jun 7, 2018

By the way, there is a new automation tool that walks through the CI results, pattern-matches all the failure reasons (flakes, build file failures, infra failures), and prints them to the console. I think it's similar to http://node-builder.herokuapp.com/ , although it's a CLI.

The demo is here: https://asciinema.org/a/184727
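
As a rough illustration of what pattern-matching failure reasons could look like, here is a minimal sketch. It is not the tool from the demo: the `classify` function, the category names, and every regular expression below are assumptions made up for this example, and real patterns would have to be derived from actual CI logs.

```python
import re

# Hypothetical patterns, checked in order; the first match wins.
PATTERNS = [
    ("infra", re.compile(r"No space left on device|Connection reset")),
    ("build-file", re.compile(r"make(\[\d+\])?: \*\*\*|gyp ERR!")),
    ("test", re.compile(r"^not ok \d+", re.MULTILINE)),
]


def classify(log: str) -> str:
    """Return the first failure category whose pattern appears in the log."""
    for category, pattern in PATTERNS:
        if pattern.search(log):
            return category
    return "unknown"


if __name__ == "__main__":
    print(classify("make[1]: *** [node] Error 1"))
```

Infra patterns are checked first on the assumption that an infrastructure problem can cause test failures as a side effect, so it is the more useful label when both appear in one log.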

To make it automatically report flakes, or flag a failure as a potential flake for users, the easiest solution is to have a flaky test database in place. The tool could be smarter and compare file changes against failures, or look into previous failures to identify flakes itself, but in the end we still need a human to triage and confirm.
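
The two ideas in this paragraph, a lookup in a flaky test database and a comparison of changed files against failures, can be sketched as follows. Everything here is hypothetical: the `Failure` type, the in-memory `KNOWN_FLAKES` set standing in for a database, the test names, and the assumption that a test named `parallel/test-foo` lives at `test/parallel/test-foo.js`.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Failure:
    test: str      # hypothetical test name, e.g. "parallel/test-foo"
    platform: str  # hypothetical platform label, e.g. "linux-x64"


# Stand-in for a flaky test database: known flaky (test, platform) pairs.
KNOWN_FLAKES = {
    ("parallel/test-foo", "linux-x64"),
}


def triage(failure: Failure, changed_files: Iterable[str]) -> str:
    """Suggest a label for a failure; a human still confirms the result."""
    if (failure.test, failure.platform) in KNOWN_FLAKES:
        return "known flake"
    # Assumed layout: the test's source file sits under test/ with a .js suffix.
    if f"test/{failure.test}.js" in set(changed_files):
        return "likely related to change"
    return "needs human triage"
```

The fallback label is deliberately conservative: anything not already recorded as flaky and not obviously touched by the change is left for a person to look at.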

@refack

refack commented Jun 10, 2018

IMO the best way to track infra failures and incidents is in the build repo. It's easy to check "the action items board" and open a new issue if needed.
As for flaky tests, in the past we were productive with a similar project board in the main repo.


7 participants