Posting build progress to github #236

Closed
jbergstroem opened this issue Nov 2, 2015 · 33 comments

Comments

@jbergstroem
Member

I've been tinkering with a set of scripts that post feedback to GitHub. Here's a quick outline of how it works:

  • There's a boolean called POST_STATUS_TO_PR in the node-test-pull-request job. Make sure that's checked and enter your PR.
  • Each worker in node-test-commit will ping GitHub and say it has started building/testing (as of now; see Unfold make-ci in jenkins #229 for improvements in this regard). These workers are currently: ci/linter, ci/freebsd, ci/smartos, ci/linux, ci/plinux, ci/windows-fanned, ci/osx, ci/arm and ci/arm-fanned.
  • Once the test suite completes we check whether it returned either an unstable or a success status. If that's the case we post a success and a text mentioning the total tests run plus any skips (the linter acts differently and needs more work since we need TAP output; see this example PR for work in progress).
  • If the suite has any other status (currently only failed tests are supported) we post the number of failed, skipped and total tests.

In order to achieve the above I had to install a new Jenkins plugin that lets us execute logic as part of the post-build phase. I'm using the XML API and parsing the output in Python (for portability) when needed.
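
For reference, the posting part boils down to something like this (simplified from the actual job scripts; GITHUB_TOKEN is assumed to be the repo:status-scoped token, and all the values below are placeholders):

# simplified sketch -- GITHUB_TOKEN is assumed to hold a token with only the
# repo:status scope; org/repo/sha/context/url would come from the job
import json
import os
import urllib.request

def post_status(org, repo, sha, state, description, context, target_url):
    # POST /repos/:owner/:repo/statuses/:sha (the GitHub Statuses API)
    url = "https://api.github.com/repos/%s/%s/statuses/%s" % (org, repo, sha)
    body = json.dumps({
        "state": state,              # pending, success, failure or error
        "description": description,  # e.g. "1234 tests run, 12 skipped"
        "context": context,          # e.g. "ci/linux"
        "target_url": target_url,    # link back to the Jenkins build
    }).encode()
    req = urllib.request.Request(url, data=body, method="POST")
    req.add_header("Authorization", "token " + os.environ["GITHUB_TOKEN"])
    req.add_header("Content-Type", "application/json")
    urllib.request.urlopen(req)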

Have a look at a few of the above jobs (edit in Jenkins) to see how it currently works. There are a few things that still need to be done, at least one of which is up for discussion:

So, the last one is a bit of a tricky problem, since we can't really trust input from a PR. Jenkins has a few ways of storing passwords (credential store and global passwords) and tries to mask them in some cases, but that sandbox was pretty easy to escape from.

@jbergstroem
Member Author

One option to solve the credentials issue would be to create a small proxy script that adds this credential. This creates another trust issue since we then lack logic to control when it should fire, but that might be the better trade-off.

@Starefossen
Member

Just making sure, though you are probably aware: there is a GitHub OAuth scope which only grants access to the Status API, namely the repo:status scope.

@jbergstroem
Member Author

@Starefossen yep, that's what we're using, but it's still leakable in its current form.

@DavidTPate

Credentials are where it really starts to get difficult; I haven't seen a good way yet. Look at Travis CI, for example: when dealing with encrypted things (such as credentials or keys), it just doesn't provide them to PRs that aren't from the same repository.

You could totally limit the impact of the credentials being discovered by limiting the scope of the keys (and you want to do this regardless). But the ideal case would be to not have the keys leaked in the first place. What typically happens with most SaaS products that do this right now is that their status is reported by a service that they manage which has open access. It's not exactly ideal since anyone could update the status, but it gets us to the point where we have kept our credentials safe.

The last part would be limiting access to the service for updating statuses. I'm not familiar with the infrastructure, but if a web service can be created which simply updates the status for PRs and has its access limited by CIDRs, routing, or some other method, that would get us pretty much there.
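
Just to illustrate what I mean, something along these lines (everything here -- the allowed networks, the port, the payload shape -- is made up; the point is that the GitHub token never leaves this host):

# hypothetical status-proxy sketch; ALLOWED_NETS and the payload are made up
import ipaddress
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

ALLOWED_NETS = [ipaddress.ip_network("10.0.0.0/8")]  # e.g. the CI subnet

class StatusProxy(BaseHTTPRequestHandler):
    def do_POST(self):
        client = ipaddress.ip_address(self.client_address[0])
        if not any(client in net for net in ALLOWED_NETS):
            self.send_error(403)
            return
        length = int(self.headers.get("Content-Length", 0))
        status = json.loads(self.rfile.read(length))
        # here we'd forward `status` to the GitHub Statuses API using a
        # token stored only on this machine
        self.send_response(202)
        self.end_headers()

HTTPServer(("", 8080), StatusProxy).serve_forever()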

@jbergstroem
Member Author

@DavidTPate that's roughly what I suggested with my 'proxy' script. The problem is still that anyone can call it if they know what they're doing. The layer of security by obscurity is tricky to get rid of when you allow people to modify source code.

@thefourtheye

Given that our CI is mostly red these days, if we could somehow show the list of failing tests and their corresponding environments, it would be awesome.

@DavidTPate

Yeah, someone who knows what they are doing would still be able to manipulate it; it would just have a tougher barrier to get to that point. It's definitely a tough problem.

The only way that I can think to really do this and limit exposure would be to have some credential generated for each build that allows exactly one call to update the build status for each job.

@jbergstroem
Member Author

Thing is, if you know what you're doing you can escalate privileges (or rm -rf) as well. I think this is more about finding "good enough" security, then trusting that people who start jobs actually glance over a PR before submitting it for execution.

@DavidTPate

@jbergstroem Yeah, that seems to be the case to me; there just doesn't seem to be a good way to completely secure something like this, and "good enough" is a great start.

@orangemocha

Is there any way that we can distinguish Jenkins' success from unstable (only flaky tests failed)? I am concerned that if people start relying on the status checks to vet their PRs, we'll lose visibility on flaky tests. Reporting the list of failed flaky tests back to GitHub would be ideal.

@jbergstroem
Member Author

@orangemocha at github, not really -- we've got in progress, success or failed. What I've done, though, is add a note in the text mentioning how many flaky tests were run.

@jbergstroem
Member Author

After giving it some thought I'm thinking we should do what @rvagg has been suggesting:

  1. have a hook in node-test-pull-request that pings a server that starts polling
  2. poll node-test-commit for slaves
  3. poll each slave for updates until the parent is closed

Polling sucks, but this completely avoids any security-related issues and makes it more portable, meaning others can benefit from our work.

@Starefossen
Member

Makes sense. Nothing wrong with taking the secure route here.

@DavidTPate

That sounds like a good solution, polling does suck but it seems like a very acceptable tradeoff here.

@Starefossen
Member

@jbergstroem what is the status (pun not intended) here? I will have more time in the next few weeks to help out with this if needed.

@jbergstroem
Member Author

@Starefossen great news! Haven't started with this yet. Let's coordinate something.

@Starefossen
Member

Great! Is the polling service still the plan? I have played around with the node-test-pull-request REST API and it looks like we can get all the status we need from that single endpoint without having to poll the individual node-test-commit jobs.

https://ci.nodejs.org/job/node-test-pull-request/932/api/json

  "subBuilds": [
    {
      "abort": false,
      "build": {
        "subBuilds": [
          {
            "abort": false,
            "build": {

            },
            "buildNumber": 1395,
            "duration": "16 min",
            "icon": "blue.png",
            "jobName": "node-test-commit-arm",
            "parentBuildNumber": 1359,
            "parentJobName": "node-test-commit",
            "phaseName": "Tests",
            "result": "SUCCESS",

Just need to know how this endpoint behaves during a build and we should be good to go; my guess is that it reports "building": true while it is building.
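
Something like this should be enough to verify that guess (the "building" and "result" fields are standard in the Jenkins JSON API; "result" should stay null until a run finishes):

# quick way to inspect that endpoint mid-build
import json
import urllib.request

url = "https://ci.nodejs.org/job/node-test-pull-request/932/api/json"
with urllib.request.urlopen(url) as res:
    job = json.loads(res.read().decode())

print(job["building"])                        # presumably true while running
for sub in job.get("subBuilds", []):
    print(sub["jobName"], sub.get("result"))  # e.g. node-test-commit SUCCESS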

@jbergstroem
Member Author

While we're at it, I think we should make something more generic. My current thoughts are something along these lines:

  • create jobs with endpoints; a job would represent a job at Jenkins. Also understand the notion of sub-jobs.
  • create an API endpoint which receives a POST for job notifications. This could come from GitHub or Jenkins (Jenkins in our case, every time a job is created)
  • poll the specific job for connected slaves
  • store and update states:
    • what's going on right now?
    • is GitHub successfully updated about it?
  • once a slave finishes:
    • report back to GitHub (fail or pass)
    • store all information, since things like flaky tests might be available at GitHub in a later version.

We could also have a generic poller -- wouldn't be my preferred route though.
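
Roughly, the state part could be as simple as this (purely illustrative; the names are made up):

# illustrative data model for the "store and update states" part
from dataclasses import dataclass, field

@dataclass
class SubJob:
    name: str                  # e.g. "node-test-commit-arm"
    result: str = "PENDING"    # what's going on right now?
    gh_updated: bool = False   # is github successfully updated about it?

@dataclass
class Job:
    jenkins_job: str           # e.g. "node-test-pull-request"
    build_number: int
    pr_id: int
    sub_jobs: dict = field(default_factory=dict)  # name -> SubJob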

@Starefossen
Member

Not sure I got the first create jobs part, but otherwise this is my understanding of how this service could be implemented:

# post jenkins build status to github pull request
algorithm jenkins-github-status is
  input: Integer job_id
         Integer pr_id
  output: Void

  # save sub-build result between loop intervals
  cache ← new Map()

  do
    job ← jenkins.getJob(job_id)

    for build in job.subBuilds do
      # only update GitHub if status has changed since last loop
      if cache[build.jobName] ≠ build.result then
        cache[build.jobName] ← build.result
        github.postStatus(pr_id, build.jobName, build.result)
      end if
    end for
  while job.building == true

  return

end algorithm

This is obviously a simplification: you cannot post a status to GitHub without the sha of one of the commits in the pull request, and since our builds take 16+ minutes to complete there should probably be a 60-second delay between loop iterations, etc.
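
In Python it could look something like the following; jenkins_get, github_pr_head_sha and github_post_status are just placeholders for the actual API calls:

# rough Python version of the pseudocode above, with placeholder helpers
import time

def jenkins_github_status(job_id, pr_id):
    sha = github_pr_head_sha("nodejs", "node", pr_id)  # statuses need a sha
    cache = {}                                         # jobName -> last result

    while True:
        job = jenkins_get("/job/node-test-pull-request/%d/api/json" % job_id)
        for build in job.get("subBuilds", []):
            # only update GitHub if the status changed since the last pass
            if cache.get(build["jobName"]) != build["result"]:
                cache[build["jobName"]] = build["result"]
                github_post_status(sha, build["jobName"], build["result"])
        if not job["building"]:
            break
        time.sleep(60)  # builds take 16+ minutes, no need to hammer the API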

@jbergstroem
Member Author

Generally looks good. A few comments:

  • Jenkins has a guesstimate for job length; we can use that as part of our "polling interval frequency" algorithm (sketched below)
  • the job definition probably needs to be thought through; it can contain things like what input is expected to launch a poller against a job (looking it up via the sha might not be impossible since that information is available in the node-test-commit sub-task)
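
Something like this, using the estimatedDuration field (milliseconds) that Jenkins exposes in the build JSON -- the bounds and slice count here are just guesses:

# possible polling-interval calculation based on Jenkins' own estimate
def poll_interval(build_json, min_s=30, max_s=300, slices=20):
    estimate_s = build_json.get("estimatedDuration", 0) / 1000.0
    # aim for roughly `slices` polls over the expected build time
    return max(min_s, min(max_s, estimate_s / slices))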

@Starefossen
Member

  • Jenkins has a guesstimate for job length; we can use that as part of our "polling interval frequency" algorithm

Good suggestion!

  • the job definition probably needs to be thought through; it can contain things like what input is expected to launch a poller against a job (looking it up via the sha might not be impossible since that information is available in the node-test-commit sub-task)

From the (current) node-test-pull-request action parameters we can query the GitHub API to get the commits for the pull request under test (TARGET_GITHUB_ORG + TARGET_REPO_NAME + PR_ID), assuming the service has access to that repo of course.

We can also use the POST_STATUS_TO_PR parameter to control whether the service should post statuses to GitHub or not.

  "actions": [
    {
      "parameters": [
        {
          "name": "TARGET_GITHUB_ORG",
          "value": "nodejs"
        },
        {
          "name": "TARGET_REPO_NAME",
          "value": "node"
        },
        {
          "name": "PR_ID",
          "value": "4116"
        },
        {
          "name": "POST_STATUS_TO_PR",
          "value": true

@jbergstroem
Member Author

Yes, but we need to unfold this into the workers at node-test-commit. At that level we'll have sha1 as well. Just saying that it'd be pretty easy to find a job based on sha1 (what we would get from github if we chose that route) since in most cases there'll only be one test-commit running the same sha.

Not saying using the sha is the way to go here; I just see this utility being useful for more people than us.

@jbergstroem
Member Author

The main problem with a hook from Jenkins is that we'd have to share a secret, similar to the constraints of the current solution. Polling would remedy that, but so would a sha1 from GitHub; for instance, having a hook receive input from GitHub on new PRs or comments on a PR, checking the PR id and/or matching it with the link in the comment.

@Fishrock123
Contributor

As a note, nodejs-github-bot now posts GitHub statuses. The live bot hasn't been updated yet, but it should first roll out for readable-stream, with nodejs.org either at the same time or later (pr#15).

I'll try to look into how we might do this, but any help on the build end would be great.

Some important notes:

Possible GitHub statuses: success, pending, failure, error. Additional "description" info can also be provided.
Statuses also have a url parameter to link directly to the build.

We currently do it by PR for Travis, but doing it fully by commit is totally possible. So we could either do the linking via PR id or by sha.

@jbergstroem
Member Author

@Fishrock123 I have a few suggestions/ideas; will post them shortly.

@Fishrock123
Contributor

Is there any way that we can distinguish Jenkins' success from unstable (only flaky tests failed)? I am concerned that if people start relying on the status checks to vet their PRs, we'll lose visibility on flaky tests. Reporting the list of failed flaky tests back to GitHub would be ideal.

I contacted GitHub support about this, and their suggestion was to report flaky tests back as a separate status. I'm not really sure it's possible to separate that out of Jenkins easily, though?

(Actually maybe I'm overthinking it and it isn't that hard..)

@Fishrock123
Contributor

If it's green we can just report double green (or just a single green) status. If it is flaky we change/add one that notes that flaky tests failed.
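
Something like this, assuming a post_status(sha, context, state, description) helper wrapping the Statuses API:

# rough idea -- post_status is just a stand-in for the bot's status helper
def report(sha, context, result, flaky_failures):
    # green (or unstable) build -> a plain success status
    state = "success" if result in ("SUCCESS", "UNSTABLE") else "failure"
    post_status(sha, context, state, "build finished")
    # flaky failures get their own context so they stay visible on the PR
    if flaky_failures:
        post_status(sha, context + " (flaky)", "failure",
                    "%d known flaky tests failed" % flaky_failures)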

@jbergstroem
Member Author

My plan of attack is to report each sub-worker of node-test-pr with results (if there are skips or failures, mention fails, skips and total), as well as introducing linter results and writing new code for commit messages.

@Fishrock123
Contributor

@jbergstroem do you plan on doing this directly from the CI, or just providing hooks to the bot? (I sorta prefer the latter because then we can do a lot of it in JS..)

@jbergstroem
Member Author

@Fishrock123 No, we can't do it from CI. We need to have the bot poll both GitHub (PubSubHubbub events) and CI (the API) to match what's being run, then poll each sub-job and post pending/ok/fail. I'm a bit in transit this week but am putting together a document that should outline what I see needs to be done.

@Fishrock123
Contributor

@jbergstroem ok, sounds good; the bot should be able to adapt to that. I'm pretty sure anything we do will be less of a mess than trying to proxy Travis, haha. :)

@jbergstroem
Member Author

@Fishrock123: yeah; really looking forward to seeing this in action.

@maclover7
Contributor

Can this be closed in favor of #790?

@rvagg closed this as completed Nov 4, 2017