Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Separate build ID from ProwJob ID for Jenkins jobs #6853

Merged
merged 2 commits into from
Feb 16, 2018

Conversation

stevekuznetsov
Copy link
Contributor

It is the intent in the initupload and sidecar container utilities
to use the full information provided by Prow to the job in $JOB_SPEC
to determine where in GCS artifacts from the job should be pushed. Right
now, this is possible and useful for jobs triggered using plank but
not so for those triggered using jenkins-operator. The largest issue
is that for Jenkins builds, the $BUILD_ID parameter was being used
as an identifier to allow Prow to correlate builds with ProwJobs,
whereas in plank pods the parameter was a monotonically increasing
identifier vended by tot used for e.g. GCS. This patch introduces the
$PROW_JOB_ID variable and field to the job spec to separate these two
concerns and allow for them to be fulfilled together. We also thread
through links so that user-facing pages use the $BUILD_ID identifier
and not the $BUILD_NUMBER identifier that Jenkins produces. This also
handily solves issues with loss of state or Jenkins inconsistencies by
allowing the source of truth for job build state to live in tot.

Signed-off-by: Steve Kuznetsov skuznets@redhat.com

/cc @BenTheElder @spxtr
/assign @Kargakis @cjwagner

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. approved Indicates a PR has been approved by an approver from all required OWNERS files. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Feb 15, 2018
@@ -178,9 +178,9 @@ func main() {
MaxConcurrency: 1,
}

if err = jc.BuildFromSpec(&spec, o.jobName); err != nil {
if err = jc.BuildFromSpec(&spec, "0", o.jobName); err != nil {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bad API for the manual trigger but it's not super important to tackle right now.

// to vend build identifier for the job
func GetBuildID(name, totURL string) (string, error) {
var err error
url := totURL + "/vend/" + name
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use path.Join in case totURL is specified with a trailing slash.

if buf, err := ioutil.ReadAll(resp.Body); err == nil {
return string(buf), nil
}
return "", err
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If ReadAll fails we won't return an error here because err is redefined instead of just being assigned in the conditional.

}
defer resp.Body.Close()
if resp.StatusCode != 200 {
continue
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Set err before continuing. Otherwise if we retry 60 times and always get a valid non-200 response we won't return an error.

url := totURL + "/vend/" + name
for retries := 0; retries < 60; retries++ {
if retries > 0 {
time.Sleep(2 * time.Second)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exponential backoff and fewer retries might be better than 60 retries made uniformly over 2 minutes.

@@ -439,7 +436,7 @@ func (c *Controller) syncTriggeredJob(pj kube.ProwJob, pm map[string]kube.Pod, r
}

if pj.Status.State == kube.TriggeredState {
// BuildID needs to be set before we execute the job url template.
// ProwJobID needs to be set before we execute the job url template.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ProwJobID is not set.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whoops, bad refactor rename button in IDEA

@stevekuznetsov stevekuznetsov force-pushed the skuznets/jenkins-pj-id branch 2 times, most recently from 3e64e4f to e738b08 Compare February 15, 2018 22:52
@stevekuznetsov
Copy link
Contributor Author

@cjwagner made the requested changes to the tot connection code

Copy link
Member

@cjwagner cjwagner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cjwagner made the requested changes to the tot connection code

Thanks for cleaning that up!
/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 15, 2018
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cjwagner, stevekuznetsov

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@BenTheElder
Copy link
Member

BenTheElder commented Feb 16, 2018

Test failure looks suspicious, and prow/cmd/jenkins-operator/main.go has a conflict (probably from #6850).

@BenTheElder
Copy link
Member

Seconding the thanks though 😄
/area prow

@k8s-ci-robot k8s-ci-robot added the area/prow Issues or PRs related to prow label Feb 16, 2018
It is the intent in the `initupload` and `sidecar` container utilities
to use the full information provided by Prow to the job in `$JOB_SPEC`
to determine where in GCS artifacts from the job should be pushed. Right
now, this is possible and useful for jobs triggered using `plank` but
not so for those triggered using `jenkins-operator`. The largest issue
is that for Jenkins builds, the `$BUILD_ID` parameter was being used
as an identifier to allow Prow to correlate builds with ProwJobs,
whereas in `plank` pods the parameter was a monotonically increasing
identifier vended by `tot` used for e.g. GCS. This patch introduces the
`$PROW_JOB_ID` variable and field to the job spec to separate these two
concerns and allow for them to be fulfilled together. We also thread
through links so that user-facing pages use the `$BUILD_ID` identifier
and not the `$BUILD_NUMBER` identifier that Jenkins produces. This also
handily solves issues with loss of state or Jenkins inconsistencies by
allowing the source of truth for job build state to live in `tot`.

Signed-off-by: Steve Kuznetsov <skuznets@redhat.com>
@stevekuznetsov
Copy link
Contributor Author

@cjwagner fun:

url := path.Join([]string{"http://yuck:1234", "vend", "pls"})
// url is: 'http:/yuck:1234/vend/pls'
_, err := http.Get(url)
// err is 'Get http:///yuck:1234/vend/pls: http: no Host in request URL'
// see this         ^^^

Handling some cases where `err` asignment would not have happened
correctly and the error would have been dropped, as well as using an
exponential backoff and reducing the total number of retries done.

Signed-off-by: Steve Kuznetsov <skuznets@redhat.com>
@k8s-ci-robot k8s-ci-robot merged commit 351c416 into kubernetes:master Feb 16, 2018
// the ProwJob.ObjectMeta.Name field.
PodName string `json:"pod_name,omitempty"`

// ProwJobID is the build identifier vended by `tot`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not only vended by tot; there is also the option to use the snowflake library.

}
url.Path = path.Join(url.Path, "vend", name)
sleep := 100 * time.Millisecond
for retries := 0; retries < 10; retries++ {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should start using apimachinery's ExponentialBackoff.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/prow Issues or PRs related to prow cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants