Add cherrypicker to prow deployment bundle #7393

jennybuckley · 2018-03-23T20:27:36Z

Follow up to #7345

Fixes up cherrypicker's build to look more like needs-rebase and adds it to the prow deployment bundle.

Fixes #7332
/area prow

jennybuckley · 2018-03-23T20:59:11Z

Having issues with prow/bump.sh...

INFO: Running command line: bazel-bin/prow/release-push
E0323 13:56:20.441786  216948 docker_session_.py:335] Error during upload of: gcr.io/k8s-prow/cherrypicker:v20180323-7a14d0f59
F0323 13:56:20.442425  216948 fast_pusher_.py:162] Error publishing gcr.io/k8s-prow/cherrypicker:v20180323-7a14d0f59: response: {'content-length': '57', 'x-google-shellfish-status': 'CA0gBEBG', 'x-google-gfe-response-code-details-trace': 'response_code_set_by_backend', 'x-google-netmon-label': '/bns/ph/borg/ph/bns/cloud-containers-registry/registry.server/5', 'x-xss-protection': '1; mode=block', 'cache-control': 'private', 'alt-svc': 'hq=":443"; ma=2592000; quic=51303432; quic=51303431; quic=51303339; quic=51303335,quic=":443"; ma=2592000; v="42,41,39,35"', 'x-google-gfe-service-trace': 'cloud-containers-registry', 'status': '403', 'transfer-encoding': 'chunked', 'x-google-backends': '/bns/ph/borg/ph/bns/cloud-containers-registry/registry.server/5,pcaj77-v6:443', '-content-encoding': 'gzip', 'date': 'Fri, 23 Mar 2018 20:56:17 GMT', 'x-google-gfe-request-trace': 'pcaj77-v6:443,/bns/ph/borg/ph/bns/cloud-containers-registry/registry.server/5,pcaj77-v6:443', 'x-google-dos-service-trace': 'main:cloud-containers-registry', 'x-google-service': 'cloud-containers-registry', 'server': 'Docker Registry', 'x-google-gfe-response-body-transformations': 'chunked', 'docker-distribution-api-version': 'registry/2.0', 'x-frame-options': 'SAMEORIGIN', 'content-type': 'application/json'}
Access denied.: None

BenTheElder · 2018-03-23T21:00:17Z

prow/bump.sh publishes images etc., please don't do that in this PR. we'll bump prow in a different PR.

jennybuckley · 2018-03-23T21:06:06Z

@BenTheElder
Okay, but how can I write the deployment.yaml without knowing what the image will be tagged as?

BenTheElder · 2018-03-23T21:07:21Z

@jennybuckley you can put any image tag in the initial deployment.yaml, the script will update the tag.

0xmichalis · 2018-03-28T20:24:25Z

prow/cluster/cherrypicker_deployment.yaml

+          secretName: hmac-token
+      - name: oauth
+        secret:
+          secretName: oauth-token


I would suggest using a bot different from k8s-ci-robot. All it needs is org membership in order for its PRs to get automatically tested but other than that it shouldn't need any write permissions in repos.

fejta-bot perhaps? 🙃
thoughts @fejta?

If we do decide to use that bot do we just replace "oauth-token" with "fejta-bot-token" ?

Or do we need a new bot / token

fejta-bot sgtm assuming it needs no permissions to anything aside from a github account

do we need to set up a new token for it, or is there an existing one?

I can't find any mention of fejta-bot in this repo, aside from a few comments.

Search prow/config.yaml for prior art:

test-infra/prow/config.yaml

Lines 13705 to 13716 in 5b2902d

- --token-file=/etc/token/bot-github-token

- --test-owners-csv=/test_owners.csv

- --triage-window=1

- --triage-count=10

- --flakyjob-count=3

volumeMounts:

- name: token

mountPath: /etc/token

volumes:

- name: token

secret:

secretName: fejta-bot-token

spiffxp · 2018-04-02T21:59:24Z

If we're enabling this for k/k, is it going to do battle with the cherrypick mungegithub instance and related mungers?

sig-contribex is attempting to better document how the mungegithub-based cherry picker works, are there docs for how this works? kubernetes/community#1980

cjwagner · 2018-04-02T23:45:34Z

@spiffxp The cherrypicker Prow external plugin doesn't overlap with the cherrypick mungers AFAIK. The plugin just provides a command to automatically create a cherrypick PR on a different branch from the current PR. The mungers don't actually create cherrypick PRs, they just control process after the cherrypick PR exists.

spiffxp · 2018-04-02T23:55:05Z

@cjwagner ok, I hadn't really paged in the full cherrypick process, now that I've reviewed that PR, it looks like this is sort of a replacement for hack/cherry_pick_pull.sh in kubernetes?

cjwagner · 2018-04-03T00:29:38Z

@spiffxp Yeah, that seems accurate. It looks like the hack/cherry_pick_pull.sh script can do some stuff related to generated docs as well, but the primary purpose of both is to create a new cherrypick PR from an existing PR.

spiffxp · 2018-05-09T22:22:54Z

What remains to be done here? It'd be good to have this in and well understood prior to 1.11.x code freeze, so we could at least tell people to use a /cherrypick command instead of an arcane bash script

stevekuznetsov · 2018-05-10T14:51:05Z

We have @openshift-cherrypick-robot for this as the bot needs fewer rights than the others, unclear if you guys want to do similarly.

fejta · 2018-06-08T01:20:03Z

/assign @fejta @BenTheElder

What remains to be done here?

☝️ Very cool, let's get this in (is this still alive?)

fejta-bot · 2018-08-06T19:41:29Z

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

spiffxp · 2018-08-10T16:49:17Z

/retest
/remove-lifecycle rotten

Are there reasons we're holding off on this?

/cc @cblecker
I think you had concerns with this approach

I'm interested in this as a path to reduce the complexity of cherry picking (see issues in #1795)

cblecker · 2018-08-10T18:29:12Z

Yeah, I feel ways about having bots open PRs:

difficult/impossible to rebase with conflicts (unprivileged users wouldn't have access to branches)
creates a situation where a single person can generate a PR, LGTM it, approve it, and have it merge (against our principals of code review)

@Kargakis @stevekuznetsov How do you feel about/handle these in openshift? I'm really not sure this is a workflow we want in Kubernetes.

stevekuznetsov · 2018-08-10T19:39:24Z

difficult/impossible to rebase with conflicts (unprivileged users wouldn't have access to branches)

How long do cherry-picks stay alive? We don't create the PR if it cannot be made cleanly in the first place and have not seen cases where a cherrypick lands first and ruins the possibility of another open cherrypick. In those cases, drop back to the normal workflows we all know and love. In the happy case, just use the bots.

creates a situation where a single person can generate a PR, LGTM it, approve it, and have it merge (against our principals of code review)

If the PR already merged into master clearly the code is good, so I don't really see why a clean cherry-pick shouldn't start with LGTM, even, or why the author should not be allowed to add it. Plus, in k/k won't you need the cherry-pick to be approved before tide merges anything anyway, thereby not allowing one dev to make this series of changes alone?

guineveresaenger · 2018-08-27T20:15:54Z

Please excuse my butting in!
We talked about cherrypicks this morning on the 1.12 release team meeting and we all agree that the process needs to be simplified and policies need to be clarified in some cases.

I have further improvements to the cherrypick docs sitting in kubernetes/community#2408 and have also (somewhat successfully but laboriously) shepherded a coworker's cherrypicks to completion.

With those experiences in mind, I for one would love to see a /cherrypick bot command.

All cherrypick PRs are subject to final approval (the manual adding of cherrypick-approved) by the branch manager, so there is no danger of a single person sneaking in a cherrypick sight unseen. Unless that person is themselves the branch manager? We'd need to only choose branch managers who do not have unrestricted write access to k/k to avoid this...it seems a bit farfetched to worry.

@cblecker I don't entirely understand the concern about rebase conflicts and unprivileged users.

cblecker · 2018-08-28T23:59:34Z

Improving the cherrypick process.. 100% all for it! Still have major, major concerns about the bot opening a PR route.

The code review concern, yes, is mitigated by the requirement of the branch manager's approval. I hadn't thought about that, and that makes me much more comfortable on that point.

Let me try to explain further on the rebase concern..

Release branches are all point in time snapshots of what was on master. As time goes on, with some commits cherry picked back and others not, the code will drift. If I make a PR to master right now, there's a okay chance it will pick back to 1.11 without a merge conflict. The same can't be said for 1.10 and 1.9.

Even if it does merge cleanly, there may be tests or other parts of the code that break because of the cherry picked code.

The cherry picks can also be held for awhile between generation and merge, such as when a branch manager is preparing for a branch cut, or because there is a security fix pending.

If there is any modifications required (merge conflict on PR generation, rebase needed after PR generation, or test failures requiring code changes), then a bot generated PR is basically a waste and needs to be discarded and recreated using the cherry pick pull script. These aren't hypotheticals.. these are actual situations I see all the time in k/k.

To summarize: Having a bot generate a PR may work a small percentage of the time and make the process super quick.. but I think a majority of the time one of the above scenarios will impact the PR, and then they have to fall back to the alternative process anyways. I'd rather improve and work on the alternative process, making it the one true process that is the best we can make it, then have two different paths.. especially when one of those paths may require discarding entire PRs because of circumstances outside of the contributor's control.

0xmichalis · 2018-08-29T08:33:02Z

If there is any modifications required (merge conflict on PR generation, rebase needed after PR generation, or test failures requiring code changes), then a bot generated PR is basically a waste and needs to be discarded and recreated using the cherry pick pull script. These aren't hypotheticals.. these are actual situations I see all the time in k/k.

This is mostly not true with the cherrypick bot, you can push changes to open PRs if you will but that should be very rare. Only if there is a conflict in the first place, the bot will not open the PR and inform with a message about the conflict.

To summarize: Having a bot generate a PR may work a small percentage of the time and make the process super quick.. but I think a majority of the time one of the above scenarios will impact the PR, and then they have to fall back to the alternative process anyways. I'd rather improve and work on the alternative process, making it the one true process that is the best we can make it, then have two different paths.. especially when one of those paths may require discarding entire PRs because of circumstances outside of the contributor's control.

We actually see the opposite in openshift repos: https://github.com/pulls?utf8=%E2%9C%93&q=is%3Apr+author%3Aopenshift-cherrypick-robot+archived%3Afalse+is%3Aclosed+

nikhita · 2018-08-29T08:58:33Z

Some points/questions:

More than half of the cherry-picks I have opened in k/k have had conflicts. If the cherrypick bot fails with a message that it can't open a PR and wants me to use the script, it sgtm to me.

If there is any modifications required (merge conflict on PR generation, rebase needed after PR generation, or test failures requiring code changes), then a bot generated PR is basically a waste and needs to be discarded and recreated using the cherry pick pull script. These aren't hypotheticals.. these are actual situations I see all the time in k/k.

Have seen this too. There are cases where cherry-pick PRs either fail tests due to changed code or a discussion arises on the cherry-pick PR which ends up with removing some part in the cherry-picked PR and selecting only some code out of it.

Basically, with the bot it means that all cherry-picks created without conflicts can only have two future states: they are either ready to be merged or to be closed. This would make "reviews" on a cherry-pick PR redundant. Are we ok about that? 🤔

To be clear, I am all in for a bot but just want to understand the process and semantics better. :)

I'd rather improve and work on the alternative process, making it the one true process that is the best we can make it, then have two different paths.. especially when one of those paths may require discarding entire PRs because of circumstances outside of the contributor's control.

Personally, the script has always been reliable. Though I agree that the cherrypick process should be made easier. Is there an issue with the list of problems or suggestions for UX improvements for the script? Even if we decide to choose a bot-focused way, we should definitely fix those problems.

nikhita · 2018-08-29T08:59:46Z

/sig release

nikhita · 2018-08-29T09:21:55Z

creates a situation where a single person can generate a PR, LGTM it, approve it, and have it merge (against our principals of code review)

Also, I am not sure if this is really a concern but this allows someone listed in OWNERS of the respective change to create a cherry-pick PR and lgtm+approve it.

It's not always necessary that the branch manager will be familiar with the nitty-gritties of the change and may cherrypick-approve the PR if it looks good as a whole and has the lgtm+approved labels. This does not catch scenarios where the change might look good as a whole but have unintended side effects of being merged into a release branch.

If the approver had the power to "approve" the PR it means that they should be familiar with the change and understand the unintended side effects but this also essentially means that an approver listed in the OWNERS file is allowed to self-lgtm their PRs. Again, are we ok with that? 🤔

There are examples of PRs with cherrypick-approved which have been closed later by an OWNER in k/k today.

cblecker · 2018-08-29T14:25:05Z

This is mostly not true with the cherrypick bot, you can push changes to open PRs if you will

@Kargakis How would you do this without write access?

0xmichalis · 2018-08-29T15:27:28Z

Only org members are allowed to /cherrypick. At least in Openshift. The bot is also deployed in a different project with --allow-all=true but I haven't seen anybody needing to edit an open cherrypick so far even if it's supported only for contributors.

…

On Wed, Aug 29, 2018, 16:26 Christoph Blecker ***@***.***> wrote: This is mostly not true with the cherrypick bot, you can push changes to open PRs if you will @Kargakis <https://github.com/kargakis> How would you do this without write access? — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#7393 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/ADuFf0GpEf_LGZyUmjcxsVTbhiu6yNmDks5uVqSFgaJpZM4S5VKK> .

stevekuznetsov · 2018-08-29T15:46:29Z

@cblecker your comments are 100% valid, but I guess if we just look at the query @Kargakis posted we see 2500+ (!!!!!) PRs that cherry-picked and merged cleanly with the bot only. I don't think cherry-picks to OpenShift release branches are necessarily less complicated than those to k8s release branches. I'm not sure that the fact that automatic cherry-picks might fail in some cases should be an argument against letting the flow help where it can, best-effort.

cblecker · 2018-08-29T16:44:41Z

@stevekuznetsov Can you compare the workflow of openshift/openshift-docs vs openshift/origin or k/k? Of the ~2500 PRs there, ~1800 of them are in openshift/openshift-docs. It looks like there is only travis CI there, and the PRs are merged by humans. There are only 81 PRs in that query that are actually for openshift/origin, but again, I know very little about the openshift workflow.

stevekuznetsov · 2018-08-29T17:23:52Z

Even without the docs repo -- which does have a simplistic test workflow -- we still have 700+ PRs made to more complex repositories where code is quickly changing and release branches are maintained long-term. We do more cherry-picks to a different repository for our long-term releases for OpenShift which unfortunately are not public.

Is there some way that you would want to quantify the cost (in a potentially more complex workflow if the automated /cherry-pick doesn't work) versus the benefit (for any best-effort successes)?

cblecker · 2018-08-29T17:31:47Z

That's what I'm kind of thinking about right now. And there's the two different perspectives.. prow as a software product, vs how we use it in kubernetes. I think the ~2500 number absolutely justifies it as a part of prow the software product.

I'm just.. grappling with what I see in k/k (which currently has over 50 open cherry picks dating as far back as April -- https://github.com/kubernetes/kubernetes/pulls?q=is%3Apr+is%3Aopen+-base%3Amaster) and what kind of contributor experience people are having. I just see this potentially causing a whole lot more headache for both contributors and reviewers then the current system we have.

fejta-bot · 2018-11-27T17:35:06Z

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

fejta · 2018-11-28T21:36:43Z

Is this still alive? Please rebase and/or close

fejta-bot · 2018-12-28T21:45:22Z

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

jennybuckley requested review from cjwagner and stevekuznetsov as code owners March 23, 2018 20:27

k8s-ci-robot added area/prow Issues or PRs related to prow size/S Denotes a PR that changes 10-29 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Mar 23, 2018

jennybuckley changed the title ~~Add cherrypicker to deployment bundle~~ [WIP] Add cherrypicker to deployment bundle Mar 23, 2018

k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 23, 2018

Add cherrypicker to deployment bundle

3317d39

jennybuckley force-pushed the prow-cherrypicker-2 branch from 7a14d0f to 3317d39 Compare March 23, 2018 21:25

k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Mar 23, 2018

jennybuckley changed the title ~~[WIP] Add cherrypicker to deployment bundle~~ [WIP] Add cherrypicker to prow deployment bundle Mar 23, 2018

jennybuckley changed the title ~~[WIP] Add cherrypicker to prow deployment bundle~~ Add cherrypicker to prow deployment bundle Mar 23, 2018

k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 23, 2018

Enable cherrypicker in k/k

d7d6017

0xmichalis reviewed Mar 28, 2018

View reviewed changes

spiffxp mentioned this pull request Apr 3, 2018

Improve cherry-pick process to better track them #1795

Closed

k8s-ci-robot assigned BenTheElder and fejta Jun 8, 2018

k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Aug 6, 2018

k8s-ci-robot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Aug 10, 2018

k8s-ci-robot requested a review from cblecker August 10, 2018 16:49

k8s-ci-robot added the sig/release Categorizes an issue or PR as relevant to SIG Release. label Aug 29, 2018

0xmichalis mentioned this pull request Sep 26, 2018

/cherrypick <branch> doesn't work in k/k #9582

Closed

k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 27, 2018

k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Dec 28, 2018

fejta closed this Jan 25, 2019

Bobgy mentioned this pull request Jun 16, 2020

[Release] Branch based release kubeflow/pipelines#3729

Closed

	- --token-file=/etc/token/bot-github-token
	- --test-owners-csv=/test_owners.csv
	- --triage-window=1
	- --triage-count=10
	- --flakyjob-count=3
	volumeMounts:
	- name: token
	mountPath: /etc/token
	volumes:
	- name: token
	secret:
	secretName: fejta-bot-token

Add cherrypicker to prow deployment bundle #7393

Add cherrypicker to prow deployment bundle #7393

Conversation

jennybuckley commented Mar 23, 2018

jennybuckley commented Mar 23, 2018 • edited Loading

BenTheElder commented Mar 23, 2018

jennybuckley commented Mar 23, 2018 • edited Loading

BenTheElder commented Mar 23, 2018

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

spiffxp commented Apr 2, 2018

cjwagner commented Apr 2, 2018

spiffxp commented Apr 2, 2018

cjwagner commented Apr 3, 2018

spiffxp commented May 9, 2018 • edited Loading

stevekuznetsov commented May 10, 2018

fejta commented Jun 8, 2018

fejta-bot commented Aug 6, 2018

spiffxp commented Aug 10, 2018

cblecker commented Aug 10, 2018

stevekuznetsov commented Aug 10, 2018

guineveresaenger commented Aug 27, 2018

cblecker commented Aug 28, 2018

0xmichalis commented Aug 29, 2018

nikhita commented Aug 29, 2018 • edited Loading

nikhita commented Aug 29, 2018

nikhita commented Aug 29, 2018 • edited Loading

cblecker commented Aug 29, 2018 • edited Loading

0xmichalis commented Aug 29, 2018 via email

stevekuznetsov commented Aug 29, 2018

cblecker commented Aug 29, 2018

stevekuznetsov commented Aug 29, 2018

cblecker commented Aug 29, 2018 • edited Loading

fejta-bot commented Nov 27, 2018

fejta commented Nov 28, 2018

fejta-bot commented Dec 28, 2018

jennybuckley commented Mar 23, 2018 •

edited

Loading

jennybuckley commented Mar 23, 2018 •

edited

Loading

spiffxp commented May 9, 2018 •

edited

Loading

nikhita commented Aug 29, 2018 •

edited

Loading

nikhita commented Aug 29, 2018 •

edited

Loading

cblecker commented Aug 29, 2018 •

edited

Loading

cblecker commented Aug 29, 2018 •

edited

Loading