Branch Protection Rules support for main branch #16

richm · 2023-02-18T00:32:35Z

This adds a playbook update_branch_protection_rules.yml for
setting Branch Protection Rules for the main branch.

The default rules are quite strict - require approvals, require
status checks, clear approvals if a new commit is pushed, do not
allow admin override, and more.

The status checks are not currently configurable per group or
per role, but you can have additional status checks. For example,
the network role has additional status checks not used by the
other roles. There are additional python status checks, and
additional shell checks, for the roles that support and use those
checks.

More default status checks will be added - we're in the process
of revising the names of the baseos ci status checks, so more
will be coming soon.

Signed-off-by: Rich Megginson rmeggins@redhat.com

richm · 2023-02-18T00:33:04Z

Branch protection rule.pdf
This is an example of what the new rules look like in the kernel_settings role.

martinpitt · 2023-02-18T05:18:23Z

Thanks @richm ! This is quite a nice way to sync configuration across many related GH projects. Big 👍 TBH, I consider most of these restrictions as a "bare minimum", and they should really be enable by default. The only disputable one is "Require branches to be up to date before merging", which isn't suitable for projects which get more than ~ one PR per day and have too expensive CI to constantly rebase (like Cockpit), but for LSR we can afford to enforce it.

Cheers!

tyll · 2023-02-18T21:34:18Z

The only disputable one is "Require branches to be up to date before merging", which isn't suitable for projects which get more than ~ one PR per day and have too expensive CI to constantly rebase (like Cockpit), but for LSR we can afford to enforce it.

GitHub now has beta support for merge queues which should solve this problem for Cockpit as well: https://github.blog/changelog/2021-10-27-pull-request-merge-queue-limited-beta/ - it seems that I might have applied the network role for this already, since I got a notification recently, that we can use it.

The challenge that I see, is that the system roles CI does not seem to be reliable enough for this to properly work.

tyll · 2023-02-18T21:50:54Z

inventory/host_vars/network.yml

+  - DCO
+  - markdownlint
+  - integration (c8s)
+  - integration (c9s)
+  - integration (c7)
+  - python-26
+  - woke


Suggested change

- DCO

- markdownlint

- integration (c8s)

- integration (c9s)

- integration (c7)

- python-26

- woke

- DCO

- markdownlint

- integration (c8s)

- integration (c9s)

- integration (c7)

- python-26

- woke

- Analyze (python)

Currently, the UI also shows a source for each check (DCO, GitHub Actions, GitHub Code Scanning) depending on the status, the alternative seems to be "any source" which might be what this will be configuring, since it does not collect the source. It would be great to keep the sources as well.

The ansible_lint etc. are the defaults applied to all roles - so the network.yml contains only those status checks unique (currently) to network

Currently, the UI also shows a source for each check (DCO, GitHub Actions, GitHub Code Scanning) depending on the status, the alternative seems to be "any source" which might be what this will be configuring, since it does not collect the source. It would be great to keep the sources as well.

Why keep the sources?

Currently, the UI also shows a source for each check (DCO, GitHub Actions, GitHub Code Scanning) depending on the status, the alternative seems to be "any source" which might be what this will be configuring, since it does not collect the source. It would be great to keep the sources as well.

Why keep the sources?

My assumption is that there is that there might be a security impact (maybe not that big since it seems to require write access to the repo to set a state) and to avoid that status checks conflict. For example, if someone experiments with a new tool that would set a markdownlint status, it could approve the PR without the official markdownlint being run. I guess, this also would allow users to circumvent the branch protection rules but they need to have already write access.

Here is the API that allows you to create a status check context with the id: https://docs.github.com/en/graphql/reference/input-objects#requiredstatuscheckinput - the appId field is used to specify a particular action that must already be in the database. Here is the output of searching the network role with the full requiredStatusCheck name + app information:

createdAt: '2017-03-25T13:51:49Z' databaseId: 1861 description: "This App enforces the [Developer Certificate of Origin](https://developercertificate.org/)\ \ (DCO) on Pull Requests. It requires all commit messages to contain\ ... url: https://probot.github.io/apps/dco/ context: DCO ... createdAt: '2020-03-17T21:25:09Z' databaseId: 57789 description: '' name: GitHub Code Scanning slug: github-code-scanning updatedAt: '2021-12-14T16:45:01Z' url: https://github.com/features/security context: CodeQL ... createdAt: '2018-07-30T09:30:17Z' databaseId: 15368 description: Automate your workflow from idea to production slug: github-actions updatedAt: '2019-12-10T19:04:12Z' url: https://help.github.com/en/actions context: markdownlint ....

Except for DCO and CodeQL, every "app" is the generic github-actions "app". I guess I could change the input to reflect that. I'm assuming the databaseId field in the output is the appId field I should use in the input - the documentation doesn't really say. It's odd that "real" github actions like ansible_lint do not have their own appId - perhaps because we have custom code in there - for the ansible-lint check we have some custom code to munge meta/main.yml role_name and namespace - for ansible-test we have code to convert to collection.

Note that the key is the context - this must exactly match the name of the job e.g. https://github.com/linux-system-roles/.github/blob/main/playbooks/templates/.github/workflows/ansible-lint.yml#L12

jobs: ansible_lint:

so as long as there is only 1 github action which defines jobs.ansible_lint, we can be guaranteed that if we refer to a status check context name as ansible_lint it will use the one defined in .github/workflows/ansible-lint.yml, which requires write access to the repo to change.

Unless you can foresee some attack vector based on status check context naming, I would prefer to use status check context name as is being used currently in this PR.

inventory/group_vars/active_roles.yml

richm · 2023-02-18T22:44:49Z

The only disputable one is "Require branches to be up to date before merging", which isn't suitable for projects which get more than ~ one PR per day and have too expensive CI to constantly rebase (like Cockpit), but for LSR we can afford to enforce it.

GitHub now has beta support for merge queues which should solve this problem for Cockpit as well: https://github.blog/changelog/2021-10-27-pull-request-merge-queue-limited-beta/ - it seems that I might have applied the network role for this already, since I got a notification recently, that we can use it.

Ok. That's another thing I need to figure out how to apply to all roles . . . seems like overkill though with as few PRs as we get across all of the system roles, plus the CI problems mentioned below.

The challenge that I see, is that the system roles CI does not seem to be reliable enough for this to properly work.

tyll · 2023-02-20T07:08:22Z

GitHub now has beta support for merge queues which should solve this problem for Cockpit as well: github.blog/changelog/2021-10-27-pull-request-merge-queue-limited-beta - it seems that I might have applied the network role for this already, since I got a notification recently, that we can use it.

Ok. That's another thing I need to figure out how to apply to all roles . . . seems like overkill though with as few PRs as we get across all of the system roles, plus the CI problems mentioned below.

The unreliable CI and possible high latency is actually an argument for the merge queues because it will reduce the amount of necessary CI runs on a specific day when there is a deadline and changes pile up.

richm · 2023-02-20T14:23:26Z

GitHub now has beta support for merge queues which should solve this problem for Cockpit as well: github.blog/changelog/2021-10-27-pull-request-merge-queue-limited-beta - it seems that I might have applied the network role for this already, since I got a notification recently, that we can use it.

Ok. That's another thing I need to figure out how to apply to all roles . . . seems like overkill though with as few PRs as we get across all of the system roles, plus the CI problems mentioned below.

The unreliable CI and possible high latency is actually an argument for the merge queues because it will reduce the amount of necessary CI runs on a specific day when there is a deadline and changes pile up.

I still don't think it is necessary to use merge queues for system roles, despite the recently flurry of activity in the network role.

tyll · 2023-02-20T16:07:03Z

I still don't think it is necessary to use merge queues for system roles, despite the recently flurry of activity in the network role.

It seems it is also necessary to have merge queues for automated merging to properly work. Otherwise GitHub seems to wait for a human to click the "rebase" button. I enabled merge queues for the network role so we can take a look there.

Regarding disallowing admins to override the branch protection rules: I used this for the current network changelog PR since it was waiting on the integration tests and it is also the fallback in case the code owner is unavailable. What's the plan for that? I guess there should also be a more sophisticated codeowners file to add the system-roles core team as owners for LSR specific paths such as the changelog, maybe tox and .github?

richm · 2023-02-20T16:39:06Z

I still don't think it is necessary to use merge queues for system roles, despite the recently flurry of activity in the network role.

It seems it is also necessary to have merge queues for automated merging to properly work. Otherwise GitHub seems to wait for a human to click the "rebase" button. I enabled merge queues for the network role so we can take a look there.

Ok. Then the network role will be the PoC for merge queues, and we'll see if they will be generally useful for all of the system roles.

Regarding disallowing admins to override the branch protection rules: I used this for the current network changelog PR since it was waiting on the integration tests and it is also the fallback in case the code owner is unavailable.

Right.

What's the plan for that?

Typically what I do is go into Settings -> Branch Protection Rules and turn off this switch, merge the change, and then turn back on the switch.

I guess there should also be a more sophisticated codeowners file to add the system-roles core team as owners for LSR specific paths such as the changelog, maybe tox and .github?

You can do that with codeowners?

richm · 2023-02-20T21:26:24Z

I still don't think it is necessary to use merge queues for system roles, despite the recently flurry of activity in the network role.

It seems it is also necessary to have merge queues for automated merging to properly work. Otherwise GitHub seems to wait for a human to click the "rebase" button. I enabled merge queues for the network role so we can take a look there.

Ok. Then the network role will be the PoC for merge queues, and we'll see if they will be generally useful for all of the system roles.

Regarding disallowing admins to override the branch protection rules: I used this for the current network changelog PR since it was waiting on the integration tests and it is also the fallback in case the code owner is unavailable.

Right.

What's the plan for that?

Typically what I do is go into Settings -> Branch Protection Rules and turn off this switch, merge the change, and then turn back on the switch.

The only repo currently that has isAdminEnforced: true is the cockpit role - I guess @martinpitt prefers the extra layer of security and isn't too concerned with having to override status checks/approvals and force merge. Martin - are you ok if we set isAdminEnforced: false for now? @tyll are you willing to try isAdminEnforced: true?

It's too bad there isn't a switch to "allow admin to merge if CI/github action check failure" but "disallow admin to merge if there aren't enough reviews".

I guess there should also be a more sophisticated codeowners file to add the system-roles core team as owners for LSR specific paths such as the changelog, maybe tox and .github?

You can do that with codeowners?

Apparently yes, but I would rather do that in a subsequent PR.

richm · 2023-02-21T17:28:56Z

@tyll @martinpitt 80123f9 allows isAdminEnforced per role, and makes cockpit the only role currently that uses true

This adds a playbook `update_branch_protection_rules.yml` for setting Branch Protection Rules for the `main` branch. The default rules are quite strict - require approvals, require status checks, clear approvals if a new commit is pushed, do not allow admin override, and more. The status checks are not currently configurable per group or per role, but you can have additional status checks. For example, the network role has additional status checks not used by the other roles. There are additional python status checks, and additional shell checks, for the roles that support and use those checks. More default status checks will be added - we're in the process of revising the names of the baseos ci status checks, so more will be coming soon. Signed-off-by: Rich Megginson <rmeggins@redhat.com>

tyll reviewed Feb 18, 2023

View reviewed changes

inventory/group_vars/active_roles.yml Outdated Show resolved Hide resolved

richm added 6 commits March 10, 2023 11:44

update README

23268ce

sort status checks

966e972

allow isAdminEnforced per-role

0f89832

additional debugging

bbf071d

merge group permissions wip

88c0572

richm force-pushed the branch-protection-rules branch from fc2358f to 88c0572 Compare March 10, 2023 18:44

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Branch Protection Rules support for main branch #16

Branch Protection Rules support for main branch #16

richm commented Feb 18, 2023

richm commented Feb 18, 2023 •

edited

Loading

martinpitt commented Feb 18, 2023

tyll commented Feb 18, 2023

tyll Feb 18, 2023 •

edited

Loading

richm Feb 18, 2023

richm Feb 19, 2023

tyll Feb 20, 2023

richm Feb 20, 2023

richm commented Feb 18, 2023

tyll commented Feb 20, 2023

richm commented Feb 20, 2023

tyll commented Feb 20, 2023

richm commented Feb 20, 2023

richm commented Feb 20, 2023

richm commented Feb 21, 2023

Branch Protection Rules support for main branch #16

Are you sure you want to change the base?

Branch Protection Rules support for main branch #16

Conversation

richm commented Feb 18, 2023

richm commented Feb 18, 2023 • edited Loading

martinpitt commented Feb 18, 2023

tyll commented Feb 18, 2023

tyll Feb 18, 2023 • edited Loading

Choose a reason for hiding this comment

richm Feb 18, 2023

Choose a reason for hiding this comment

richm Feb 19, 2023

Choose a reason for hiding this comment

tyll Feb 20, 2023

Choose a reason for hiding this comment

richm Feb 20, 2023

Choose a reason for hiding this comment

richm commented Feb 18, 2023

tyll commented Feb 20, 2023

richm commented Feb 20, 2023

tyll commented Feb 20, 2023

richm commented Feb 20, 2023

richm commented Feb 20, 2023

richm commented Feb 21, 2023

richm commented Feb 18, 2023 •

edited

Loading

tyll Feb 18, 2023 •

edited

Loading