Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for allowing skipped checks in has_successful_status #789

Conversation

iainlane
Copy link
Contributor

The has_successful_status predicate currently requires predicates to be present and successful in order to pass. But workflows can be run conditionally - for example only if certain paths change - and it is currently not very convenient to write a policy which considers such skipped workflows as passing. The only feasible workaround is to duplicate the path filters in the policy, and this quickly gets unwieldy and prone to getting out-of-sync in large repositories.

Here we add direct support for specifying such rules. This is done by introducing a new alernative form that has_successful_status predicates can take:

has_successful_status:
  options:
    skipped_is_success: true
  statuses:
    - "status 1"
    - "status 2"

In this mode, we will consider the skipped result as acceptable. The current form:

has_successful_status:
  - "status 1"
  - "status 2"

remains supported. We have done this by implementing a custom unmarshaling function to be able to handle both forms.

Closes: #760

@palantirtech
Copy link
Member

Thanks for your interest in palantir/policy-bot, @iainlane! Before we can accept your pull request, you need to sign our contributor license agreement - just visit https://cla.palantir.com/ and follow the instructions. Once you sign, I'll automatically update this pull request.

@iainlane
Copy link
Contributor Author

I'll square off the CLA question internally as soon as I can, but in the meantime hopefully we can proceed with reviews 😁

@bluekeyes
Copy link
Member

Thanks for this proposal! Looking at an actual implementation, I'm now thinking this could work better as a new predicate that deprecates has_successful_status. This new has_status predicate would let you specify both the contexts and the allowed conclusions:

# Option 1: same conclusions for all contexts
has_status:
  conclusions: ["success", "skipped"]
  contexts:
    - "status 1"
    - "status 2"

# Option 2: conclusions per context
has_status:
  "status 1": ["success"]
  "status 2": ["success", "skipped"]

Internally, we can convert the has_successful_status predicate in to this new one to avoid duplicating logic. The new predicate also avoids having two forms of a single predicate and allows policies that can work with failure, neutral, and other conclusions. What do you think?

@iainlane iainlane force-pushed the iainlane/has-successful-status-allow-skipped-workflows-to-count-as-passes branch 2 times, most recently from b911ad5 to f75329e Compare July 1, 2024 12:50
@iainlane
Copy link
Contributor Author

iainlane commented Jul 1, 2024

Cheers for the review 👍

Thanks for this proposal! Looking at an actual implementation, I'm now thinking this could work better as a new predicate that deprecates has_successful_status.

Sure, that makes sense. It was a bit awkward that we had two forms in one predicate.

Take a look at what I've got now? It's fundamentally the same as what I had before, except now:-

  1. has_successful_status is kept separate
  2. Any conclusion can be given, instead of only allowing skipped as we did before

We've still got a custom unmarshal function here, which handles unpacking either form into the same struct so they can share the implementation.

A bit more stuff has appeared to handle the list of statuses but it's not too bad.

# Option 1: same conclusions for all contexts
has_status:
  conclusions: ["success", "skipped"]
  contexts:
    - "status 1"
    - "status 2"

# Option 2: conclusions per context
has_status:
  "status 1": ["success"]
  "status 2": ["success", "skipped"]

I went for the first one. You can express the second in terms of it as far as I can see, and it feels to me like it'll be more common to want to apply the same conclusions to a whole predicate than select them individually. But I don't feel that strongly about this - it can be changed to the other form easily enough.

I've used statuses rather than contexts. That's only because I've not used that word in the context of GitHub CI and I found it bounced off me a bit. But it also doesn't matter too much.

@iainlane
Copy link
Contributor Author

iainlane commented Jul 5, 2024

I was just testing this in a real deployment, instead of using the testsuite... and this doesn't work like I thought it did.

Given this workflow

on:
  pull_request:
    paths:
      - a/**

jobs:
   # some jobs here

There is nothing reported when the job isn't run due to the path filter not being matched. Or at least nothing that I can find.

I think the PR is still useful for conditional jobs, but not conditional workflows, unless there is a way we can see those.

@iainlane iainlane force-pushed the iainlane/has-successful-status-allow-skipped-workflows-to-count-as-passes branch 2 times, most recently from c22674b to 451855b Compare July 7, 2024 11:10
iainlane added 3 commits July 7, 2024 12:12
If a predicate has an `allowedConclusions` member, it will unmarshal
from a list `["a", "b", "c"]` into a set, so that the handler can easily
check if the incoming conclusion was allowed.
The `has_successful_status` predicate currently requires predicates to
be present and successful in order to pass. But workflows can be run
conditionally - for example only if certain paths change - and it is
currently not very convenient to write a policy which considers such
skipped workflows as passing. The only feasible workaround is to
duplicate the path filters in the policy, and this quickly gets unwieldy
and prone to getting out-of-sync in large repositories.

Here we add direct support for specifying such rules. This is done by
introducing a new alernative form that `has_successful_status`
predicates can take:

```yaml
has_successful_status:
  options:
    skipped_is_success: true
  statuses:
    - "status 1"
    - "status 2"
```

In this mode, we will consider the `skipped` result as acceptable. The
current form:

```yaml
has_successful_status:
  - "status 1"
  - "status 2"
```

remains supported. We have done this by implementing a custom
unmarshaling function to be able to handle both forms.

Closes: palantir#760
Implement `has_successful_status` in terms of `has_status` and deprecate
it.

This avoids us having two forms for one predidcate.

An example would be

```yaml
has_status:
  conclusions: ["success", "skipped"]
  statuses:
    - "status 1"
    - "status 2"
```
@iainlane iainlane force-pushed the iainlane/has-successful-status-allow-skipped-workflows-to-count-as-passes branch from 451855b to d96d2a0 Compare July 7, 2024 11:13
@bluekeyes
Copy link
Member

There is nothing reported when the job isn't run due to the path filter not being matched. Or at least nothing that I can find.

Yeah, that sounds right. If a workflow trigger doesn't match, GitHub skips doing anything with the workflow. Posting skipped statuses for the jobs probably requires doing at least some evaluation to determine what jobs exist.

It's pretty difficult to account for things like this, since all statuses start out as missing until the reporting system responds to a webhook and posts the initial value. When Policy Bot evaluates a PR, there's no way that I know of to distinguish between a status that will never appear and one that will appear, but just hasn't appeared yet.

If you're able to share more details about what you want to achieve with your policy (maybe on the original issue), I can try to think about alternate approaches that might work.

@@ -48,6 +48,8 @@ type Options struct {

RequestReview RequestReview `yaml:"request_review"`

CountSkippedStatusAsPassed bool `yaml:"count_skipped_status_as_passed"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is leftover from a previous iteration? It appears unused now

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes indeed. I've dropped it in 2d9c41c.

README.md Outdated
Comment on lines 325 to 330
# "has_successful_status" is satisfied if the status checks that are specified
# are marked successful on the head commit of the pull request.
has_successful_status:
- "status-name-1"
- "status-name-2"
- "status-name-3"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the benefit of people with older policies, I'd like to keep the documentation for this predicate, but with a note that it is deprecated:

Deprecated: Use `has_status` instead, which is more flexible

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep sure: 06ae646

// deprecated `has_successful_status` format.
// 2. A full structure with `statuses` and `conclusions` fields as in
// `has_status`.
func (pred *HasStatus) UnmarshalYAML(unmarshal func(interface{}) error) error {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While this works, it does mean people could use the new has_status predicate by providing only a list of strings or use the old has_successful_status predicate with a conclusions list that doesn't contain success.

Instead, I suggest keeping the old HasSuccessfulStatus type and replacing the Evaluate method with a version that creates a HasStatus object and calls Evaluate on it. That would keep these as two separate predicates for policy parsing purposes, while avoiding most of the code duplication.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. Done here: 1793921.

Since we brought the type back, I added it to the testsuite again. Then it started to look a bit unwieldy with all the cases going on, so I reworked that a bit to be hopefully clearer vs. my previous.

Comment on lines 34 to 40
type unit struct{}
type set map[string]unit

// allowedConclusions can be one of:
// action_required, cancelled, failure, neutral, success, skipped, stale,
// timed_out
type allowedConclusions set
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curious for your opinion here: there's a decent amount of code in this PR for converting between this set representation and a list of strings in different circumstances. Given that the number of elements in the set is almost always going to be less than 3, what do you think about using a simple []string and testing for matching conclusions with slices.Contains?

That could allow duplicates, but duplicates would be obvious in the policy definition and shouldn't cause any problems (other than slightly awkward messages.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah no worries. I'm used to writing such code so I tend to reach for it but it's fine to skip it in this case. See ebee532.

If needed we could dedupe when printing, when looking at the slice or any other time. But I doubt it'll be a big deal since we're talking about a small number of possible values.

iainlane added 4 commits July 14, 2024 12:21
Mark it as deprecated instead of removing it.
This is more direct.

We want to test `HasSuccessfulStatus` now, so jiggle the testsuite a bit
to make that less repetitive by introducing a `StatusTestSuite` struct.
It's more code for not too much gain since we're only dealing with tiny
slices and not really looking through them that often.
Copy link
Contributor Author

@iainlane iainlane left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the second look. All the comments should be addressed now. 👍

If you're able to share more details about what you want to achieve with your policy (maybe on the original issue), I can try to think about alternate approaches that might work.

Okey dokey, I'll do that.

@@ -48,6 +48,8 @@ type Options struct {

RequestReview RequestReview `yaml:"request_review"`

CountSkippedStatusAsPassed bool `yaml:"count_skipped_status_as_passed"`
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes indeed. I've dropped it in 2d9c41c.

README.md Outdated
Comment on lines 325 to 330
# "has_successful_status" is satisfied if the status checks that are specified
# are marked successful on the head commit of the pull request.
has_successful_status:
- "status-name-1"
- "status-name-2"
- "status-name-3"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep sure: 06ae646

Comment on lines 34 to 40
type unit struct{}
type set map[string]unit

// allowedConclusions can be one of:
// action_required, cancelled, failure, neutral, success, skipped, stale,
// timed_out
type allowedConclusions set
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah no worries. I'm used to writing such code so I tend to reach for it but it's fine to skip it in this case. See ebee532.

If needed we could dedupe when printing, when looking at the slice or any other time. But I doubt it'll be a big deal since we're talking about a small number of possible values.

// deprecated `has_successful_status` format.
// 2. A full structure with `statuses` and `conclusions` fields as in
// `has_status`.
func (pred *HasStatus) UnmarshalYAML(unmarshal func(interface{}) error) error {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. Done here: 1793921.

Since we brought the type back, I added it to the testsuite again. Then it started to look a bit unwieldy with all the cases going on, so I reworked that a bit to be hopefully clearer vs. my previous.

Otherwise people can't directly construct a `HasStatus`
Copy link
Member

@bluekeyes bluekeyes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, thanks for iterating on this with me. Let me know when you have the CLA sorted out (or if you have any questions about it) and then we can merge this.

@iainlane
Copy link
Contributor Author

iainlane commented Jul 24, 2024

Looks good, thanks for iterating on this with me. Let me know when you have the CLA sorted out (or if you have any questions about it) and then we can merge this.

I got the reply from our legal folks earlier. It's good to go and I've signed it.

Thanks - if you could hit the button for me, please. 👍

I'll get to the other PRs and issues tomorrow now I shouldn't be blocking from this side any more.

@bluekeyes bluekeyes merged commit dd73f68 into palantir:develop Jul 24, 2024
8 checks passed
@stevoland
Copy link

Thanks so much to both of you, excellent stuff!

@bluekeyes
Copy link
Member

@iainlane while looking at #807, I realized we may have a bug related to this: the handler for the check_run event, and probably the handler for the status event, exits early when the conclusion of the incoming status is not success. This was an optimization to make fewer requests, but it means we could miss evaluating a policy that requires some other status if there are no other triggering events.

Unfortunately, this could complicate the request optimizations you might be looking at. While we have the "trigger" concept for policies that deal with statuses, it doesn't include the different conclusions, so we may needlessly evaluate policies on pending events when they could only be satisfied by successes.

@iainlane iainlane deleted the iainlane/has-successful-status-allow-skipped-workflows-to-count-as-passes branch August 2, 2024 09:04
@iainlane
Copy link
Contributor Author

iainlane commented Aug 2, 2024

@iainlane while looking at #807, I realized we may have a bug related to this: the handler for the check_run event, and probably the handler for the status event, exits early when the conclusion of the incoming status is not success. This was an optimization to make fewer requests, but it means we could miss evaluating a policy that requires some other status if there are no other triggering events.

Unfortunately, this could complicate the request optimizations you might be looking at. While we have the "trigger" concept for policies that deal with statuses, it doesn't include the different conclusions, so we may needlessly evaluate policies on pending events when they could only be satisfied by successes.

Ah, yeah, hmm. Good spot.

Would it be acceptable to look for status being completed instead? So we'd only react when runs are finished, but now if they finished with any conclusion.

Or would we be looking at figuring out to get at the set of interesting conclusions in the handler? Feels tricky.

@bluekeyes
Copy link
Member

Yeah, I think restricting has_status to completed conclusions and then looking for that would be reasonable. If someone has a compelling reason to write a policy that matches on pending statuses in the future, we can revisit at that point.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Feature Request: Option to count skipped jobs in has_successful_status
4 participants