-
-
Notifications
You must be signed in to change notification settings - Fork 59
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Cron job to close stale PRs #93
Comments
+1. on today, there are 17 open PR where the CLA is not signed. |
@Mariatta I am interested in working on this. |
@souravsingh Sure! I think this is not a straight forward task, and it has some fun challenges :) My initial thoughts about this: A PR is considered "stale" if:
and
The workflow:
Side task, I think it can be a separate task. It will be a webhook instead of a cron.
I think the code can be added to bedevere, so we don't need to create any new bot. |
My one worry with adding this to Bedevere is GitHub quota. Going through 453 issues to find out details about them is going to eat through quota rather quickly and might be too much with Bedevere's current quota usage. |
Hm, the rate limit is something I haven't considered. |
Quite possibly. We might not know until we do the queries and see what the count comes out to. This might require smearing the work across the day, e.g. start at midnight, find out how many pages of issues there are, and then do a page per hour (where hopefully there are less than 24 pages 😉 ). |
For the first task of applying the The API calls will be: |
Nice catch on the GET arguments! So it looks like 40 issues in total ATM are either labeled "CLA not signed" or "awaiting changes". But the problem is that out of 19 pages worth of issues, only the first 5 have any "awaiting" label really applied to them. So that means the 17 issues that are awaiting changes could be extrapolated to be about 68 (although possibly more since those PRs are probably/hopefully languishing for a reason), which puts it at 90 issues or more (aside: maybe we should write a script that we run once that goes through and does a back-fill of all issues without an "awaiting" label? We could have a custom message when we set "awaiting changes" saying "we're totally guessing here, so please say 'I didn't expect the Spanish Inquisition' if we're wrong" or something). So let's worst-case it to 200 issues. If a PR is now considered stale/inactive, then there will be a comment and an added label (so 2 requests). If a PR has passed it's one week grace period, then it will be a comment and closure (2 requests). Add in pagination on the issue list and we're looking at worst-case 400+ requests out of our (I believe) 5000 request quota, so we're approaching 10%. I have opened python/bedevere#64 so that we can see what kind of margin we currently have so we don't have to keep guessing. |
…ting changes and no activity python/core-workflow#93
Hi, |
@chetankm-cs it will be however you set up a cron job on Heroku. |
One option is to use the crontab scheduler in celery. |
I have sent a pull request for this issue. |
There's existing integration from GitHub for this: https://probot.github.io/apps/stale/ |
Can we update the stale workflow to automatically close stale PRs without a signed CLA? That should be pretty uncontroversial, as we cannot accept PRs from persons who did not sign the CLA. I'll put up a PR if this sounds good to you all. Propsed patchNote: this patch has not been tested; GitHub Actions is not my area of expertise. diff --git a/.github/workflows/stale.yml b/.github/workflows/stale.yml
index e3b8b9f942..85eb108348 100644
--- a/.github/workflows/stale.yml
+++ b/.github/workflows/stale.yml
@@ -16,7 +16,18 @@ jobs:
- uses: actions/stale@v4
with:
repo-token: ${{ secrets.GITHUB_TOKEN }}
+ exempt-pr-labels: 'CLA not signed'
stale-pr-message: 'This PR is stale because it has been open for 30 days with no activity.'
stale-pr-label: 'stale'
days-before-stale: 30
days-before-close: -1
+
+ - uses: actions/stale@v4
+ with:
+ repo-token: ${{ secrets.GITHUB_TOKEN }}
+ only-pr-labels: 'CLA not signed'
+ stale-pr-message: 'This PR is stale because it has been open for 30 days with no activity. If the CLA is not signed within one week, it will be closed.'
+ stale-pr-label: 'stale'
+ close-pr-message: 'Closing this stale PR because the CLA is still not signed.'
+ days-before-stale: 30
+ days-before-close: 7
|
@erlend-aasland the patch looks good but I'd update the message to include a link to the documentation explaining why it's important and how to sign it. |
Good point, I'll do that on the PR. |
Idea looks good to me but I'd give the author 14 days for signing the CLA as this sometimes requires communicating with the employer, etc. 7 days is a little tight. Question: if there is activity on the PR after the label is applied, or the label is subsequently removed, will the PR still be closed? If not, the current suggested message is misleading. As for giving more information on how to sign the CLA I think that's redundant with the original message bedevere-bot is sending asking for the CLA. We can put a link to https://devguide.python.org/pullrequest/#licensing as a reminder. |
I'll try and find out. If someone with more GitHub Actions experience than me already knows, please shout out :) |
I believe not: activity will remove the |
And to test it, I've applied PR python/cpython#30500 to my fork's @erlend-aasland Would you like to create a dummy PR (e.g. edit README) to my fork https://github.com/hugovk/cpython? I'll add the Actually, let's do two dummy PRs: one I'll add the label to, and one I won't. |
Thanks, @hugovk, will do! |
Quoting the stale action REAME: "If an update/comment occur on stale issues or pull requests, the stale label will be removed and the timer will restart" |
The two test PRs:
First cron run: Both steps ran and processed the two PRs. No actions taken because it's less than 1 day since there was activity. Will check back tomorrow! |
Test restarted: We updated the PR to avoid double negatives ("has 'CLA signed'?" instead of "does not have 'CLA not signed'?" python/cpython#30500 (comment)) and I've updated the test repo and PRs to match, reseting the inactivity counters. |
First run with new test: Looks good so far, as last time:
|
No, it will not be closed. See python/cpython#30500 (comment) |
Test complete, both worked as expected! 👍 |
python/cpython#30500 has now been merged. Should we still keep this ticket open? |
We could keep this open a little longer to monitor its progress, especially after the next daily cron run? Here's the current list of 22 I expect, if no other activity before the cron:
|
The new cron ran and we now have 15
✅
Only 4 were closed: 13 still open: #22696 #27848 #27552 #27341 #25744 #26930 #26458 #24556 #24942 #24474 #23998 #22695 #22671 1 closed, 1 merged 🎉, 2 still open. So why didn't we close all the expected ones? Because the cron job action doesn't run on all the PRs. The last run: https://github.com/python/cpython/actions/runs/1861772715 From "Check PRs with 'CLA not signed' label":
That warning:
https://github.com/actions/stale#operations-per-run explains how the action is rate limited, and we can increase it but "if reached will block these API calls for one hour (or API calls from other actions using the same user (a.k.a.: the github-token from the repo-token option)". The limit:
https://docs.github.com/en/rest/overview/resources-in-the-rest-api#requests-from-github-actions That sounds pretty high, I expect we can raise it. If Note we have two steps running in the action, one for adding the stale label, one for closing stale+no-CLA. Also: We can change the order PRs are fetched, from default https://github.com/actions/stale#ascending Does it make sense to deal with the older PRs first? They're more likely to be stale/abandoned. At least for the closing step. |
Sounds good to me :)
+1 |
Please see PR python/cpython#31407. |
That was merged and ran last night, and is looking good. https://github.com/python/cpython/runs/5255121854?check_suite_focus=true Some more detail. From "Check PRs with 'CLA not signed' label":
-> Switching to processing oldest first has helped, it's labelled some old PRs from 2017. From "Check PRs with 'CLA not signed' label":
->
These 10 are now closed: #27848 #27552 #27341 #26930 #26458 #24942 #24474 #23998 #22695 #22671 And 3 still open: #22696 curiously had the stale label removed:
So what happened? The label was applied some time ago, comments were then made, the label should have been removed but the PR dropped off the list of processed PRs until now. So that's okay. |
Looks good. Thanks for going through all the details for us, @hugovk! |
Just came to mind, the BPO -> GitHub issues migration is happening soon: We don't want this action running on issues (at least not yet, it's something to reconsider once the migration is done). And we're using
https://github.com/actions/stale#only-pr-labels |
I'm classifying "stale" as someone who created a PR but has not signed the CLA or someone who received a code review requesting changes and and never followed up with more commits or review comments. These two scenarios specifically make sure that the PR submitter is the hold up and not any core dev(s). (A PR that can't be merged probably shouldn't be classified as such since they shouldn't have to perpetually maintain a PR if they don't get a review from a core dev).
It would probably be nice to receive a warning that a PR will be closed within a week, but re-opening a PR I don't think is hard either so it should be a big deal to just close either if it's too much work to give the warning (that would need to be verified, though, that a PR submitter and re-open their own PR before making a decision about this).
The text was updated successfully, but these errors were encountered: