-
-
Notifications
You must be signed in to change notification settings - Fork 17k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feature: add a GitHub action to quell spam PRs #5449
Comments
I am not sure we need an action for this unless there is an action which does more than that one (like spam detection with ai or something). It is easy to delete/close/lock with as many clicks as it is to comment and we don't need an additional third party action (added security risk and maintenance) to do it right? |
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
Not sure about AI spam detection, but almost all of these PRs update the readme, have default name (Update filename), change only a single line and have no description, so it should be rather easy to close them automatically. Most authors of these PRs have a repsitory called "localrepo", so that's another rule that could be used to detect spam from this source (Apna College). |
Yeah this is a one-off issue of the day. Spam PRs have been a problem for years and they come in different varieties. This is why I am hesitant do add an action for this as I would rather ask GitHub for better spam management tooling. |
This comment was marked as off-topic.
This comment was marked as off-topic.
Hey @wesleytodd. I found this on X: https://twitter.com/github/status/1311772722234560517 |
Agree with @wesleytodd. The issue is quite complex because the moderation in GitHub is not an easy thing:
So, I think that we are quite limited on how much we can do with GitHub actions in this case. There are other scenarios that are less frequent but more prone to use GitHub Actions to moderate, for example when people do comments that include offensive content or heavily language. In most projects that I am involved the moderation is done by the humans behind the project or a specific team that volunteer to do it, it is a heavy job. The same way as it is hard to keep a slack/discord/gitter community channel a safe space by moderating content. |
I think historically the express project has not needed the same style of moderation as things like Node.js does. There have been less contentious discussions and mostly the CoC violations have come from folks outside the project so it was relatively simple to ban and move on. I don't think we need to immediately spin up a moderation team but I do think that the Triage team and TC should have the tools to properly moderate. Right now I don't think we have that well in hand. I believe that we can add this to the list of TODOs to address after we can get next weeks TC meeting organized and finished. Are we in agreement that a GH Action is most likely not the direction we would want to take to solve this problem? |
Yes, we can add it to the TODO list and start to work on it in few weeks
+1 from me |
Maybe a pull reqest template would make some people think before creating a PR? |
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
Doesn't need to stop the conversation, but since we specifically don't want to use a GHA to do this then this issue is complete. There are more threads to follow up on, but I think we can do that in other discussions more specific to those. |
A workflow could be an option too, These spam PRs generally don't have more than 2-3 words. Closing PRs with less than 3 words sounds reasonable. GitHub too does something similar in it's documentation repository. GitHub's workflow |
@wesleytodd, I think you should reopen this issue. @gaurishhs made another point about using GitHub actions |
I do agree that workflow is a bit more well suited IMO. I am still hesitant and I would like to also look at other ways but yeah in the mean time lets re-open the issue so we don't end up having multiple on the same topic or miss out on good ideas like @gaurishhs'. |
even one more better thing is maximum spam pr revolve around README.md ,if we can specify to block those pr updating readme for now (upto the spam pr's get less) we can handle this through github actions for this so that good pr's may not effect. |
Is it possible to have the workflow also incorporate a check for new contributors when doing the x words check? I feel like that people who contributed, and already have code merged, in the past do not pose much of a threat in regards to spam. Or am I overthinking it? |
I don't think you're overthinking it @TimGels. Previous contributors should be spared in some way. |
And if word count check is added, they must find some other way to open spam prs. The problem here is those who make spam prs, they just started their development journey and when they got introduced to github (by some youtubers or some articles), they tried to test it by themselves. They didn't have any idea that the thing they are doing is a headace of someone else. The only way we can stop this is by creating awarness. |
True, but knowing that there those who create spam PRs out of malice, raising awareness is not enough. Protective measures are a must too. |
There are Rest APIs for getting the info of prs and also for actions. Making a bot which can automate spam pr closing using those rest apis can help. But not sure how to detect spam. |
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
I agree with @QuantGeekDev, detecting this kind of spam is relatively easy and using AI would be an overkill. A simple rule-based approach will work well in this case: https://github.com/animesh-chouhan/github-spamstop/blob/main/index.js#L48-L66 This uses a pull-request template which needs to be edited with the details of the change which would easily deter a large amount of such PRs. PR template: https://github.com/animesh-chouhan/github-spamstop/blob/main/.github/pull_request_template.md |
This comment was marked as duplicate.
This comment was marked as duplicate.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as duplicate.
This comment was marked as duplicate.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
Really, I think you could probably just reject anything with "update: ''" as a PR title, I don't think that would be too disruptive for normal PRs. |
Maybe we can add a PR template when people want to issue a PR It is probably as this people won't be able to know how to change basic things in the PR template |
Most of them can be auto-closed if they meet all of the below criteria:
This should auto-close almost all of them. Maybe 1 or 2 every month may not match, but it's much less work to manually check and close them. PS: The regex match should be on commit subject, not PR title because the newbie spammers are more likely to change the title. For fun: Close them with a cheeky comment: Congrats! You are now an open source spam contributor. Now learn how to make real contributions and your training will be complete. [Add a Hindi translation of the same.] |
I think we can get auther (who made PR) total lifetime PR count if it is more then 10 or whatever we can prevent that ? like this name: Spam Detect
on:
pull_request:
types:
- opened
jobs:
check_author_pr_count:
runs-on: ubuntu-latest
steps:
- name: Checkout code
uses: actions/checkout@v2
- name: Set up Node.js
uses: actions/setup-node@v4
with:
node-version: '14'
- name: Install dependencies
run: npm install github-api
- name: Count Merged PRs
id: pr_count
run: |
AUTHOR=$(curl -sSL https://api.github.com/repos/${{ github.repository }}/pulls/${{ github.event.pull_request.number }} | jq -r '.user.login')
MERGED_PR_COUNT=$(curl -sSL -H "Authorization: Bearer ${{ secrets.GITHUB_TOKEN }}" "https://api.github.com/search/issues?q=is:pr+author:$AUTHOR+is:merged" | jq -r '.total_count')
echo "merged_pr_count=$MERGED_PR_COUNT" >> $GITHUB_OUTPUT
- name: Check PR Count
run: |
if [[ "${{ steps.pr_count.outputs.merged_pr_count }}" -gt 20 ]]; then
echo "PR count is greater than 20. Proceed with the workflow."
else
echo "PR count is not greater than 20. Blocking the PR."
exit 1
fi this is it's test https://github.com/Kamleshpaul/github-spam-action-test/pulls |
They already made a github action which takes care of this |
Well, it looks like the issue will be solved when the pr is merged |
Hey folks, just to update you: We have a thread open with GitHub support where they are passing along our requests to the various product teams on how to improve the moderation features. I personally am still 👎 on an action which auto closes PRs as it is both a maintenance burden and something best handled by features on GitHub's side. We are focused right now on landing some governance changes and will be following up with re-constituting the triage team. I think ideally this decision is left up to the triage team members. So likely there will not be action on this or the PR until we can get that team organized again. If you are interested in helping that please follow the instructions to get involved there (as some of you already have, thank you very much for that). |
They have slowed down too for the time being, the last one already is two days old. |
This comment was marked as duplicate.
This comment was marked as duplicate.
This comment was marked as duplicate.
This comment was marked as duplicate.
Closing as not planned. Many words have been written now in this issue and others about auto-closing spam PRs, and we have come to the same conclusion every time. It is trivial to implement, but is a sledgehammer approach for what is now a mosquitto level annoyance. Closing and locking a handful of spam issues/PRs now and then is the easiest thing a maitainer will be doing that day, so it doesn't save much by labor. There is very little benefit in having an auto-close action. It will still spam people's Github notifications no matter if it's closed and locked at the speed of a human or a machine. The only thing we'd prevent there is people commenting disparagingly on the PR before it gets locked by a human, generating more notifications. We haven't had any of that in the past couple weeks (we did get some hostile comments when this topic was trending originally, which it no longer is). The PRs are still coming, but intermittently. The storm has passed, this can be reevaluated if another flood happens in the future from the previous source or any other. Thank you all for your eagerness to help. If you want an ear to the ground on things that the project is prioritizing and possibly looking for help with, peruse the Discussions repo https://github.com/expressjs/discussions/issues |
This can be reconsidered with the use of See an example from Renovate: |
Hey @not7cd, I love the idea of "create a meaningful message to the spammer to inform them why it's getting closed". Would you be willing to sign up to own this and help participate in the project to maintain the tooling? The main thing here is that closing a few spam PRs is a simple but short term thing. Working on setting up automation means time taken away from much more important project priorities we have right now. So it is mostly a prioritization thing. So far the folks who showed up to make noise about ways to "solve" this problem have not showed up to reply with this sort of context in the spam PRs. If anyone in the threads wants to start posting these prewritten messages to the authors then we (the maintainers) can stop with our current approach of closing them as spam. This would save us a lot of time and help those users learn, but it is work and we are volunteers. For everyone in this thread, please we need your help:
|
Problem
I was scrolling through Twitter and someone posted about the spam PR.
Possible Solution
Implement this GitHub action. It'll lessen the workload for maintainers.
The text was updated successfully, but these errors were encountered: