Spam requests to nodejs.org from Artifactory deployments #3223

Closed
targos opened this issue Mar 15, 2023 · 42 comments

@targos
Member

targos commented Mar 15, 2023

Our server has been consistently bombarded by requests to nonexistent paths that correspond to npm package names.

For example:

2023/03/15 11:31:13 [error] 8929#8929: *67745 open() "/home/dist/nodejs/release/yargs" failed (2: No such file or directory), client: 162.158.62.202, server: nodejs.org, request: "GET /dist/yargs HTTP/1.1", host: "nodejs.org"
2023/03/15 11:31:13 [error] 8930#8930: *60355 open() "/home/dist/nodejs/release/cli-truncate" failed (2: No such file or directory), client: 162.158.63.149, server: nodejs.org, request: "GET /dist/cli-truncate HTTP/2.0", host: "nodejs.org"
2023/03/15 11:31:13 [error] 8930#8930: *61845 open() "/home/dist/nodejs/release/karma-parallel" failed (2: No such file or directory), client: 162.158.63.150, server: nodejs.org, request: "HEAD /dist/karma-parallel HTTP/2.0", host: "nodejs.org"
2023/03/15 11:31:13 [error] 8932#8932: *60398 open() "/home/dist/nodejs/release/is-running" failed (2: No such file or directory), client: 162.158.63.149, server: nodejs.org, request: "HEAD /dist/is-running HTTP/2.0", host: "nodejs.org"
2023/03/15 11:31:13 [error] 8929#8929: *67745 open() "/home/dist/nodejs/release/is-url" failed (2: No such file or directory), client: 162.158.62.202, server: nodejs.org, request: "HEAD /dist/is-url HTTP/1.1", host: "nodejs.org"
2023/03/15 11:31:13 [error] 8930#8930: *60365 open() "/home/dist/nodejs/release/npm-run-path" failed (2: No such file or directory), client: 162.158.62.240, server: nodejs.org, request: "HEAD /dist/npm-run-path HTTP/2.0", host: "nodejs.org"
2023/03/15 11:31:13 [error] 8929#8929: *67745 open() "/home/dist/nodejs/release/ms/-/ms-0.7.2.tgz" failed (2: No such file or directory), client: 162.158.62.202, server: nodejs.org, request: "HEAD /dist/ms/-/ms-0.7.2.tgz HTTP/1.1", host: "nodejs.org"
2023/03/15 11:31:13 [error] 8933#8933: *61259 open() "/home/dist/nodejs/release/is2" failed (2: No such file or directory), client: 162.158.63.150, server: nodejs.org, request: "HEAD /dist/is2 HTTP/2.0", host: "nodejs.org"
2023/03/15 11:31:13 [error] 8933#8933: *33223 open() "/home/dist/nodejs/release/@rollup/plugin-typescript" failed (2: No such file or directory), client: 162.158.62.240, server: nodejs.org, request: "HEAD /dist/@rollup%2Fplugin-typescript HTTP/2.0", host: "nodejs.org"
2023/03/15 11:31:13 [error] 8933#8933: *60282 open() "/home/dist/nodejs/release/dedent" failed (2: No such file or directory), client: 162.158.63.150, server: nodejs.org, request: "GET /dist/dedent HTTP/2.0", host: "nodejs.org"
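
To see which names are being requested, the log lines above can be reduced to a method and a decoded path. This is only a minimal sketch: the regular expression and the `requested_names` helper are illustrative assumptions based on the line format shown here, not a tool used by the maintainers.

```python
import re
from urllib.parse import unquote

# Assumed pattern: captures the HTTP method and whatever follows /dist/
# in the 'request: "..."' field of an nginx error-log line.
REQUEST_RE = re.compile(
    r'request: "(?P<method>[A-Z]+) /dist/(?P<rest>\S*) HTTP/[\d.]+"'
)

def requested_names(log_lines):
    """Return (method, percent-decoded path below /dist/) per matching line."""
    found = []
    for line in log_lines:
        match = REQUEST_RE.search(line)
        if match:
            found.append((match["method"], unquote(match["rest"])))
    return found

# Three lines copied from the log excerpt above.
SAMPLE_LINES = [
    '2023/03/15 11:31:13 [error] 8929#8929: *67745 open() '
    '"/home/dist/nodejs/release/yargs" failed (2: No such file or directory), '
    'client: 162.158.62.202, server: nodejs.org, '
    'request: "GET /dist/yargs HTTP/1.1", host: "nodejs.org"',
    '2023/03/15 11:31:13 [error] 8929#8929: *67745 open() '
    '"/home/dist/nodejs/release/ms/-/ms-0.7.2.tgz" failed (2: No such file or directory), '
    'client: 162.158.62.202, server: nodejs.org, '
    'request: "HEAD /dist/ms/-/ms-0.7.2.tgz HTTP/1.1", host: "nodejs.org"',
    '2023/03/15 11:31:13 [error] 8933#8933: *33223 open() '
    '"/home/dist/nodejs/release/@rollup/plugin-typescript" failed (2: No such file or directory), '
    'client: 162.158.62.240, server: nodejs.org, '
    'request: "HEAD /dist/@rollup%2Fplugin-typescript HTTP/2.0", host: "nodejs.org"',
]

for method, name in requested_names(SAMPLE_LINES):
    print(method, name)
```

Decoding `%2F` makes scoped package names such as `@rollup/plugin-typescript` recognizable, which supports reading these paths as npm registry lookups.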

I used the Cloudflare Instant Logs to monitor the 404 requests for a few seconds and identified two sources:

"ClientRequestUserAgent": "Artifactory/7.41.7 74107900",
"ClientIP": "198.203.175.175",

IP belongs to UnitedHealth Group Incorporated.

"ClientRequestUserAgent": "Artifactory/6.23.13",
"ClientIP": "200.220.180.71",

IP belongs to Banco Santander (Brasil) S.A.

I don't really know what we can do at this point but ideally we should find a way to make them stop.
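
A short sketch of how such log events could be grouped by client software and IP address. The field names `ClientRequestUserAgent` and `ClientIP` come from the excerpts above; the one-JSON-object-per-line input format and the `count_by_product` helper are assumptions made for illustration.

```python
import json
from collections import Counter

def count_by_product(ndjson_lines):
    """Count events per (user-agent product, client IP) pair.

    Assumes one JSON object per line; the product is the part of the
    user agent before the first '/'.
    """
    counts = Counter()
    for raw in ndjson_lines:
        raw = raw.strip()
        if not raw:
            continue  # skip blank lines
        event = json.loads(raw)
        agent = event.get("ClientRequestUserAgent", "")
        product = agent.split("/", 1)[0]
        counts[(product, event.get("ClientIP", ""))] += 1
    return counts

# The two events quoted above, rewritten as one JSON object per line.
SAMPLE_EVENTS = [
    '{"ClientRequestUserAgent": "Artifactory/7.41.7 74107900", "ClientIP": "198.203.175.175"}',
    '{"ClientRequestUserAgent": "Artifactory/6.23.13", "ClientIP": "200.220.180.71"}',
]

for (product, ip), n in count_by_product(SAMPLE_EVENTS).items():
    print(product, ip, n)
```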

@targos
Member Author

targos commented Mar 15, 2023

There are also requests for Java libraries:

For example:

"ClientIP": "137.254.7.167",
"ClientRequestPath": "/dist/org/netbeans/api/org-openide-io/RELEASE82/org-openide-io-RELEASE82.jar",
"ClientRequestUserAgent": "Nexus/2.15.1-02 (PRO; Linux; 5.4.17-2136.313.6.el7uek.x86_64; amd64; 1.8.0_351) apacheHttpClient4x/2.15.1-02",

@richardlau
Member

I suppose in the worst-case scenario we could probably block those IP addresses at the Cloudflare level.

@domdomegg

I suspect someone has entered nodejs.org in the URL field for their upstream NPM registry: https://www.jfrog.com/confluence/display/JFROG/npm+Registry

(and possibly somehow for their Java libraries)

Blocking the IPs at the Cloudflare level probably works if it's just these two.

If it's more people getting confused then maybe:

  1. Reach out to JFrog and ask them to add a check in their product that shows a warning if people enter nodejs.org in their config here.
  2. Make some fake libraries that just print out a message explaining they've misconfigured their software product and terminate. Could suggest that they probably want to use a registry URL like registry.npmjs.org. Then serve these with nginx for NPM registry-looking URLs and hope the message gets read once their builds start failing.

@rvagg
Member

rvagg commented Mar 16, 2023

This has been going on for years, IIRC: logs full of requests for Java components from garbage Artifactory deployments. I've never tried to do anything about it because I've not had a great deal of faith in getting anything sensible out of that ecosystem, but maybe someone should!

@ovflowd
Member

ovflowd commented Mar 16, 2023

+1 on blocking the known IP addresses

@targos
Member Author

targos commented Mar 16, 2023

Just for monitoring, I created a CF firewall rule to block 198.203.175.175. After a few minutes it already blocked 3.15k requests.

How about replacing it with this:
[Screenshot: CleanShot 2023-03-16 at 13 29 59]

@ovflowd
Member

ovflowd commented Mar 16, 2023

+1 for user agent filtering

@targos
Member Author

targos commented Mar 16, 2023

Do we need some kind of consensus to do this? @nodejs/build

@richardlau
Member

I'm +1 to trying User Agent filtering.

@targos
Member Author

targos commented Mar 16, 2023

Stats for that single IP:
[Screenshot: CleanShot 2023-03-16 at 14 02 39]

@ovflowd
Member

ovflowd commented Mar 16, 2023

These Artifactory requests are indeed problematic.

@ovflowd
Member

ovflowd commented Mar 16, 2023

Glad to hear we're tackling this ❤️

@nschonni
Member

Wouldn't Artifactory also be downloading the actual Node binaries on correctly configured instances? I'm not sure about blocking the user agent

@targos
Member Author

targos commented Mar 16, 2023

I don't know. Is Artifactory supposed to download Node binaries?

@nschonni
Member

If they're using it to proxy the downloads and headers like https://stackoverflow.com/a/32299168

@targos
Member Author

targos commented Mar 17, 2023

I enabled this rule:
[Screenshot: CleanShot 2023-03-17 at 09 28 17]

It should allow them to download binaries as they all have an extension (so a dot) in their paths.
I expect it to mitigate this issue a lot, since most npm package names don't contain a dot.

Edit: for those with CF access, you can monitor it here: https://dash.cloudflare.com/07be8d2fbc940503ca1be344714cb0d1/nodejs.org/security/events?rule-id=9dd43c487a9145a5af38fdbf45680b22
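
The exact rule is only visible in the screenshot, so the following is a rough sketch of the idea as described in the text: block a request when the user agent looks like Artifactory and the path contains no dot. The `^Artifactory/` user-agent pattern and the `blocked_by_dot_rule` name are assumptions, not the deployed rule.

```python
import re

# Assumed user-agent condition; the real expression was shown only as an image.
ARTIFACTORY_UA_RE = re.compile(r"^Artifactory/")

def blocked_by_dot_rule(user_agent, path):
    """Return True if the request would be blocked by the 'no dot in path' heuristic."""
    return bool(ARTIFACTORY_UA_RE.search(user_agent)) and "." not in path

agent = "Artifactory/7.41.7 74107900"
for path in (
    "/dist/yargs",                                 # bare package name, no dot
    "/dist/v8.9.3/node-v8.9.3-linux-x64.tar.gz",   # binary download, has dots
    "/dist/ms/-/ms-0.7.2.tgz",                     # npm tarball, also has dots
):
    print(path, blocked_by_dot_rule(agent, path))
```

Under this heuristic, any path that happens to contain a dot is let through, including package tarballs such as `ms-0.7.2.tgz`, so it reduces the traffic without eliminating it.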

@targos
Member Author

targos commented Mar 17, 2023

Looks like it works (already 10k blocked requests). There are still many non-blocked requests for ".jar" and ".xml", ".pom", ".module"... will need to tweak the rule for that.
We could use a pattern for the extension, allowing only those which we serve on /dist.

@targos
Member Author

targos commented Mar 17, 2023

[Screenshot: CleanShot 2023-03-17 at 09 53 38]
[Screenshot: CleanShot 2023-03-17 at 09 54 19]

@targos
Member Author

targos commented Mar 17, 2023

Changed the rule to:
[Screenshot: CleanShot 2023-03-17 at 16 21 24@2x]

Edit: added .json and .tab to the allowlist

@targos
Member Author

targos commented Mar 17, 2023

Also added AzureArtifacts/ and Gradle/ to the user agent regex

@ghost

ghost commented Mar 17, 2023

Is the expected error from this filter 403 or 404? We're now getting 404s on Artifactory trying to proxy nodejs.org/dist (for downloading node, not for NPM packages), but I'm also seeing 403 when trying to hit nodejs.org/dist directly from Gradle.

@dansisan

This is breaking all of our builds, which use Gradle, on GitHub Actions. An example HEAD request that's now a 403: "https://nodejs.org/dist/v8.9.3/node-v8.9.3-linux-x64.tar.gz". (It's a 200 w/o the User-Agent.) I don't think we've misconfigured anything?
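
A report like this can be checked by sending the same HEAD request with different User-Agent values and comparing status codes. The sketch below uses only the Python standard library; the helper names and any user-agent string passed in (for example `Gradle/7.6`) are hypothetical stand-ins, not values taken from the affected builds.

```python
import urllib.error
import urllib.request

def build_head_request(url, user_agent):
    """Build (but do not send) a HEAD request with an explicit User-Agent."""
    return urllib.request.Request(
        url, method="HEAD", headers={"User-Agent": user_agent}
    )

def head_status(url, user_agent, timeout=10.0):
    """Send the HEAD request and return the HTTP status code.

    urllib raises HTTPError for 4xx/5xx responses, so the code is read
    from the exception in that case.
    """
    request = build_head_request(url, user_agent)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code
```

Calling `head_status(url, "Gradle/7.6")` and `head_status(url, "curl/8.0")` for the URL quoted above would show whether the response depends on the user agent; the result depends on whatever rule is deployed at the time.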

@marclefrancois

+1 on the Gradle problem reported by @dansisan
Getting the same issue running a Gradle build

@dansisan

@targos Would you consider undoing this please? It's quite a disruption.

@targos
Member Author

targos commented Mar 17, 2023

We found that there is a bug in the Cloudflare firewall rule builder.
We manually updated the rule and this should now be fixed.
Gradle/ is still part of it, but shouldn't be affected anymore for legitimate requests.

Current rule:

(http.user_agent matches "^(Artifactory/|Nexus/|AzureArtifacts/|Gradle/)" and not http.request.uri.path matches "\.(gz|xz|pkg|7z|zip|msi|txt|asc|sig|lib|exe|json|tab)$")
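
To reason about which requests this expression blocks, it can be emulated locally. This is a sketch under the assumption that Cloudflare's `matches` operator behaves like an unanchored regular-expression search for these simple patterns; the `is_blocked` helper is illustrative and not part of any Cloudflare API.

```python
import re

# Both patterns are copied from the rule above.
BLOCKED_AGENT_RE = re.compile(r"^(Artifactory/|Nexus/|AzureArtifacts/|Gradle/)")
ALLOWED_EXT_RE = re.compile(
    r"\.(gz|xz|pkg|7z|zip|msi|txt|asc|sig|lib|exe|json|tab)$"
)

def is_blocked(user_agent, path):
    """Blocked when the agent matches AND the path lacks an allowed extension."""
    agent_matches = BLOCKED_AGENT_RE.search(user_agent) is not None
    path_allowed = ALLOWED_EXT_RE.search(path) is not None
    return agent_matches and not path_allowed

CASES = [
    # The Gradle version in the first case is a hypothetical example.
    ("Gradle/7.6", "/dist/v8.9.3/node-v8.9.3-linux-x64.tar.gz"),
    ("Artifactory/7.41.7 74107900", "/dist/yargs"),
    ("Artifactory/6.23.13", "/dist/ms/-/ms-0.7.2.tgz"),
]
for agent, path in CASES:
    print(agent, path, "blocked" if is_blocked(agent, path) else "allowed")
```

Note that `.tgz` is not on the allowlist: the pattern requires a literal dot immediately before `gz`, so `ms-0.7.2.tgz` does not match and stays blocked for these user agents.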

@ghost

ghost commented Mar 17, 2023

Yes, seems to be working for both Gradle and Artifactory now. Thanks!

@labkey-jeckels

> We found that there is a bug in the Cloudflare firewall rule builder. We manually updated the rule and this should now be fixed. Gradle/ is still part of it, but shouldn't be affected anymore for legitimate requests.
>
> Current rule:
>
> (http.user_agent matches "^(Artifactory/|Nexus/|AzureArtifacts/|Gradle/)" and not http.request.uri.path matches "\.(gz|xz|pkg|7z|zip|msi|txt|asc|sig|lib|exe|json|tab)$")

Thank you, @targos. Our Gradle-based build is looking better now too.

@marclefrancois

Thanks, fixed for us too!

@dansisan

@targos We're unblocked now too - thank you!

@TimothyTitan

Our Artifactory instance suddenly started receiving 403 responses for metadata requests. Were there any changes to the firewall rules to specifically block Artifactory?

Requests via nvm directly to nodejs.org/ and curl requests from the same servers are successful, so it appears that only requests from Artifactory are blocked.

@ovflowd
Member

ovflowd commented Jun 27, 2023

Hey, @TimothyTitan please refer to srs/gradle-node-plugin#127 as this is intended and not a bug.

This is pretty much (probably) because you have a misconfigured Artifactory installation. If you read the related issue it is explained there.

Thanks!

@tim-goto

tim-goto commented Jul 6, 2023

> We found that there is a bug in the Cloudflare firewall rule builder. We manually updated the rule and this should now be fixed. Gradle/ is still part of it, but shouldn't be affected anymore for legitimate requests.
>
> Current rule:
>
> (http.user_agent matches "^(Artifactory/|Nexus/|AzureArtifacts/|Gradle/)" and not http.request.uri.path matches "\.(gz|xz|pkg|7z|zip|msi|txt|asc|sig|lib|exe|json|tab)$")

could you please add tgz to the allowed matches?

@targos
Member Author

targos commented Jul 6, 2023

@tim-goto Can you please give an example of a legitimate URL that ends with tgz?

@francis-goto

@ovflowd
Member

ovflowd commented Jul 6, 2023

That is not a legit use case. Actually that’s exactly what we’re trying to block :)

@francis-goto

We're trying to proxy cache everything under https://nodejs.org/dist/ with Artifactory. The volta utility downloads node on every pipeline run.

@ovflowd
Member

ovflowd commented Jul 6, 2023

> We're trying to proxy cache everything under nodejs.org/dist with Artifactory. The volta utility downloads node on every pipeline run.

So the nodejs.org/dist is not a repository for NPM packages. It is just where our Node.js binaries live.

If you configured your Artifactory installation for using nodejs.org as a repository, that's incorrect and should not be done 🙇

@ovflowd
Member

ovflowd commented Jul 6, 2023

> That would be true for npm registry mirrors, but you can (and we have) configured Artifactory as a generic caching proxy, since our build machines do not have internet access but Artifactory does.

I'm not sure what your goals are with this comment. You can easily filter what Artifactory should be searching for (Through Regular Expressions, etc). Give a read on srs/gradle-node-plugin#127

@jmccann

jmccann commented Jul 6, 2023

> Hey, @TimothyTitan please refer to srs/gradle-node-plugin#127 as this is intended and not a bug.
>
> This is pretty much (probably) because you have a misconfigured Artifactory installation. If you read the related issue it is explained there.
>
> Thanks!

This comment didn't make much sense to me. @TimothyTitan commented on trying to mirror node binaries from https://nodejs.org (which exist at https://nodejs.org/dist/) and was directed to a Gradle plugin issue. I looked through the issue and wasn't sure how it related to mirroring https://nodejs.org/dist/ from an Artifactory instance. Sorry if it's obvious, but I could use more direction on what I'm looking for in that issue.

> We're trying to proxy cache everything under nodejs.org/dist with Artifactory. The volta utility downloads node on every pipeline run.
>
> So the nodejs.org/dist is not a repository for NPM packages. It is just where our Node.js binaries live.
>
> If you configured your Artifactory installation for using nodejs.org as a repository, that's incorrect and should not be done 🙇

This feels also like what @TimothyTitan's comment above was asking for, mirroring everything under https://nodejs.org/dist/ in Artifactory for node binaries.

Are you saying we should not mirror https://nodejs.org/dist/ internally with Artifactory? Or do you think the other person is trying to set up an NPM mirror using https://nodejs.org/dist/ (instead of a Node binaries mirror)? I think they are just trying to mirror Node binaries (like we are).

Our use case is basically https://sionwilliams.com/posts/2020-12-09-node-n-npm-mirror

This would allow us to reduce traffic hitting https://nodejs.org/dist/ directly by being able to use our internal mirror.

We had tried to use it, and it was working until, as @TimothyTitan pointed out, we started intermittently getting 403 responses back from https://nodejs.org/dist/. This causes the mirror to become unreliable. We've had to revert to having everything use https://nodejs.org/dist/ directly again.

@ovflowd
Member

ovflowd commented Jul 6, 2023

> This comment didn't make much sense to me. @TimothyTitan commented on trying to mirror node binaries from nodejs.org (which exist at nodejs.org/dist) and was directed to a Gradle plugin issue. I looked through the issue and wasn't sure how it related to mirroring nodejs.org/dist from an Artifactory instance. Sorry if it's obvious, but I could use more direction on what I'm looking for in that issue.

@francis-goto gave an example of an npm .tgz file, which I said they should not attempt to mirror. I have no knowledge of Artifactory or Volta, so I can't say what you should do for A, B, or C.

I've only stated that those tools should not be used to "scan" nodejs.org/dist for NPM packages by any means, only for Node binaries.

Sorry, but this issue is inappropriate for requesting support for either of those tools (Artifactory, Volta, or whatever 😅). (not saying you are, but it's just not the best place for that)

> Are you saying we should not mirror nodejs.org/dist internally with Artifactory? Or do you think the other person is trying to set up an NPM mirror using nodejs.org/dist (instead of a Node binaries mirror)? I think they are just trying to mirror Node binaries (like we are).

The second statement. Most people affected here are trying to set up NPM mirrors (accidentally, just because they followed StackOverflow, etc.).

> We had tried to use it, and it was working until, as @TimothyTitan pointed out, we started intermittently getting 403 responses back from nodejs.org/dist. This causes the mirror to become unreliable. We've had to revert to having everything use nodejs.org/dist directly again.

As mentioned, the current rules are there to enforce an intended behaviour. I'd recommend reaching out to Artifactory folks for a better understanding of how they internally attempt to "mirror" Node binaries.

We cannot, and will not, update the rules to allow .tgz files, as they are used almost exclusively for NPM packages. (The reasons are all documented in this issue.)

@ovflowd
Member

ovflowd commented Jul 6, 2023

I'm also sorry for not being of more help; the best I can recommend is to reach out to the maintainers of the tools that you use :)

@ovflowd
Member

ovflowd commented Jul 6, 2023

This issue can also be closed as is, since we've already implemented the rules to block Artifactory requests for NPM packages.

Fine-tuning the rules further for some exceptions (for example, Artifactory or another tool requesting npm-1.XXX.tgz on nodejs.org/dist) could be explored, but it's tricky. I also wouldn't recommend that anyone use NPM 1.0: it was released in 2011 and has no use today (I doubt it even works with the current NPM registry).

Also, support questions about downstream tools, such as Artifactory, Gradle and Volta, should be raised in their respective repositories. We have no control over what they do and why they do X.

We want to keep this issue from becoming a spam/off-topic nightmare.
