`user-agent` header control #37

annevk · 2015-04-05T11:05:15Z

Should fetch() set user-agent by default? Allow appending bytes? Allow replacing it? Allow it to be omitted?

See w3c/ServiceWorker#348 (comment) for context.

And why is it on the forbidden header list? It has been since forever, but is there strong rationale?

hallvors · 2015-04-05T15:46:09Z

I think we 'inherited' User-Agent from a list of headers Flash disallowed changing in its implementation. There was some speculation that some services might exist that only allowed specific customized browsers to access them, using UA string as a sort of auth token, but I don't think anybody ever found a real-life case of such a system. (Obviously it would be astonishingly poor design).

Personally, I think we can and should drop User-Agent from the list of headers you're not allowed to set.

annevk · 2015-04-05T16:20:04Z

If we allow setting it to any value and effectively replace anything the user agent would set we should probably make it a parameter of RequestInit as the supplied headers argument combines with existing headers (and never replaces). So e.g.

fetch(url, {userAgent: "Hi"})

hallvors · 2015-04-05T18:16:48Z

(It might be worth noting that GitHub itself recommends User-Agent set to your GH user name in API requests - this is of course trivial if you write Python/Node/whatever clients, but not currently possible from an API consumer running in a browser. So there's a big and important use case right here.)

domenic · 2015-04-05T18:23:37Z

> If we allow setting it to any value and effectively replace anything the user agent would set we should probably make it a parameter of RequestInit as the supplied headers argument combines with existing headers (and never replaces). So e.g.

Hmm, I'm not sure I necessarily agree. It seems a straightforward extension to say that headers provided by the RequestInit overwrite default headers. What do you think would be confusing about allowing User-Agent to be set through the normal mechanisms?

dgraham · 2015-04-05T18:38:19Z

> GitHub itself recommends User-Agent set to your GH user name in API requests

https://developer.github.com/v3/#user-agent-required

This helps us contact application owners when there are problems, like a rogue script making lots of API calls. However, many API resources are only available to authenticated sessions, so the user account is already known. And when accessing the API through a browser, the CORS request includes the Origin header.

The User-Agent header guidance in the API is helpful, but it's not necessarily an example for or against this capability being added to fetch.

dgraham · 2015-04-05T18:52:04Z

If User-Agent is allowed to be assigned, I think the expected usage would be to pass it in with the rest of the headers for the request.

fetch('/users', {
  headers: {
    'User-Agent': 'Web/2.0',
    'Accept': 'application/json'
  }
})

annevk · 2015-04-05T19:00:41Z

I guess that might be okay as well. We could have a step in https://fetch.spec.whatwg.org/#http-network-fetch that appends it if it is not already present in HTTPRequest's header list.
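For illustration, the "append it if it is not already present" step could be sketched roughly as below. This is only a model of the idea, not spec text: `withDefaultUserAgent` and the `ExampleBrowser/1.0` value are hypothetical names made up for the sketch.

```javascript
// Hypothetical model of the proposed network-layer step; not spec text.
const DEFAULT_USER_AGENT = "ExampleBrowser/1.0"; // assumed placeholder value

// headerList is an array of [name, value] pairs, like a request's header list.
function withDefaultUserAgent(headerList) {
  // Header names are case-insensitive, so compare in lowercase.
  const hasUserAgent = headerList.some(
    ([name]) => name.toLowerCase() === "user-agent"
  );
  // Return a copy so the caller's list is never mutated.
  return hasUserAgent
    ? [...headerList]
    : [...headerList, ["User-Agent", DEFAULT_USER_AGENT]];
}
```

With this shape, a developer-supplied `User-Agent` wins and the default is only a fallback.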

annevk · 2015-04-06T05:34:25Z

So actually, the reason you want it to be an argument rather than part of headers is so you can force it to be omitted.

annevk · 2015-04-06T06:21:26Z

Probable rationale for why it is currently forbidden is in this email by @sicking: https://lists.w3.org/Archives/Public/public-webapi/2008May/0456.html

domenic · 2015-04-06T06:55:50Z

> So actually, the reason you want it to be an argument rather than part of headers is so you can force it to be omitted.

Can you expand on this? I don't quite understand why one location or the other would matter for this, or why you would force it to be omitted.

annevk · 2015-04-06T07:01:51Z

If we offer customization, I think it would make sense to also offer omitting it altogether e.g. to reduce the size of the request or debug a server. The location matters since by default it is included and the Headers class is not an instruction set but rather a list of headers included in the request.

domenic · 2015-04-06T07:07:58Z

Oh, I see!

Would it be conceivable not to set the header by default at all? Hmm, probably not very good, I'm being too Node-influenced here...

So how would you omit it? omitUserAgent: true? userAgent: null?

annevk · 2015-04-06T07:09:27Z

null was my idea, since undefined already means doing the default, which is to include it and seems highly unlikely to change.

domenic · 2015-04-06T07:10:56Z

Why would userAgent: null be good, but 'User-Agent': null not be good?

Related question: when you do const h = new Headers(), does h.get("User-Agent") return something? Why or why not?

annevk · 2015-04-06T07:14:12Z

"User-Agent": null === "User-Agent": "null". new Headers() creates an empty multimap, it is not populated in any way.
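Both points can be checked directly in any runtime that ships the Fetch API's `Headers` class (browsers, or Node.js 18 and later, which is an assumption about the environment). The variable names are just for the example.

```javascript
// Requires a global Headers class (browsers, Node.js 18+).
const empty = new Headers();
// Nothing is pre-populated, so the lookup finds no header.
const emptyLookup = empty.get("User-Agent");

// Header values are converted to strings, so null becomes the text "null"
// rather than an instruction to omit the header.
const withNull = new Headers({ "User-Agent": null });
const nullLookup = withNull.get("User-Agent");

console.log(emptyLookup, JSON.stringify(nullLookup));
```

Note that `get()` reports a missing header as `null`, which is why `null` cannot double as an "omit this header" signal inside `headers`.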

domenic · 2015-04-06T07:20:19Z

Right, so why is userAgent: null !== userAgent: "null"? This seems kind of WTFWebPlatform-ish to me.

It looks like from my reading of the spec there's no actual way to tell what User-Agent header gets sent then (currently)? E.g. const r = new Request(...); r.headers.get("User-Agent") will return undefined? That seems worrying.

One last point before I go to sleep: I think what's a bit tricky for me here is that the most straightforward mental model for the programmer is that in the constructor, there's some kind of

this.headers = mergeMultimaps(defaultHeaders, passedHeaders);

That is, just naively looking at the shape of the API, and knowing the bit of extra information that there are more headers sent in the request/response than you set, I think most programmers will guess that there's a set of default headers and you can set new headers or override old ones by using the headers option. So, the deviations we make from that model need to be done cautiously.

For example in that model I think if passedHeaders contained "User-Agent": undefined, that would override the default "User-Agent": "Mozilla/5.0 (Windows NT 6.3; WOW64; rv:37.0) Gecko/20100101 Firefox/37.0".

It seems like that is not the model though, as evidenced by the r.headers.get("User-Agent") === undefined. Rather, r.headers is a straight copy of the passed-in headers, and then the actual fetch uses a hidden set of headers which are compiled based on r.headers + other knowledge (possibly from RequestInit, also from the cache, also from the user-agent string, etc.)
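To make the two models concrete, here is a rough sketch. It is simplified to single-valued maps rather than true multimaps, and `defaultHeaders`, `mergeModel`, and `visibleModel` are hypothetical names with placeholder values, not anything from the spec.

```javascript
// Hypothetical defaults; a real browser's values differ.
const defaultHeaders = new Map([
  ["user-agent", "ExampleBrowser/1.0"],
  ["accept", "*/*"],
]);

// Model A: the "mergeMultimaps" intuition. Passed headers override the
// defaults, and the merged result is what the developer would inspect.
function mergeModel(passedHeaders) {
  const merged = new Map(defaultHeaders);
  for (const [name, value] of passedHeaders) {
    merged.set(name.toLowerCase(), value);
  }
  return merged;
}

// Model B: the visible list is only a copy of what was passed in.
// Defaults are added later by the network layer, out of the developer's view.
function visibleModel(passedHeaders) {
  const visible = new Map();
  for (const [name, value] of passedHeaders) {
    visible.set(name.toLowerCase(), value);
  }
  return visible;
}
```

Under model A a lookup of `user-agent` always finds something; under model B it finds nothing unless the developer set it.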

annevk · 2015-04-06T07:24:25Z

Basically the model is that the network layer appends a set of headers that are not in control of the developer. And that headers.set() stringifies its arguments and validates them against the HTTP header syntax seems totally natural and not at all weird.

sicking · 2015-04-06T10:06:15Z

FWIW, if that email from me is really the reason that we don't allow setting the user-agent, then I believe that that's fixable.

If we simply treat user-agent as a "custom" header, then things seem fine with me. I.e. we can let it be set completely by the page for same-origin requests. For cross-origin requests we can allow it to be set, but it'd cause a preflight, and would require the server to send an "access-control-allow-headers: user-agent" header in the response.
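As a rough sketch of that rule (heavily simplified: it ignores the `*` wildcard, credentials, and everything else a real preflight checks), the decision could look like this. `customUserAgentAllowed` is a hypothetical helper, not a browser API.

```javascript
// Two URLs are same-origin when scheme, host, and port all match.
function isSameOrigin(pageUrl, targetUrl) {
  return new URL(pageUrl).origin === new URL(targetUrl).origin;
}

// allowHeadersValue is the Access-Control-Allow-Headers value from the
// preflight response, or null if the server sent none.
function customUserAgentAllowed(pageUrl, targetUrl, allowHeadersValue) {
  // Same-origin: the page may set the header freely, no preflight involved.
  if (isSameOrigin(pageUrl, targetUrl)) return true;
  // Cross-origin: the server must opt in by listing the header name.
  const allowed = (allowHeadersValue ?? "")
    .split(",")
    .map((name) => name.trim().toLowerCase());
  return allowed.includes("user-agent");
}
```

For example, `customUserAgentAllowed("https://a.example/", "https://b.example/api", "Content-Type, User-Agent")` is true, while the same call with `null` is false.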

I don't have opinions about how to deal with getting the default value or removing the header completely.

annevk · 2015-04-07T10:02:58Z

Given the requirements and existing constraints I see a couple of options:

1. User-Agent is passed in as part of headers. We provide a distinct omitUserAgent boolean option for removing any User-Agent headers from the request in the network layer. omitUserAgent is also put on Request.prototype. The network layer only appends User-Agent header with a default value if no User-Agent header is present.
2. userAgent is an option that is undefined, null, or a header value. undefined indicates the network layer inserts a User-Agent header with a default value. null means the network layer does nothing. A header value means the network layer inserts User-Agent header with that value.

2 seems a little cleaner to me, but I don't care strongly.
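To illustrate the three-way `userAgent` option (the second of the two options), a minimal sketch of how the network layer might interpret it. `resolveUserAgentHeader` and the default value are hypothetical; no such option exists in fetch().

```javascript
// Assumed placeholder for whatever the browser would send by default.
const DEFAULT_UA_VALUE = "ExampleBrowser/1.0";

// Returns the [name, value] pair to send, or null to send no User-Agent.
function resolveUserAgentHeader(userAgentOption) {
  // undefined: the developer said nothing, so use the default.
  if (userAgentOption === undefined) return ["User-Agent", DEFAULT_UA_VALUE];
  // null: explicit request to omit the header.
  if (userAgentOption === null) return null;
  // Anything else is treated as the header value.
  return ["User-Agent", String(userAgentOption)];
}
```

The distinction between `undefined` and `null` is what lets omission be expressed without overloading the `headers` list.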

mathiasbynens · 2015-04-07T11:12:49Z

@cure53 comments:

> Will XSS via User-Agent string become a reality thanks to fetch()?

cure53 · 2015-04-07T11:13:35Z

Despite the discussion having progressed a lot already, do you consider XSS or CSRF-to-XSS via UA string in scope for the spec? Or would that be something the implementers have to take care of on their own?

P.S. @mathiasbynens You are faster than light :D

cure53 · 2015-04-07T11:17:05Z

One note about my context maybe:

Right now, for penetration tests, we use malformed UA strings to aim for persistent XSS or Intranet XSS. Giving an attacker control over the UA string via fetch() opens the door to abuse that in a CSRF scenario and beyond. Not sure if that is a good idea. Thus my question, if you consider that to be in scope or not.

annevk · 2015-04-07T11:39:24Z

The header can only be set for fetches that are same-origin or subject to CORS (with a preflight where the server needs to opt into User-Agent), same as all developer-set headers. If that still enables attacks I would love to see an example.

cure53 · 2015-04-07T11:46:30Z

@annevk So, to quote @sicking here:

> For cross-origin requests we can allow it to be set, but it'd cause a preflight, and would require the server to send a "access-control-allow-headers"

If that means, the request will not even be sent with the modified UA in case of cross-origin requests and a failed CORS preflight, then spec-wise this should be fine.

cure53 · 2015-04-07T12:03:24Z

@annevk Alright, I had a closer look at how fetch() is implemented in Blink at the moment.

If the custom UA string header is in fact being implemented the same way as any other custom header - meaning that the particular header has to be permitted via CORS for cross-origin requests, then this should indeed be safe! No objections from my side so far.

steike · 2015-04-07T12:24:31Z

There is value in having a reliable user-agent header. Historically we've had some browser bugs where it was possible to protect the user server-side, but where the fix would be too expensive (in terms of cost, perf or user annoyance) to apply to all users.

More commonly, a new browser feature might allow some feature to be reimplemented in a more secure way; with a reliable user-agent header, it's easy to disable the "unsafe" back-compat implementation in decent browsers.

If a malicious script can lie about the browser version, protecting against such attacks becomes a lot harder.

(The fact that this is same-origin-or-CORS-only helps a lot, of course, but not in the case where the particular browser bug is that the browser is confused about what counts as the same origin...)

Would a solution where the user data is either appended or prepended to the 'responsible' user agent be an acceptable compromise?

annevk · 2015-04-07T13:00:04Z

If the browser is confused about same-origin that would be a much bigger problem than setting User-Agent.

steike · 2015-04-07T13:44:05Z

Wait, I think you got that backwards ("if the zombies attack, you have bigger problems than your broken shotgun"). I'm not worried about the user agent as an attack vector; I want to keep using it for defense.

cure53 · 2015-04-07T13:49:00Z

@steike I believe, with this feature, you have even more possibilities to use the UA string header as a defensive feature. Differently, yes - but more powerful too.

annevk · 2015-04-07T14:05:11Z

@steike you already ceded that same-origin-or-CORS helps and your counter argument to that was bogus. If a browser is confused about same-origin that would be a high priority security bug.

steike · 2015-04-07T14:59:47Z

Yes, a high-pri bug. That doesn't mean it wouldn't take the vendor months to fix it, or that users would upgrade instantly once the fix was out.

Let's say that 10% of users have browsers that are vulnerable to a bug. Let's say when this particular vulnerability is exploited, there is a detectable header anomaly. Let's say 1% of all users happen to be behind various crappy firewalls that introduce the same anomaly for legitimate requests.

With a working user-agent header, we can block the attack by telling 0.1% of users that they must upgrade their browser before they can use the site. Without, the choice would be to force-upgrade 10% of users, outright block 1% of users, or hope no one finds the bug.
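The 0.1% figure follows if the two groups are assumed to be independent (an assumption the numbers above imply but do not state). A quick check in whole users per 10,000, with illustrative variable names:

```javascript
// Work in users per 10,000 so every step stays in integers.
const population = 10000;
// 10% of users run the vulnerable browser.
const vulnerable = (population * 10) / 100;
// 1% of all users sit behind a firewall that causes the same header anomaly.
const anomalous = (population * 1) / 100;
// Assuming independence, 1% of the vulnerable users also show the anomaly.
const vulnerableAndAnomalous = (vulnerable * 1) / 100;
// Express that overlap as a percentage of all users.
const blockedPercent = (vulnerableAndAnomalous * 100) / population;

console.log({ vulnerable, anomalous, vulnerableAndAnomalous, blockedPercent });
```

So blocking on "vulnerable browser and anomaly" affects 10 users in 10,000, against 1,000 for a forced upgrade or 100 for blocking on the anomaly alone.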

...

It's not the end of the world, of course. We can add a few bits to some cookie; it'll just be one more layer of web cruft to carry around. You asked for a rationale for keeping a working UA header. What I have is "those of us who need it will have to reimplement the feature if you take it away". To be fair there's probably not that many of us. If the benefit of User-Agent over X-Requested-With is great, we can live with it.

annevk · 2015-04-07T15:10:10Z

@steike I don't understand your attack scenario. The User-Agent retains its default value for the majority of fetches. It is only fetches that go through fetch() that are potentially impacted and those are under control of the site.

And cross-origin resources need to explicitly opt-in to allowing User-Agent headers with non-default values, but your attack does not seem to concern those.

igrigorik · 2015-04-07T17:36:59Z

> userAgent is an option that is undefined, null, or a header value. undefined indicates the network layer inserts a User-Agent header with a default value. null means the network layer does nothing. A header value means the network layer inserts User-Agent header with that value.

+1 to this route. I agree with @domenic's comment on "mergeMultimaps" mental model (#37 (comment)) as being developer friendly and most intuitive.

sicking · 2015-04-08T05:26:21Z

Whatever syntax we end up using, we should make it very explicit that "removing" the user-agent header should from a CORS point of view be equivalent to setting it. So it would still require server opt-in. Adding a note to this effect might increase the chances that the browser actually tests for this case.

mnot · 2015-04-09T00:24:31Z

.02 - it sure would be nice if UA could be appended to, so you can add "mywidget/1.0" or "(test 3)" for example, rather than blowing the entire thing away.

Syntax here:
http://httpwg.github.io/specs/rfc7231.html#header.user-agent

domenic · 2015-04-09T00:42:44Z

navigator.userAgent + 'my string'
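A slightly more careful sketch of the same idea: RFC 7231 separates products in User-Agent with whitespace, so the appended piece needs a leading space and should be a valid `product/version` token pair. `appendProduct` is a hypothetical helper, and the base string is passed in rather than read from `navigator.userAgent` so the snippet also runs outside a browser.

```javascript
// Matches an HTTP "token" as defined in RFC 7230 (one or more tchar).
const HTTP_TOKEN = /^[!#$%&'*+\-.^_`|~0-9A-Za-z]+$/;

// Appends "product/version" to an existing User-Agent value.
function appendProduct(baseUserAgent, product, version) {
  if (!HTTP_TOKEN.test(product) || !HTTP_TOKEN.test(version)) {
    throw new TypeError("product and version must be HTTP tokens");
  }
  // A single space separates the new product from the existing value.
  return `${baseUserAgent} ${product}/${version}`;
}
```

In a page this might be called as `appendProduct(navigator.userAgent, "mywidget", "1.0")`, with the result passed in the `User-Agent` header.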

annevk · 2015-05-07T12:16:12Z

@sicking I don't understand why omitting the header altogether is a cause for concern. It wasn't when we stopped sending Referer.

annevk · 2015-07-17T10:45:42Z

I'm inclined to define 1 in #37 (comment) but leave the omitUserAgent feature out for now since that is harder to tackle (unless @sicking is wrong and it should not require explicit opt-in to omit it).

annevk · 2015-07-29T13:39:17Z

If we need omitUserAgent please file a new issue. Ideally reach out to user agents for their security requirements, since they're not entirely clear to me.

wanderview · 2015-07-29T16:04:44Z

gecko bug: https://bugzilla.mozilla.org/show_bug.cgi?id=1188932

benwa · 2016-01-06T18:31:38Z

chromium bug: https://crbug.com/571722

dandv · 2020-06-17T22:49:02Z

Is User-Agent still forbidden after dab09b0 ?

annevk · 2020-06-18T05:22:29Z

It's not as per this change.

The Fetch spec has allowed it for a while (in other words, it's no longer forbidden):

* https://fetch.spec.whatwg.org/#terminology-headers
* https://developer.mozilla.org/en-US/docs/Glossary/Forbidden_header_name

Cf. also

* whatwg/fetch#37
* whatwg/fetch@dab09b0

[ChangeLog][QtQml][XmlHttpRequest] It is now possible to set the User-Agent header.

Change-Id: I1d5bd785223e9df2883011f873d440a63e363a24
Reviewed-by: Qt CI Bot <qt_ci_bot@qt-project.org>
Reviewed-by: Ulf Hermann <ulf.hermann@qt.io>
Reviewed-by: Timur Pocheptsov <timur.pocheptsov@qt.io>
Reviewed-by: Fabian Kosmale <fabian.kosmale@qt.io>

`User-Agent` used to be a forbidden request header until it was removed in whatwg/fetch#37. However, this was never added as a WPT test, and Chrome still treats it as forbidden. This change adds that test.

…forbidden request header, a=testonly Automatic update from web-platform-tests [fetch] Test that `User-Agent` is not a forbidden request header (#39301) `User-Agent` used to be a forbidden request header until it was removed in whatwg/fetch#37. However, this was never added as a WPT test, and Chrome still treats it as forbidden. This change adds that test. -- wpt-commits: 55ea64f9c5c0a073bfda1bb1b3343c0048258171 wpt-pr: 39301

annevk mentioned this issue Apr 5, 2015

User-Agent header control w3c/ServiceWorker#399

Closed

igrigorik mentioned this issue Apr 7, 2015

Initializing context/content specific fetch defaults #43

Closed

annevk closed this as completed in dab09b0 Jul 29, 2015

annevk mentioned this issue Feb 20, 2017

User-Agent can now be set, update XMLHttpRequest and fetch() tests as appropriate web-platform-tests/wpt#2042

Closed

flaki mentioned this issue Mar 18, 2018

GitHub API does not include user-agent in Access-Control-Allow-Headers octokit/octokit.js#817

Closed

lidel mentioned this issue Nov 27, 2018

Add user-agent to default list of Access-Control-Allow-Headers ipfs/kubo#5138

Closed

dandv mentioned this issue Jun 17, 2020

Is user-agent supported? lquixada/cross-fetch#65

Closed

johnboxall mentioned this issue Dec 9, 2021

Custom User-Agent used in browser produces warning in Chrome SalesforceCommerceCloud/commerce-sdk-isomorphic#60

Closed

andreubotella mentioned this issue Mar 31, 2023

[fetch] Test that User-Agent is not a forbidden request header web-platform-tests/wpt#39301

Merged

achingbrain mentioned this issue May 19, 2023

Helia identifies itself to the network ipfs/helia#122

Closed
