Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

user-agent header control #37

Closed
annevk opened this issue Apr 5, 2015 · 43 comments
Closed

user-agent header control #37

annevk opened this issue Apr 5, 2015 · 43 comments

Comments

@annevk
Copy link
Member

annevk commented Apr 5, 2015

Should fetch() set user-agent by default? Allow appending bytes? Allow replacing it? Allow it to be omitted?

See w3c/ServiceWorker#348 (comment) for context.

And why is it on the forbidden header list? It has been since forever, but is there strong rationale?

@hallvors
Copy link

hallvors commented Apr 5, 2015

I think we 'inherited' User-Agent from a list of headers Flash disallowed changing in its implementation. There was some speculation that some services might exist that only allowed specific customized browsers to access them, using UA string as a sort of auth token, but I don't think anybody ever found a real-life case of such a system. (Obviously it would be astonishingly poor design).

Personally, I think we can and should drop User-Agent from the list of headers you're not allowed to set.

@annevk
Copy link
Member Author

annevk commented Apr 5, 2015

If we allow setting it to any value and effectively replace anything the user agent would set we should probably make it a parameter of RequestInit as the supplied headers argument combines with existing headers (and never replaces). So e.g.

fetch(url, {userAgent: "Hi"})

@hallvors
Copy link

hallvors commented Apr 5, 2015

(It might be worth noting that GitHub itself recommends User-Agent set to your GH user name in API requests - this is of course trivial if you write Python/Node/whatever clients, but not currently possible from an API consumer running in a browser. So there's a big and important use case right here.)

@domenic
Copy link
Member

domenic commented Apr 5, 2015

If we allow setting it to any value and effectively replace anything the user agent would set we should probably make it a parameter of RequestInit as the supplied headers argument combines with existing headers (and never replaces). So e.g.

Hmm, I'm not sure I necessarily agree. It seems a straightforward extension to say that headers provided by the RequestInit overwrite default headers. What do you think would be confusing about allowing User-Agent to be set through the normal mechanisms?

@dgraham
Copy link

dgraham commented Apr 5, 2015

GitHub itself recommends User-Agent set to your GH user name in API requests

https://developer.github.com/v3/#user-agent-required

This helps us contact application owners when there are problems, like a rogue script making lots of API calls. However, many API resources are only available to authenticated sessions, so the user account is already known. And when accessing the API through a browser, the CORS request includes the Origin header.

The User-Agent header guidance in the API is helpful, but it's not necessarily an example for or against this capability being added to fetch.

@dgraham
Copy link

dgraham commented Apr 5, 2015

If User-Agent is allowed to be assigned, I think the expected usage would be to pass it in with the rest of the headers for the request.

fetch('/users', {
  headers: {
    'User-Agent': 'Web/2.0',
    'Accept': 'application/json'
  }
})

@annevk
Copy link
Member Author

annevk commented Apr 5, 2015

I guess that might be okay as well. We could have a step in https://fetch.spec.whatwg.org/#http-network-fetch that appends it if it is not already present in HTTPRequest's header list.

@annevk
Copy link
Member Author

annevk commented Apr 6, 2015

So actually, the reason you want it to be an argument rather than part of headers is so you can force it to be omitted.

@annevk
Copy link
Member Author

annevk commented Apr 6, 2015

Probable rationale for why it is currently forbidden is in this email by @sicking: https://lists.w3.org/Archives/Public/public-webapi/2008May/0456.html

@domenic
Copy link
Member

domenic commented Apr 6, 2015

So actually, the reason you want it to be an argument rather than part of headers is so you can force it to be omitted.

Can you expand on this? I don't quite understand why one location or the other would matter for this, or why you would force it to be omitted.

@annevk
Copy link
Member Author

annevk commented Apr 6, 2015

If we offer customization, I think it would make sense to also offer omitting it altogether e.g. to reduce the size of the request or debug a server. The location matters since by default it is included and the Headers class is not an instruction set but rather a list of headers included in the request.

@domenic
Copy link
Member

domenic commented Apr 6, 2015

Oh, I see!

Would it be conceivable not to set the header by default at all? Hmm, probably not very good, I'm being too Node-influenced here...

So how would you omit it? omitUserAgent: true? userAgent: null?

@annevk
Copy link
Member Author

annevk commented Apr 6, 2015

null was my idea, since undefined already means doing the default, which is to include it and seems highly unlikely to change.

@domenic
Copy link
Member

domenic commented Apr 6, 2015

Why would userAgent: null be good, but 'User-Agent': null not be good?

Related question: when you do const h = new Headers(), does h.get("User-Agent") return something? Why or why not?

@annevk
Copy link
Member Author

annevk commented Apr 6, 2015

"User-Agent": null === "User-Agent": "null". new Headers() creates an empty multimap, it is not populated in any way.

@domenic
Copy link
Member

domenic commented Apr 6, 2015

Right, so why is userAgent: null !== userAgent: "null"? This seems kind of WTFWebPlatform-ish to me.

It looks like from my reading of the spec there's no actual way to tell what User-Agent header gets sent then (currently)? E.g. const r = new Request(...); r.headers.get("User-Agent") will return undefined? That seems worrying.

One last point before I go to sleep: I think what's a bit tricky for me here is that the most straightforward mental model for the programmer is that in the constructor, there's some kind of

this.headers = mergeMultimaps(defaultHeaders, passedHeaders);

That is, just naively looking at the shape of the API, and knowing the bit of extra information that there are more headers sent in the request/response than you set, I think most programmers will guess that there's a set of default headers and you can set new headers or override old ones by using the headers option. So, the deviations we make from that model need to be done cautiously.

For example in that model I think if passedHeaders contained "User-Agent": undefined, that would override the default "User-Agent": "Mozilla/5.0 (Windows NT 6.3; WOW64; rv:37.0) Gecko/20100101 Firefox/37.0".

It seems like that is not the model though, as evidenced by the r.headers.get("User-Agent") === undefined. Rather, r.headers is a straight copy of the passed-in headers, and then the actual fetch uses a hidden set of headers which are compiled based on r.headers + other knowledge (possibly from RequestInit, also from the cache, also from the user-agent string, etc.)

@annevk
Copy link
Member Author

annevk commented Apr 6, 2015

Basically the model is that the network layer appends a set of headers that are not in control of the developer. And that headers.set() stringifies its arguments and validates them against the HTTP header syntax seems totally natural and not at all weird.

@sicking
Copy link

sicking commented Apr 6, 2015

FWIW, if that email from me is really the reason that we don't allow setting the user-agent, then I believe that that's fixable.

If we simply treat user-agent as a "custom" header, then things seem fine with me. I.e. we can let it be set completely by the page for same-origin requests. For cross-origin requests we can allow it to be set, but it'd cause a preflight, and would require the server to send a "access-control-allow-headers: user-agent" header in the response.

I don't have opinions about how to deal with getting the default value or removing the header completely.

@annevk
Copy link
Member Author

annevk commented Apr 7, 2015

Given the requirements and existing constraints I see a couple of options:

  1. User-Agent is passed in as part of headers. We provide a distinct omitUserAgent boolean option for removing any User-Agent headers from the request in the network layer. omitUserAgent is also put on Request.prototype. The network layer only appends User-Agent header with a default value if no User-Agent header is present.
  2. userAgent is an option that is undefined, null, or a header value. undefined indicates the network layer inserts a User-Agent header with a default value. null means the network layer does nothing. A header value means the network layer inserts User-Agent header with that value.

2 seems a little cleaner to me, but I don't care strongly.

@mathiasbynens
Copy link
Member

@cure53 comments:

Will XSS via User-Agent string become a reality thanks to fetch()?

@cure53
Copy link

cure53 commented Apr 7, 2015

Despite the discussion having progressed a lot already, do you consider XSS or CRSF-to-XSS via UA string in scope for the spec? Or would that be something, the implementers have to take care of on their own?

P.S. @mathiasbynens You are faster than light :D

@cure53
Copy link

cure53 commented Apr 7, 2015

One note about my context maybe:

Right now, for penetration tests, we use malformed UA strings to aim for persistent XSS or Intranet XSS. Giving an attacker control over the UA string via fetch() opens the door to abuse that in a CSRF scenario and beyondd. Not sure if that is a good idea. Thus my question, if you consider that to be in scope or not.

@annevk
Copy link
Member Author

annevk commented Apr 7, 2015

The header can only be set for fetches that are same-origin or subject to CORS (with a preflight where the server needs to opt into User-Agent), same as all developer-set headers. If that still enables attacks I would love to see an example.

@cure53
Copy link

cure53 commented Apr 7, 2015

@annevk So, to quote @sicking here:

For cross-origin requests we can allow it to be set, but it'd cause a preflight, and would require the server to send a "access-control-allow-headers"

If that means, the request will not even be sent with the modified UA in case of cross-origin requests and a failed CORS preflight, then spec-wise this should be fine.

@cure53
Copy link

cure53 commented Apr 7, 2015

@annevk Alright, I had a closer look at how fetch() is implemented in Blink at the moment.

If the custom UA string header is in fact being implemented the same way as any other custom header - meaning that the particular header has to be permitted via CORS for cross-origin requests, then this should indeed be safe! No objections from my side so far.

@steike
Copy link

steike commented Apr 7, 2015

There is value in having a reliable user-agent header. Historically we've had some browser bugs where it was possible to protect the user server-side, but where the fix would be too expensive (in terms of cost, perf or user annoyance) to apply to all users.

More commonly, a new browser feature might allow some feature to be reimplemented in a more secure way; with a reliable user-agent header, it's easy to disable the "unsafe" back-compat implementation in decent browsers.

If a malicious script can lie about the browser version, protecting against such attacks becomes a lot harder.

(The fact that this is same-origin-or-CORS-only helps a lot, of course, but not in the case where the particular browser bug is that the browser is confused about what counts as the same origin...)

Would a solution where the user data is either appended or prepended to the 'responsible' user agent be an acceptable compromise?

@annevk
Copy link
Member Author

annevk commented Apr 7, 2015

If the browser is confused about same-origin that would be a much bigger problems than setting User-Agent.

@steike
Copy link

steike commented Apr 7, 2015

Wait, I think you got that backwards ("if the zombies attack, you have bigger problems than your broken shotgun"). I'm not worried about the user agent as an attack vector; I want to keep using it for defense.

@cure53
Copy link

cure53 commented Apr 7, 2015

@steike I believe, with this feature, you have even more possibilities to use the UA string header as a defensive feature. Differently, yes - but more powerful too.

@annevk
Copy link
Member Author

annevk commented Apr 7, 2015

@steike you already ceded that same-origin-or-CORS helps and your counter argument to that was bogus. If a browser is confused about same-origin that would be a high priority security bug.

@steike
Copy link

steike commented Apr 7, 2015

Yes, a high-pri bug. That doesn't mean it wouldn't take the vendor months to fix it, or that users would upgrade instantly once the fix was out.

Let's say that 10% of users have browsers that are vulnerable to a bug. Let's say when this particular vulnerability is exploited, there is a detectable header anomaly. Let's say 1% of all users happen to be behind various crappy firewalls that introduce the same anomaly for legitimate requests.

With a working user-agent header, we can block the attack by telling 0.1% of users that they must upgrade their browser before they can use the site. Without, the choice would be to force-upgrade 10% of users, outright block 1% of users, or hope no one finds the bug.

...

It's not the end of the world, of course. We can add a few bits to some cookie; it'll just be one more layer of web cruft to carry around. You asked for a rationale for keeping a working UA header. What I have is "those of us who need it will have to reimplement the feature if you take it away". To be fair there's probably not that many of us. If the benefit of User-Agent over X-Requested-With is great, we can live with it.

@annevk
Copy link
Member Author

annevk commented Apr 7, 2015

@steike I don't understand your attack scenario. The User-Agent retains its default value for the majority of fetches. It is only fetches that go through fetch() that are potentially impacted and those are under control of the site.

And cross-origin resources need to explicitly opt-in to allowing User-Agent headers with non-default values, but your attack does not seem to concern those.

@igrigorik
Copy link
Member

userAgent is an option that is undefined, null, or a header value. undefined indicates the network layer inserts a User-Agent header with a default value. null means the network layer does nothing. A header value means the network layer inserts User-Agent header with that value.

+1 to this route. I agree with @domenic's comment on "mergeMultimaps" mental model (#37 (comment)) as being developer friendly and most intuitive.

@sicking
Copy link

sicking commented Apr 8, 2015

Whatever syntax we end up using, we should make it very explicit that "removing" the user-agent header should from a CORS point of view be equivalent to setting it. So it would still require server opt-in. Adding a note to this effect might increase the chances that the browser actually tests for this case.

@mnot
Copy link
Member

mnot commented Apr 9, 2015

.02 - it sure would be nice if UA could be appended to, so you can add "mywidget/1.0" or "(test 3)" for example, rather than blowing the entire thing away.

Syntax here:
http://httpwg.github.io/specs/rfc7231.html#header.user-agent

@domenic
Copy link
Member

domenic commented Apr 9, 2015

navigator.userAgent + 'my string'

@annevk
Copy link
Member Author

annevk commented May 7, 2015

@sicking I don't understand why omitting the header altogether is a cause for concern. It wasn't when we stopped sending Referer.

@annevk
Copy link
Member Author

annevk commented Jul 17, 2015

I'm inclined to define 1 in #37 (comment) but leave the omitUserAgent feature out for now since that is harder to tackle (unless @sicking is wrong and it should not require explicit opt-in to omit it).

@annevk annevk closed this as completed in dab09b0 Jul 29, 2015
@annevk
Copy link
Member Author

annevk commented Jul 29, 2015

If we need omitUserAgent please file a new issue. Ideally reach out to user agents for their security requirements, since they're not entirely clear to me.

@wanderview
Copy link
Member

@benwa
Copy link

benwa commented Jan 6, 2016

chromium bug: https://crbug.com/571722

@dandv
Copy link

dandv commented Jun 17, 2020

Is User-Agent still forbidden after dab09b0 ?

@annevk
Copy link
Member Author

annevk commented Jun 18, 2020

It's not as per this change.

qtprojectorg pushed a commit to qt/qtdeclarative that referenced this issue Oct 6, 2020
The Fetch spec has allowed it for a while (in other words,
it's no longer forbidden):

* https://fetch.spec.whatwg.org/#terminology-headers
* https://developer.mozilla.org/en-US/docs/Glossary/Forbidden_header_name

Cf. also

* whatwg/fetch#37
* whatwg/fetch@dab09b0

[ChangeLog][QtQml][XmlHttpRequest] It is now possible to set the
User-Agent header.

Change-Id: I1d5bd785223e9df2883011f873d440a63e363a24
Reviewed-by: Qt CI Bot <qt_ci_bot@qt-project.org>
Reviewed-by: Ulf Hermann <ulf.hermann@qt.io>
Reviewed-by: Timur Pocheptsov <timur.pocheptsov@qt.io>
Reviewed-by: Fabian Kosmale <fabian.kosmale@qt.io>
andreubotella added a commit to web-platform-tests/wpt that referenced this issue Mar 31, 2023
`User-Agent` used to be a forbidden request header until it was
removed in whatwg/fetch#37. However, this was never added as a WPT
test, and Chrome still treats it as forbidden. This change adds that
test.
andreubotella added a commit to web-platform-tests/wpt that referenced this issue Mar 31, 2023
)

`User-Agent` used to be a forbidden request header until it was
removed in whatwg/fetch#37. However, this was never added as a WPT
test, and Chrome still treats it as forbidden. This change adds that
test.
cookiecrook pushed a commit to cookiecrook/wpt that referenced this issue Apr 8, 2023
…-platform-tests#39301)

`User-Agent` used to be a forbidden request header until it was
removed in whatwg/fetch#37. However, this was never added as a WPT
test, and Chrome still treats it as forbidden. This change adds that
test.
moz-v2v-gh pushed a commit to mozilla/gecko-dev that referenced this issue Apr 15, 2023
…forbidden request header, a=testonly

Automatic update from web-platform-tests
[fetch] Test that `User-Agent` is not a forbidden request header (#39301)

`User-Agent` used to be a forbidden request header until it was
removed in whatwg/fetch#37. However, this was never added as a WPT
test, and Chrome still treats it as forbidden. This change adds that
test.
--

wpt-commits: 55ea64f9c5c0a073bfda1bb1b3343c0048258171
wpt-pr: 39301
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests