Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

URL parsing bug with http://example.com//foo/bar #391

Closed
taer opened this issue Mar 27, 2019 · 3 comments
Closed

URL parsing bug with http://example.com//foo/bar #391

taer opened this issue Mar 27, 2019 · 3 comments
Assignees
Labels

Comments

@taer
Copy link
Contributor

taer commented Mar 27, 2019

When a URL comes into Styx like this

https://test.homeaway.co.uk//www.homeaway.co.uk/p6873759, it gets messed up inside of Styx. The request path that pops out in the request.path()

path=/p6873759

because styx uses URI.create to parse the path from the HTTP first line path, and a //foo/bar as URI encodes foo as the host.

It's here in Styx's Url class

        public static Builder url(String value) {
            URI uri = URI.create(value);
            return new Builder()
                    .scheme(uri.getScheme())
                    .authority(uri.getAuthority())
                    .path(uri.getRawPath())
                    .rawQuery(uri.getRawQuery())
                    .fragment(uri.getFragment()); 

value here is just the contents from the first http line.

But, it also seems the downstream app is given the original request still. So origin will see path of //www.homeaway.co.uk/p6873759, while styx logged and treated as /p6873759

@mikkokar
Copy link
Contributor

mikkokar commented Mar 28, 2019

Summary:

A wrong path component is exposed to the plugins: /p6873759. It should be //www.homeaway.co.uk/p6873759 instead. The request is still correctly proxied in its origin form.

Other observations

Browsers would have parsed the URL and constructed an HTTP request before sending it to styx. This is how Styx sees is:

GET //www.homeaway.co.uk/p6873759 HTTP/1.1
Host: localhost:8080
User-Agent: curl/7.63.0
Accept: */*

Because //www.homeaway.co.uk/p6873759 starts with //, URI.create treats it as an authority part, and the rest as a path.

It is correct behaviour for Styx to proxy the request as //www.homeaway.co.uk/p6873759, but it would be incorrect to expose /p6873759 bit alone as a path component. Also it is unclear how it gets reassembled back to the correct form when proxied.

@mikkokar
Copy link
Contributor

Styx proxies the correct URL to origins because HttpRequestOperation.toNettyRequest uses url().toString() in sending the request. This presumably puts together both authority and path components, thus re-assembling the original URL.

Okay. We now need to figure out how to extract the correct path component for use in Styx URL.

@mikkokar mikkokar self-assigned this Mar 28, 2019
@mikkokar mikkokar added the bug label Mar 28, 2019
@mikkokar
Copy link
Contributor

Hi @taer. I have a fix in here: https://github.com/mikkokar/styx/tree/issue-391-url-parsing. Could you give it a try please?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants