Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bug: URI encoding issue #5654

Closed
hozaifaaoud opened this issue Nov 30, 2021 · 30 comments
Closed

bug: URI encoding issue #5654

hozaifaaoud opened this issue Nov 30, 2021 · 30 comments
Labels

Comments

@hozaifaaoud
Copy link

Issue description

I created a route
/api/v1/groups/*
and made a URI Override
Regexp: /api(/v1/groups/.*)
now, I am trying this url
127.0.0.1:9080/api/v1/groups/
and facing this issue
HTTP Status 400 – Bad Request
submitted request
https://mydomain.com/api/v1/groups/%3CNumber%3E
forwarded request
127.0.0.1:9080/api/v1/groups/

Environment

  • apisix version (cmd: apisix version): 2.5
  • OS (cmd: uname -a): 60-Ubuntu SMP
  • OpenResty / Nginx version (cmd: nginx -V or openresty -V): openresty/1.19.9.1
  • etcd version, if have (cmd: run curl http://127.0.0.1:9090/v1/server_info to get the info from server-info API): "3.4.0"
  • apisix-dashboard version, if have:
  • the plugin runner version, if the issue is about a plugin runner (cmd: depended on the kind of runner):
  • luarocks version, if the issue is about installation (cmd: luarocks --version): 3.4.0

Steps to reproduce

//

Actual result

HTTP Status 400 – Bad Request

Error log

//

Expected result

No response

@tzssangglass
Copy link
Member

pls show the config of route

@hozaifaaoud
Copy link
Author

{
"uris": [
"/api/v1/groups/"
],
"name": "by_groupId",
"desc": "by group by id",
"methods": [
"GET"
],
"plugins": {
"key-auth": {
"disable": false
},
"proxy-rewrite": {
"regex_uri": [
"\/api(\/v1\/groups\/.
)",
"$1"
]
}
},
"service_id": "379565504387351559",
"upstream_id": "379565438486447111",
"labels": {
"API_VERSION": "1.0"
},
"status": 1
}

@tzssangglass
Copy link
Member

Based on this configuration, what do you want to achieve and what are the expected results?

I created a route
/api/v1/groups/*
and made a URI Override
Regexp: /api(/v1/groups/.*)
now, I am trying this url
127.0.0.1:9080/api/v1/groups/
and facing this issue
HTTP Status 400 – Bad Request
submitted request
mydomain.com/api/v1/groups/%3CNumber%3E
forwarded request
127.0.0.1:9080/api/v1/groups/

there is some confusion here, I cannot got it.

@hozaifaaoud
Copy link
Author

I am facing this error
HTTP Status 400 – Bad Request
while trying this
https://mydomain.com/api/v1/groups/%3CNumber%3E

@yasser-mas2
Copy link

yasser-mas2 commented Dec 2, 2021

@tzssangglass Simply we are trying to reroute an URL to an internal server by replacing ( deleting ) endpoint prefix

E.g /api/groups/... -> my.internal.server/groups/....

so we have used redirect plugin by matching regex_uri /api(/v1/groups/.*) => replaced with $1 group1 ), but it seems that redirect` plugin is decoding URL to match the regex but not encoding it again

I'm calling this endpoint /api/groups/<Number> so HTTP client ( Postman ) encoded the <> characters to be /api/groups/%3CNumber%3E, so I'm assuming that APISIX gateway plugin to remove /api from the request and forward to /groups/%3CNumber%3E but actually it forward the request to the correct internal server but decoding the encoded characters so the request submitted to the internal service as /groups/<Number> not /groups/%3CNumber%3E

@tokers
Copy link
Contributor

tokers commented Dec 3, 2021

{ "uris": [ "/api/v1/groups/" ], "name": "by_groupId", "desc": "by group by id", "methods": [ "GET" ], "plugins": { "key-auth": { "disable": false }, "proxy-rewrite": { "regex_uri": [ "/api(/v1/groups/.)", "$1" ] } }, "service_id": "379565504387351559", "upstream_id": "379565438486447111", "labels": { "API_VERSION": "1.0" }, "status": 1 }

The regex pattern is \/api(\/v1\/groups\/._, is this correct?

@tzssangglass
Copy link
Member

Here are the steps I took to reproduce it on master branch

  1. backend is an openresty server, nginx.conf is like
master_process on;

worker_processes 2;

error_log logs/error.log warn;
pid logs/nginx.pid;

worker_rlimit_nofile 20480;

events {
    accept_mutex off;
    worker_connections 10620;
}

worker_shutdown_timeout 3;

http {
    server {
        listen 1980;

        access_log off;
        location / {
            content_by_lua_block {
                ngx.say("uri: ", ngx.var.uri)
            }
        }
    }
}
  1. route config of APISIX
curl --location --request PUT 'http://127.0.0.1:9080/apisix/admin/routes/1' \
--header 'X-API-KEY: edd1c9f034335f136f87ad84b625c8f1' \
--header 'Content-Type: application/json' \
--data-raw '{
    "plugins": {
        "proxy-rewrite": {
            "regex_uri": ["^/api/(.*)","/$1"]
        }
    },
    "upstream": {
        "nodes": {
            "127.0.0.1:1980": 1
        },
        "type": "roundrobin"
    },
    "uri": "/api/groups/*"
}'
  1. test
$curl http://127.0.0.1:9080/api/groups/%3CNumber%3E
uri: /groups/<Number>

Is this what you want?

@tzssangglass
Copy link
Member

@yasser-mas2
Copy link

yasser-mas2 commented Dec 3, 2021

Here are the steps I took to reproduce it on master branch

  1. backend is an openresty server, nginx.conf is like
master_process on;

worker_processes 2;

error_log logs/error.log warn;
pid logs/nginx.pid;

worker_rlimit_nofile 20480;

events {
    accept_mutex off;
    worker_connections 10620;
}

worker_shutdown_timeout 3;

http {
    server {
        listen 1980;

        access_log off;
        location / {
            content_by_lua_block {
                ngx.say("uri: ", ngx.var.uri)
            }
        }
    }
}
  1. route config of APISIX
curl --location --request PUT 'http://127.0.0.1:9080/apisix/admin/routes/1' \
--header 'X-API-KEY: edd1c9f034335f136f87ad84b625c8f1' \
--header 'Content-Type: application/json' \
--data-raw '{
    "plugins": {
        "proxy-rewrite": {
            "regex_uri": ["^/api/(.*)","/$1"]
        }
    },
    "upstream": {
        "nodes": {
            "127.0.0.1:1980": 1
        },
        "type": "roundrobin"
    },
    "uri": "/api/groups/*"
}'
  1. test
$curl http://127.0.0.1:9080/api/groups/%3CNumber%3E
uri: /groups/<Number>

Is this what you want?

No this is not, we already managed to strip the route prefix by using the regex.

The issue is the gateway server decoding the route and not encoding it again, what I want is when submitting this request
$curl http://127.0.0.1:9080/api/groups/%3CNumber%3E

I need to get this

uri: /groups/%3CNumber%3E

NOT that:
uri: /groups/<Number>

@yasser-mas2
Copy link

{ "uris": [ "/api/v1/groups/" ], "name": "by_groupId", "desc": "by group by id", "methods": [ "GET" ], "plugins": { "key-auth": { "disable": false }, "proxy-rewrite": { "regex_uri": [ "/api(/v1/groups/.)", "$1" ] } }, "service_id": "379565504387351559", "upstream_id": "379565438486447111", "labels": { "API_VERSION": "1.0" }, "status": 1 }

The regex pattern is \/api(\/v1\/groups\/._, is this correct?

This is the regex /api(/v1/groups/.*), and it should be replaced by $1 ( group 1 ).

Regex is matched and the url redirected to the proper server, but url is decoded!! Not encoded, and my application server is rejecting such characters <>[]{}|\

@tzssangglass
Copy link
Member

ok, I've had a look around and I don't think it's a bug.

Nginx will escape the " ", "#", "%", "&", "+", "?" and the %00-%1F and %7F-%FF to follow rfc3986#section-2.2.

so I recommend not using %3C in the uri.

refer to: https://juejin.cn/post/6844903512485134343#heading-5

cc @tokers

@ting-xu
Copy link

ting-xu commented Jan 4, 2022

ok, I've had a look around and I don't think it's a bug.

Nginx will escape the " ", "#", "%", "&", "+", "?" and the %00-%1F and %7F-%FF to follow rfc3986#section-2.2.

so I recommend not using %3C in the uri.

refer to: https://juejin.cn/post/6844903512485134343#heading-5

cc @tokers

No this is not, we already managed to strip the route prefix by using the regex.

The issue is the gateway server decoding the route and not encoding it again, what I want is when submitting this request $curl http://127.0.0.1:9080/api/groups/%3CNumber%3E

I need to get this

uri: /groups/%3CNumber%3E

NOT that: uri: /groups/<Number>

I think it's a bug.

I also had this issue when using apisix to route request to backend gerrit code-review server.
The JS in browser will send request like http://gerrit.server/changes/repo-group%2Freop-name~100/detail?O=....
when apisix forward to gerrit, it becomes http://gerrit.server/changes/repo-group/reop-name~100/detail?O=....
cause gerrit cannot handle correctly and return 404

Instead, when I use nginx to route, things are ok, the %2F is not auto decoded and keep original.

@tzssangglass
Copy link
Member

cc @tokers @spacewander PLAT

@zxdvd
Copy link

zxdvd commented Mar 29, 2022

I also got this problem while trying to proxy gerrit and I found a lot of people got this. For nginx, there is workaround that using request_uri since it is not normalized by nginx. Like following from stackoverflow

location /path/ {
  if ($request_uri ~* "/path/(.*)") {
    proxy_pass http://tracking/webapp/$1;
  }
}

So how to achieve this using apisix? And it is possible to have a switch about the normalization in the future?

@tokers
Copy link
Contributor

tokers commented Mar 30, 2022

< > are not reserved characters so they should be decoded, if you want to reserve %3C and %3E, encoding them twice before sending requests to APISIX.

That's to say, using %253C%0A and %253E%0A.

@cloorc
Copy link

cloorc commented Apr 26, 2022

< > are not reserved characters so they should be decoded, if you want to reserve %3C and %3E, encoding them twice before sending requests to APISIX.

That's to say, using %253C%0A and %253E%0A.

I think the APISIX should encode the special character '%' before decoding it to keep the special '%' in its original format.

@tokers
Copy link
Contributor

tokers commented Apr 26, 2022

< > are not reserved characters so they should be decoded, if you want to reserve %3C and %3E, encoding them twice before sending requests to APISIX.
That's to say, using %253C%0A and %253E%0A.

I think the APISIX should encode the special character '%' before decoding it to keep the special '%' in its original format.

Why APISIX?

@cloorc
Copy link

cloorc commented May 2, 2022

@tokers Cause APISIX will decode it but the encoding work should be respected cause the APISIX is the middle-man not the final-man.

@tzssangglass
Copy link
Member

cc @spacewander @membphis, do you have any options about this?

@tokers
Copy link
Contributor

tokers commented May 4, 2022

@tokers Cause APISIX will decode it but the encoding work should be respected cause the APISIX is the middle-man not the final-man.

This behavior breaks the client's behavior.

@spacewander
Copy link
Member

spacewander commented May 4, 2022

cc @spacewander @membphis, do you have any options about this?

People want APISIX to use a specific URL escape rule to solve some edge problems, like whether <> should be escaped.
However, the escape rule is vary across different platforms. If we follow one of the rules, we need to change the way to escape URL instead of using Nginx's official way. But this change is neither easy nor compatible with the current behavior.

@cloorc
Copy link

cloorc commented May 5, 2022

I'm not sure what circumastances will break the client's behavior. But I think APISIX should at least leave a chance to intercept the request or to reencode it back to its original format.

@tokers Cause APISIX will decode it but the encoding work should be respected cause the APISIX is the middle-man not the final-man.

This behavior breaks the client's behavior.

@cloorc
Copy link

cloorc commented May 5, 2022

cc @spacewander @membphis, do you have any options about this?

People want APISIX to use a specific URL escape rule to solve some edge problems, like whether <> should be escaped. However, the escape rule is vary across different platforms. If we follow one of the rules, we need to change the way to escape URL instead of using Nginx's official way. But this change is neither easy nor compatible with the current behavior.

It's unfair to ask APISIX to use specific escape rule to solve some edge problems, but how about leaving the original request along with its orignal style and forwarding it to the backend? I think the middle-steps do whatever might be necessary.

@tokers
Copy link
Contributor

tokers commented May 6, 2022

cc @spacewander @membphis, do you have any options about this?

People want APISIX to use a specific URL escape rule to solve some edge problems, like whether <> should be escaped. However, the escape rule is vary across different platforms. If we follow one of the rules, we need to change the way to escape URL instead of using Nginx's official way. But this change is neither easy nor compatible with the current behavior.

It's unfair to ask APISIX to use specific escape rule to solve some edge problems, but how about leaving the original request along with its orignal style and forwarding it to the backend? I think the middle-steps do whatever might be necessary.

I still cannot agree it, this makes the URI handling more complicated and error-prone. Just like @spacewander said, the URI encoding/decoding rules are not so standard across platforms.

@membphis
Copy link
Member

membphis commented May 6, 2022

I agree with @tokers and @spacewander 's opinion

@spacewander
Copy link
Member

Consider solved. Feel free to reopen it if needed.

@cloorc
Copy link

cloorc commented May 10, 2022

cc @spacewander @membphis, do you have any options about this?

People want APISIX to use a specific URL escape rule to solve some edge problems, like whether <> should be escaped. However, the escape rule is vary across different platforms. If we follow one of the rules, we need to change the way to escape URL instead of using Nginx's official way. But this change is neither easy nor compatible with the current behavior.

It's unfair to ask APISIX to use specific escape rule to solve some edge problems, but how about leaving the original request along with its orignal style and forwarding it to the backend? I think the middle-steps do whatever might be necessary.

I still cannot agree it, this makes the URI handling more complicated and error-prone. Just like @spacewander said, the URI encoding/decoding rules are not so standard across platforms.

Then, why should APISIX decode the encoded-paramters instead of keeping it as origin?
APISIX can decode whatever it needed but please pass the original parameters to the backend finally. This is enough. APISIX can just put the original query parameters into somewhere else and get it back before passing them to the backend services.

@cloorc
Copy link

cloorc commented May 10, 2022

In my circumastance, I have to setup a nginx-ingress service to handle this kind of requests. This is embarrassed to APISIX.

@spacewander
Copy link
Member

@cloorc
It is unfriendly and unwise to say these unpleasant things to the maintainers. This will make the maintainers determine to never support it.
You can add an option to

upstream_uri = core.utils.uri_safe_encode(sub_str(upstream_uri, 1, index-1)) ..
and bypass the encoding. But make sure the rewrite will not introduce unsafe char.

@cloorc
Copy link

cloorc commented May 18, 2022

@cloorc It is unfriendly and unwise to say these unpleasant things to the maintainers. This will make the maintainers determine to never support it. You can add an option to

upstream_uri = core.utils.uri_safe_encode(sub_str(upstream_uri, 1, index-1)) ..

and bypass the encoding. But make sure the rewrite will not introduce unsafe char.

Thanks! I agree it looks unfriendly but it is not my real purpose. If it makes some guys uncomfortable, I should appologize anyway.

Back to this issue, I have to post my real consideration, although it might be a little fierce.

BTW, no threats should to anyone, both maintainers and users and anyone cared about this project.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

10 participants