This repository has been archived by the owner on Jan 8, 2020. It is now read-only.

Various Zend\Mvc\Router\Http routers turn + into a space in path segments #2952

Closed
gnicol wants to merge 2 commits


gnicol
Contributor

@gnicol gnicol commented Nov 13, 2012

A plus located in a path segment should be treated as a literal +.

Updating all http routers to use rawurlencode/rawurldecode to ensure this is the case.
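The difference the description relies on can be seen with PHP's built-in functions alone. This is a minimal sketch, not code from the pull request; the segment value `foo+bar` is borrowed from later in the thread and the variable names are illustrative.

```php
<?php
// A path segment as it might arrive in a request URI (illustrative value).
$segment = 'foo+bar';

// urldecode() applies form-encoding rules, so a raw + becomes a space.
$formDecoded = urldecode($segment);

// rawurldecode() only decodes percent-escapes, so a raw + stays a +.
$rawDecoded = rawurldecode($segment);

// Percent-encoded input decodes the same way under both functions.
$encodedPlus  = rawurldecode('foo%2Bbar');
$encodedSpace = rawurldecode('foo%20bar');

var_dump($formDecoded);  // string(7) "foo bar"
var_dump($rawDecoded);   // string(7) "foo+bar"
var_dump($encodedPlus);  // string(7) "foo+bar"
var_dump($encodedSpace); // string(7) "foo bar"
```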

Added tests to verify all characters are properly handled (including space) when present in either their raw or encoded forms.

parsing params, previously the use of urldecode
would erroneously turn + characters into spaces.
dealing with path segments. The previous use of urlencode/decode
would erroneously transform + characters into spaces.

Also updated tests with new checks to verify all characters which
don't absolutely require encoding pass through unchanged and that
encoding works correctly for all characters (even if they do not
strictly speaking require it).
@DASPRiD
Member

DASPRiD commented Nov 13, 2012

For the tests, was there a reason to not use the data providers?

@gnicol
Contributor Author

gnicol commented Nov 13, 2012

Yes, I wanted to add a test that the param foo+bar made it through ok. When done via the data provider, the assemble test generates foo%2Bbar, which causes it to fail because the output doesn't match the starting value. The match test actually works great (which is the one I wanted), but I see no way to skip assemble.

The assemble method uses rawurlencode, which is strict/conservative on output: anything other than A-Z a-z 0-9 -_.~ will be percent-encoded.
The match method uses rawurldecode, which is flexible on input: an unencoded + is quite acceptable here.
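The asymmetry that trips up the shared data provider can be sketched as below. The two helper functions are hypothetical stand-ins for the encoding and decoding step applied to a single parameter; they are not the routers' actual `assemble()` and `match()` methods.

```php
<?php
// Hypothetical stand-in for the encoding step on the assemble side.
function assembleParam($value)
{
    return rawurlencode($value);
}

// Hypothetical stand-in for the decoding step on the match side.
function matchParam($segment)
{
    return rawurldecode($segment);
}

$matched   = matchParam('foo+bar');     // raw + is accepted as-is
$assembled = assembleParam($matched);   // + is percent-encoded on output
$rematched = matchParam($assembled);    // decodes back to the same value

var_dump($matched);   // string(7) "foo+bar"
var_dump($assembled); // string(9) "foo%2Bbar"
var_dump($rematched); // string(7) "foo+bar"
```

The parameter value survives the round trip, but the assembled string differs from the original input, which is why a provider that compares the assembled URL with the starting path fails for the raw `+` case.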

I couldn't see a good way to fit testing the raw '+' case into the current provider; certainly open to suggestion though.

@DASPRiD
Member

DASPRiD commented Nov 13, 2012

So both rawurlencode() and urlencode() will encode the + character?

@gnicol
Contributor Author

gnicol commented Nov 13, 2012

Correct, both rawurlencode and urlencode encode the +.
The key difference is rawurldecode keeps a raw + as + whereas urldecode turns it into a space (erroneously in a path segment).
The segment router actually defines a $urlencodeCorrectionMap to turn characters such as !$ (and now +) back into their raw version after encoding. The other routers lack this functionality so they encode on the way out.
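A correction map of this kind could look roughly like the sketch below. The map contents are an assumption limited to the three characters named in this thread (the segment router's real map may contain other entries), and `encodeSegment` is a hypothetical helper name, not a method of the router.

```php
<?php
// Illustrative map only: percent-escapes to turn back into raw characters.
$correctionMap = array(
    '%21' => '!',
    '%24' => '$',
    '%2B' => '+',
);

// Hypothetical helper: encode strictly, then undo the escapes listed in the map.
function encodeSegment($value, array $map)
{
    return strtr(rawurlencode($value), $map);
}

$encoded = encodeSegment('a+b!c$d e', $correctionMap);

var_dump($encoded); // string(11) "a+b!c$d%20e"
```

Characters absent from the map, such as the space here, stay percent-encoded, so the output remains safe while avoiding unnecessary escapes.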

If this could be pushed into a shared location it would make sense for the other routers, but I wasn't clear where to shove it over to; possibly creating a filter would make sense.

For our use case, simply having the router accept a raw + as input and not turn it into a space was all that's needed. If the system puts out %2B instead of + on output, it isn't ideal but is no worse than the existing handling of ! or $ for non-segment routers. Forgiving on input and strict on output is a pretty classic approach.

@ghost ghost assigned weierophinney Nov 19, 2012
weierophinney added a commit that referenced this pull request Nov 19, 2012