DECODE-URL and url! syntax don't obey the url encoding rules #1644

rebolbot · 2010-09-02T18:59:41Z

Submitted by: BrianH

Hex-encoded characters in URLs are supposed to stay encoded until the URL is broken into its component parts. This does not happen correctly with REBOL URLs: they are decoded too early. This causes a problem when one or more of the component parts contain characters that would be mistaken for structural characters. This happens pretty often nowadays, since many sites use a full email address as a user name or allow passwords with less restricted character sets.

This can probably be solved by fixing DECODE-URL. Note: This problem also exists in R2, and could use a similar solution.

>> http://user%40rebol.com:blah@www.rebol.com/
== http://user@rebol.com:blah@www.rebol.com/
; should be http://user%40rebol.com:blah@www.rebol.com/
>> decode-url http://user%40rebol.com:blah@www.rebol.com/
== [scheme: 'http user: "user" host: "rebol.com" path: ":blah@www.rebol.com/"]
; should be [scheme: 'http pass: "blah" user: "user@rebol.com" host: "www.rebol.com" path: "/"]
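
The order of operations being asked for (split on the structural characters first, hex-decode each component afterwards) can be sketched roughly as follows. This is a Python illustration rather than REBOL code, and `split_then_decode` is a hypothetical helper, not DECODE-URL itself. It assumes the simple `scheme://user:pass@host/path` shape used in this ticket and is not a full URL parser.

```python
from urllib.parse import unquote

def split_then_decode(url):
    """Split a URL while it is still encoded, then decode user and pass.

    Hypothetical helper for illustration only: it handles
    scheme://[user[:pass]@]host[/path] and nothing more.
    """
    scheme, rest = url.split("://", 1)
    # The authority ends at the first "/"; the remainder is the path.
    slash = rest.find("/")
    if slash == -1:
        authority, path = rest, ""
    else:
        authority, path = rest[:slash], rest[slash:]
    result = {"scheme": scheme}
    if "@" in authority:
        # While the text is still encoded, "@" and ":" can only be delimiters.
        userinfo, host = authority.split("@", 1)
        user, sep, password = userinfo.partition(":")
        result["user"] = unquote(user)
        if sep:
            result["pass"] = unquote(password)
    else:
        host = authority
    result["host"] = host
    result["path"] = path  # deliberately left encoded
    return result

url = "http://user%40rebol.com:blah@www.rebol.com/"
late = split_then_decode(url)            # decode after splitting
early = split_then_decode(unquote(url))  # decode before splitting
print(late)
print(early)
```

With late decoding the user comes out as `user@rebol.com` and the host as `www.rebol.com`. Decoding the whole URL first makes the helper split at the wrong `@`, so the user is truncated to `user` and the password is lost, which is the same kind of failure as the DECODE-URL result above.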

CC - Data [ Version: alpha 97 Type: Bug Platform: All Category: Mezzanine Reproduce: Always Fixed-in: none ]

rebolbot · 2010-09-02T19:20:49Z

Submitted by: BrianH

Gabriele wrote a correct parser here: http://www.rebol.it/power-mezz/parsers/uri-parser.html

Perhaps we can adapt that to R3 style, or at least analyze the code to get the correct rules to follow. We can't use it directly because it decodes the URLs too far, decoding the path and fragment identifier when we don't want that in this case. But it shouldn't be too hard to adapt, as long as it doesn't use too much of the rest of the Power Mezz stuff.

rebolbot · 2010-09-07T10:03:20Z

Submitted by: meijeru

The following tickets also apply: #482, #1327, #1333.

rebolbot · 2010-09-21T02:29:14Z

Submitted by: Carl

This does not seem like a hex decoding problem, but more like a field decoding problem in DECODE-URL. For instance, in the example above, I could have used the @ directly, without the %40.

rebolbot · 2010-09-23T09:50:55Z

Submitted by: BrianH

But what if the hex was %2F or some other character? All characters need to be allowed when specified in hex form.
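
The %2F case can be shown with a small sketch. Again this is Python used purely as illustration, `authority_of` is a hypothetical helper, and the URL with `pa%2Fss` as the password is made up for the example. Once the slash is decoded, nothing distinguishes it from the slash that ends the host part.

```python
from urllib.parse import unquote

def authority_of(url):
    """Return the text between '://' and the first following '/'."""
    rest = url.split("://", 1)[1]
    return rest.split("/", 1)[0]

# Hypothetical URL whose password contains an encoded slash.
encoded = "http://user:pa%2Fss@www.rebol.com/"

kept = authority_of(encoded)            # split while still encoded
lost = authority_of(unquote(encoded))   # decoded too early, then split

print(kept)  # user:pa%2Fss@www.rebol.com
print(lost)  # user:pa
```

Unlike the @ case, no field-level heuristic can recover from this: after early decoding the host and the rest of the password have already moved into the path.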

rebolbot · 2013-04-05T02:56:55Z

Submitted by: Ladislav

"This can probably be solved by fixing DECODE-URL." - this provably can't be solved by "fixing" DECODE-URL in any way.

rebolbot · 2013-04-05T03:09:48Z

Submitted by: BrianH

Agreed, the internal treatment of url! values by LOAD and the other url! manipulation functions would need to be changed first, because url! values are getting corrupted long before DECODE-URL ever sees them. Once those other functions are fixed, and possibly a new internal model for the url! type is chosen (see #2014), then DECODE-URL can be fixed to work on the new url! data model.

rebolbot added Type.bug Status.important labels Jan 12, 2016

This was referenced Jan 12, 2016

TO-URL returns incorrect value for % symbol #2009

Closed

LOAD does not handle correctly URLs containing encoded delimiters #2011

Open

MOLD does not handle correctly URLs containing encoded characters #2012

Open

IngoHohmann mentioned this issue Feb 5, 2020

Semicolons should be legal in URL #2381

Open

Siskin-Bot mentioned this issue Feb 15, 2020

DECODE-URL and url! syntax don't obey the url encoding rules Oldes/Rebol-issues#1644

Closed
