Skip to content
This repository has been archived by the owner on Apr 12, 2024. It is now read-only.

URL_REGEXP is not accurate #13528

Closed
IlanFrumer opened this issue Dec 13, 2015 · 7 comments
Closed

URL_REGEXP is not accurate #13528

IlanFrumer opened this issue Dec 13, 2015 · 7 comments

Comments

@IlanFrumer
Copy link

@andrewaustin @petebacondarwin

This commit broke the URL validation: ffb6b2f

For example this URL (which uses a curly brackets macro syntax and should be a perfectly valid URL):

http://ads.xxxxx.com/xxxxxx/?pid=1234&pos=c&r={TIMESTAMP}

console

Also check this
http://stackoverflow.com/questions/5984116/regular-expression-for-w3c-compliant-urls

@andrewaustin
Copy link
Contributor

{ and } are not allowed in URIs unless they are URL encoded. You can read RFC 3986 for the full rules about URI characters: https://tools.ietf.org/html/rfc3986#section-2 . They are allowed in IRIs though it seems.

RFC 3987 is even more explicit about URIs vs IRIs and these characters:

Systems accepting IRIs MAY also deal with the printable characters in US-ASCII that are not allowed in URIs, namely "<", ">", '"', space, "{", "}", "|", "", "^", and "`", in step 2 above.

So I guess it's a question of should we be validating URLs as URIs (RFC 3986) or IRIs (RFC 3986 - where { and } are allowed).

@IlanFrumer
Copy link
Author

@andrewaustin thanks for the clarification
Since we are validating a user input I think it is more clear to use IRI rather than forcing the user to escape some characters.
Also (as i said) this syntax was valid before this commit, so if we choose URI we must add it to the breaking changes list.

Thanks.

@petebacondarwin
Copy link
Contributor

We will make a decision about this before RC.1

@Narretz
Copy link
Contributor

Narretz commented Dec 14, 2015

We've been following Chrome in the past in terms of what urls are allowed (not implementation). Both Chrome and Firefox allow curly brackets, so I think we should too.

@gkalpak
Copy link
Member

gkalpak commented Dec 14, 2015

AFAIK, we've never actually followed Chrome in terms of URL validity.
Before ffb6b2f, there was a bug in URL_REGEXP that allowed all non-whitespace characters in hostname (i.e. it never matched port, path, query, hash - it considered it all to be one, big, valid hostname), so the right half of the RegExp never affected anything.

After ffb6b2f and e4bb838 (which built on top of the former), we try to more closely match the URI spec. Browsers seems to be much more forgiving - maybe we should be too (i.e. be more concerned with validating the structure, than the actual characters).

@gkalpak gkalpak self-assigned this Dec 14, 2015
gkalpak added a commit to gkalpak/angular.js that referenced this issue Dec 15, 2015
Background:
Prior to ffb6b2f, there was a bug in `URL_REGEXP`, trying to match the hostname as `\S+` (meaning
any non-space character). This resulted in never actually validating the structure of the URL (e.g.
segments such as port, path, query, fragment).
Then ffb6b2f and subsequently e4bb838 fixed that bug, but revealed `URL_REGEXP`'s "strictness" wrt
certain parts of the URL.

Since browsers are too lenient when it comes to URL validation, this commit relaxes the "strictness"
of `URL_REGEXP`, focusing more on the general structure, than on the specific characters allowed in
each segment.

Note 1: `URL_REGEXP` still seems to be stricter than browsers in some cases.
Note 2: Browsers don't always agree on what is a valid URL and what isn't.

Fixes angular#13528
gkalpak added a commit to gkalpak/angular.js that referenced this issue Dec 15, 2015
Background:
Prior to ffb6b2f, there was a bug in `URL_REGEXP`, trying to match the hostname as `\S+` (meaning
any non-space character). This resulted in never actually validating the structure of the URL (e.g.
segments such as port, path, query, fragment).
Then ffb6b2f and subsequently e4bb838 fixed that bug, but revealed `URL_REGEXP`'s "strictness" wrt
certain parts of the URL.

Since browsers are too lenient when it comes to URL validation, this commit relaxes the "strictness"
of `URL_REGEXP`, focusing more on the general structure, than on the specific characters allowed in
each segment.

Note 1: `URL_REGEXP` still seems to be stricter than browsers in some cases.
Note 2: Browsers don't always agree on what is a valid URL and what isn't.

Fixes angular#13528
gkalpak added a commit to gkalpak/angular.js that referenced this issue Dec 15, 2015
Background:
Prior to ffb6b2f, there was a bug in `URL_REGEXP`, trying to match the hostname as `\S+` (meaning
any non-space character). This resulted in never actually validating the structure of the URL (e.g.
segments such as port, path, query, fragment).
Then ffb6b2f and subsequently e4bb838 fixed that bug, but revealed `URL_REGEXP`'s "strictness" wrt
certain parts of the URL.

Since browsers are too lenient when it comes to URL validation, this commit relaxes the "strictness"
of `URL_REGEXP`, focusing more on the general structure, than on the specific characters allowed in
each segment.

Note 1: `URL_REGEXP` still seems to be stricter than browsers in some cases.
Note 2: Browsers don't always agree on what is a valid URL and what isn't.

Fixes angular#13528
gkalpak added a commit to gkalpak/angular.js that referenced this issue Dec 15, 2015
Background:
Prior to ffb6b2f, there was a bug in `URL_REGEXP`, trying to match the hostname as `\S+` (meaning
any non-space character). This resulted in never actually validating the structure of the URL (e.g.
segments such as port, path, query, fragment).
Then ffb6b2f and subsequently e4bb838 fixed that bug, but revealed `URL_REGEXP`'s "strictness" wrt
certain parts of the URL.

Since browsers are too lenient when it comes to URL validation, this commit relaxes the "strictness"
of `URL_REGEXP`, focusing more on the general structure, than on the specific characters allowed in
each segment.

Note 1: `URL_REGEXP` still seems to be stricter than browsers in some cases.
Note 2: Browsers don't always agree on what is a valid URL and what isn't.

Fixes angular#13528
gkalpak added a commit to gkalpak/angular.js that referenced this issue Dec 15, 2015
Background:
Prior to ffb6b2f, there was a bug in `URL_REGEXP`, trying to match the hostname as `\S+` (meaning
any non-space character). This resulted in never actually validating the structure of the URL (e.g.
segments such as port, path, query, fragment).
Then ffb6b2f and subsequently e4bb838 fixed that bug, but revealed `URL_REGEXP`'s "strictness" wrt
certain parts of the URL.

Since browsers are too lenient when it comes to URL validation anyway, it doesn't make sense for
Angular to be much stricter, so this commit relaxes the "strictness" of `URL_REGEXP`, focusing more
on the general structure, than on the specific characters allowed in each segment.

Note 1: `URL_REGEXP` still seems to be stricter than browsers in some cases.
Note 2: Browsers don't always agree on what is a valid URL and what isn't.

Fixes angular#13528
gkalpak added a commit to gkalpak/angular.js that referenced this issue Dec 15, 2015
Background:
Prior to ffb6b2f, there was a bug in `URL_REGEXP`, trying to match the hostname as `\S+` (meaning
any non-space character). This resulted in never actually validating the structure of the URL (e.g.
segments such as port, path, query, fragment).
Then ffb6b2f and subsequently e4bb838 fixed that bug, but revealed `URL_REGEXP`'s "strictness" wrt
certain parts of the URL.

Since browsers are too lenient when it comes to URL validation anyway, it doesn't make sense for
Angular to be much stricter, so this commit relaxes the "strictness" of `URL_REGEXP`, focusing more
on the general structure, than on the specific characters allowed in each segment.

Note 1: `URL_REGEXP` still seems to be stricter than browsers in some cases.
Note 2: Browsers don't always agree on what is a valid URL and what isn't.

Fixes angular#13528
gkalpak added a commit that referenced this issue Dec 16, 2015
Background:
Prior to ffb6b2f, there was a bug in `URL_REGEXP`, trying to match the hostname as `\S+` (meaning
any non-space character). This resulted in never actually validating the structure of the URL (e.g.
segments such as port, path, query, fragment).
Then ffb6b2f and subsequently e4bb838 fixed that bug, but revealed `URL_REGEXP`'s "strictness" wrt
certain parts of the URL.

Since browsers are too lenient when it comes to URL validation anyway, it doesn't make sense for
Angular to be much stricter, so this commit relaxes the "strictness" of `URL_REGEXP`, focusing more
on the general structure, than on the specific characters allowed in each segment.

Note 1: `URL_REGEXP` still seems to be stricter than browsers in some cases.
Note 2: Browsers don't always agree on what is a valid URL and what isn't.

Fixes #13528

Closes #13544
@IlanFrumer
Copy link
Author

@gkalpak this should also be fixed inside 1.4.x

@gkalpak
Copy link
Member

gkalpak commented Dec 16, 2015

@IlanFrumer, it has - I just forgot to link to the commit (6610ae8). Thx for pointing it out 😃

kkirsche added a commit to kkirsche/kibana that referenced this issue Feb 21, 2016
<a name="1.4.9"></a>
# 1.4.9 implicit-superannuation (2016-01-21)

## Bug Fixes

- **Animation**
  - ensure that animate promises resolve when the document is hidden
  ([9a60408c](angular/angular.js@9a60408))
  - do not trigger animations if the document is hidden
  ([09f6061a](angular/angular.js@09f6061),
   [elastic#12842](angular/angular.js#12842), [elastic#13776](angular/angular.js#13776))
  - only copy over the animation options once
  ([2fc954d3](angular/angular.js@2fc954d),
   [elastic#13722](angular/angular.js#13722), [elastic#13578](angular/angular.js#13578))
  - allow event listeners on document in IE
  ([5ba4419e](angular/angular.js@5ba4419),
   [elastic#13548](angular/angular.js#13548), [elastic#13696](angular/angular.js#13696))
  - allow removing classes that are added by a running animation
  ([6c4581fc](angular/angular.js@6c4581f),
   [elastic#13339](angular/angular.js#13339), [elastic#13380](angular/angular.js#13380), [elastic#13414](angular/angular.js#13414), [elastic#13472](angular/angular.js#13472), [elastic#13678](angular/angular.js#13678))
  - do not use `event.timeStamp` anymore for time tracking
  ([620a20d1](angular/angular.js@620a20d),
   [elastic#13494](angular/angular.js#13494), [elastic#13495](angular/angular.js#13495))
  - ignore children without animation data when closing them
  ([be01cebf](angular/angular.js@be01ceb),
   [elastic#11992](angular/angular.js#11992), [elastic#13424](angular/angular.js#13424))
  - do not alter the provided options data
  ([7a81e6fe](angular/angular.js@7a81e6f),
   [elastic#13040](angular/angular.js#13040), [elastic#13175](angular/angular.js#13175))
  - correctly handle `$animate.pin()` host elements
  ([a985adfd](angular/angular.js@a985adf),
   [elastic#13783](angular/angular.js#13783))
  - allow animations when pinned element is parent element
  ([4cb8ac61](angular/angular.js@4cb8ac6),
   [elastic#13466](angular/angular.js#13466))
  - allow enabled children to animate on disabled parents
  ([6d85f24e](angular/angular.js@6d85f24),
   [elastic#13179](angular/angular.js#13179), [elastic#13695](angular/angular.js#13695))
  - correctly access `minErr`
  ([0c1b54f0](angular/angular.js@0c1b54f))
  - ensure animate runner is the same with and without animations
  ([937942f5](angular/angular.js@937942f),
   [elastic#13205](angular/angular.js#13205), [elastic#13347](angular/angular.js#13347))
  - remove animation end event listeners on close
  ([d9157849](angular/angular.js@d915784),
   [elastic#13672](angular/angular.js#13672))
  - consider options.delay value for closing timeout
  ([592bf516](angular/angular.js@592bf51),
   [elastic#13355](angular/angular.js#13355), [elastic#13363](angular/angular.js#13363))
- **$controller:** allow identifiers containing `$`
  ([2563ff7b](angular/angular.js@2563ff7),
   [elastic#13736](angular/angular.js#13736))
- **$http:** throw if url passed is not a string
  ([c5bf9dae](angular/angular.js@c5bf9da),
   [elastic#12925](angular/angular.js#12925), [elastic#13444](angular/angular.js#13444))
- **$parse:** handle interceptors with `undefined` expressions
  ([7bb2414b](angular/angular.js@7bb2414))
- **$resource:** don't allow using promises as `timeout` and log a warning
  ([47486524](angular/angular.js@4748652))
- **formatNumber:** cope with large and small number corner cases
  ([9c49eb13](angular/angular.js@9c49eb1),
   [elastic#13394](angular/angular.js#13394), [elastic#8674](angular/angular.js#8674), [elastic#12709](angular/angular.js#12709), [elastic#8705](angular/angular.js#8705), [elastic#12707](angular/angular.js#12707), [elastic#10246](angular/angular.js#10246), [elastic#10252](angular/angular.js#10252))
- **input:**
  - fix URL validation being too strict
  ([6610ae81](angular/angular.js@6610ae8),
   [elastic#13528](angular/angular.js#13528), [elastic#13544](angular/angular.js#13544))
  - add missing chars to URL validation regex
  ([2995b54a](angular/angular.js@2995b54),
   [elastic#13379](angular/angular.js#13379), [elastic#13460](angular/angular.js#13460))
- **isArrayLike:** recognize empty instances of an Array subclass
  ([323f9ab7](angular/angular.js@323f9ab),
   [elastic#13560](angular/angular.js#13560), [elastic#13708](angular/angular.js#13708))
- **ngInclude:** do not compile template if original scope is destroyed
  ([9590bcf0](angular/angular.js@9590bcf))
- **ngOptions:**
  - don't skip `optgroup` elements with `value === ''`
  ([85e392f3](angular/angular.js@85e392f),
   [elastic#13487](angular/angular.js#13487), [elastic#13489](angular/angular.js#13489))
  - don't `$dirty` multiple select after compilation
  ([f163c905](angular/angular.js@f163c90),
   [elastic#13211](angular/angular.js#13211), [elastic#13326](angular/angular.js#13326))
- **select:** re-define `ngModelCtrl.$render` in the `select` directive's postLink function
  ([529b2507](angular/angular.js@529b250),
   [elastic#13583](angular/angular.js#13583), [elastic#13583](angular/angular.js#13583), [elastic#13663](angular/angular.js#13663))

## Minor Features

- **ngLocale:** add support for standalone months
  ([54c4041e](angular/angular.js@54c4041),
   [elastic#3744](angular/angular.js#3744), [elastic#10247](angular/angular.js#10247), [elastic#12642](angular/angular.js#12642), [elastic#12844](angular/angular.js#12844))
- **ngMock:** add support for `$animate.closeAndFlush()`
  ([512c0811](angular/angular.js@512c081))

## Performance Improvements

- **ngAnimate:** speed up `areAnimationsAllowed` check
  ([2d3303dd](angular/angular.js@2d3303d))

## Breaking Changes

While we do not deem the following to be a real breaking change we are highlighting it here in the
changelog to ensure that it does not surprise anyone.

- **$resource:** due to [47486524](angular/angular.js@4748652),

**Possible breaking change** for users who updated their code to provide a `timeout`
promise for a `$resource` request in version v1.4.8.

Up to v1.4.7 (included), using a promise as a timeout in `$resource`, would silently
fail (i.e. have no effect).

In v1.4.8, using a promise as timeout would have the (buggy) behaviour described
in angular/angular.js#12657 (comment).
(I.e. it will work as expected for the first time you resolve the promise and will
cancel all subsequent requests after that - one has to re-create the resource
class. This was not documented.)

With this change, using a promise as timeout in v1.4.9 onwards is not allowed.
It will log a warning and ignore the timeout value.

If you need support for cancellable `$resource` actions, you should upgrade to
version 1.5 or higher.

<a name="1.4.8"></a>
# 1.4.8 ice-manipulation (2015-11-19)

## Bug Fixes

- **$animate:** ensure leave animation calls `close` callback
  ([6bd6dbff](angular/angular.js@6bd6dbf),
   [elastic#12278](angular/angular.js#12278), [elastic#12096](angular/angular.js#12096), [elastic#13054](angular/angular.js#13054))
- **$cacheFactory:** check key exists before decreasing cache size count
  ([2a5a52a7](angular/angular.js@2a5a52a),
   [elastic#12321](angular/angular.js#12321), [elastic#12329](angular/angular.js#12329))
- **$compile:**
  - bind all directive controllers correctly when using `bindToController`
  ([5d8861fb](angular/angular.js@5d8861f),
   [elastic#11343](angular/angular.js#11343), [elastic#11345](angular/angular.js#11345))
  - evaluate against the correct scope with bindToController on new scope
  ([b9f7c453](angular/angular.js@b9f7c45),
   [elastic#13021](angular/angular.js#13021), [elastic#13025](angular/angular.js#13025))
  - fix scoping of transclusion directives inside replace directive
  ([74da0340](angular/angular.js@74da034),
   [elastic#12975](angular/angular.js#12975), [elastic#12936](angular/angular.js#12936), [elastic#13244](angular/angular.js#13244))
- **$http:** apply `transformResponse` even when `data` is empty
  ([c6909464](angular/angular.js@c690946),
   [elastic#12976](angular/angular.js#12976), [elastic#12979](angular/angular.js#12979))
- **$location:** ensure `$locationChangeSuccess` fires even if URL ends with `#`
  ([6f8ddb6d](angular/angular.js@6f8ddb6),
   [elastic#12175](angular/angular.js#12175), [elastic#13251](angular/angular.js#13251))
- **$parse:** evaluate once simple expressions only once
  ([e4036824](angular/angular.js@e403682),
   [elastic#12983](angular/angular.js#12983), [elastic#13002](angular/angular.js#13002))
- **$resource:** allow XHR request to be cancelled via a timeout promise
  ([7170f9d9](angular/angular.js@7170f9d),
   [elastic#12657](angular/angular.js#12657), [elastic#12675](angular/angular.js#12675), [elastic#10890](angular/angular.js#10890), [elastic#9332](angular/angular.js#9332))
- **$rootScope:** prevent IE9 memory leak when destroying scopes
  ([87b0055c](angular/angular.js@87b0055),
   [elastic#10706](angular/angular.js#10706), [elastic#11786](angular/angular.js#11786))
- **Angular.js:** fix `isArrayLike` for unusual cases
  ([70edec94](angular/angular.js@70edec9),
   [elastic#10186](angular/angular.js#10186), [elastic#8000](angular/angular.js#8000), [elastic#4855](angular/angular.js#4855), [elastic#4751](angular/angular.js#4751), [elastic#10272](angular/angular.js#10272))
- **isArrayLike:** handle jQuery objects of length 0
  ([d3da55c4](angular/angular.js@d3da55c))
- **jqLite:**
  - deregister special `mouseenter` / `mouseleave` events correctly
  ([22f66025](angular/angular.js@22f6602),
   [elastic#12795](angular/angular.js#12795), [elastic#12799](angular/angular.js#12799))
  - ensure mouseenter works with svg elements on IE
  ([c1f34e8e](angular/angular.js@c1f34e8),
   [elastic#10259](angular/angular.js#10259), [elastic#10276](angular/angular.js#10276))
- **limitTo:** start at 0 if `begin` is negative and exceeds input length
  ([4fc40bc9](angular/angular.js@4fc40bc),
   [elastic#12775](angular/angular.js#12775), [elastic#12781](angular/angular.js#12781))
- **merge:**
  - ensure that jqlite->jqlite and DOM->DOM
  ([2f8db1bf](angular/angular.js@2f8db1b))
  - clone elements instead of treating them like simple objects
  ([838cf4be](angular/angular.js@838cf4b),
   [elastic#12286](angular/angular.js#12286))
- **ngAria:** don't add tabindex to radio and checkbox inputs
  ([59f1f4e1](angular/angular.js@59f1f4e),
   [elastic#12492](angular/angular.js#12492), [elastic#13095](angular/angular.js#13095))
- **ngInput:** change URL_REGEXP to better match RFC3987
  ([cb51116d](angular/angular.js@cb51116),
   [elastic#11341](angular/angular.js#11341), [elastic#11381](angular/angular.js#11381))
- **ngMock:** reset cache before every test
  ([91b7cd9b](angular/angular.js@91b7cd9),
   [elastic#13013](angular/angular.js#13013))
- **ngOptions:**
  - skip comments and empty options when looking for options
  ([0f58334b](angular/angular.js@0f58334),
   [elastic#12190](angular/angular.js#12190), [elastic#13029](angular/angular.js#13029), [elastic#13033](angular/angular.js#13033))
  - override select option registration to allow compilation of empty option
  ([7b2ecf42](angular/angular.js@7b2ecf4),
   [elastic#11685](angular/angular.js#11685), [elastic#12972](angular/angular.js#12972), [elastic#12968](angular/angular.js#12968), [elastic#13012](angular/angular.js#13012))

## Performance Improvements

- **$compile:** use static jquery data method to avoid creating new instances
  ([55ad192e](angular/angular.js@55ad192))
- **copy:**
  - avoid regex in `isTypedArray`
  ([19fab4a1](angular/angular.js@19fab4a))
  - only validate/clear if the user specifies a destination
  ([d1293540](angular/angular.js@d129354),
   [elastic#12068](angular/angular.js#12068))
- **merge:** remove unnecessary wrapping of jqLite element
  ([ce6a96b0](angular/angular.js@ce6a96b),
   [elastic#13236](angular/angular.js#13236))

## Breaking Changes

Signed-off-by: Kevin Kirsche <Kev.Kirsche@gmail.com>
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants