Skip to content
This repository has been archived by the owner on Apr 12, 2024. It is now read-only.

fix(ngInput): change URL_REGEXP to better match RFC3987 #11381

Closed
wants to merge 1 commit into from

Conversation

andrewaustin
Copy link
Contributor

The URL_REGEXP in use to perform validation in ngInput is too restrictive and fails to
follow RFC3987. In particular, it only accepts ftp, http, and https scheme components and
rejects perfectly valid schemes such as "file", "mailto", "chrome-extension",
etc. The regex also requires the scheme to be followed by two "/" but the RFC says
0 to n are acceptable. This change fixes both of these issues to better align to
the standard.

Closes #11341

@andrewaustin
Copy link
Contributor Author

Does anything else need to be done for this?

@thatmarvin
Copy link

👍 I need to validate rtsp:// urls, so this would be useful for me too.

@Narretz Narretz modified the milestones: 1.4.x, Backlog Aug 8, 2015
@dgieselaar
Copy link

Anything holding this back?

@Narretz
Copy link
Contributor

Narretz commented Oct 12, 2015

Is the regey taken from somewhere or is it a custom one? I'd like to follow an established regex if possible. What does Chrome's regex look like these days?

@andrewaustin
Copy link
Contributor Author

It's custom, if you can tell me where to find Google's I'll update it to match.

@@ -11,7 +11,8 @@

// Regex code is obtained from SO: https://stackoverflow.com/questions/3143070/javascript-regex-iso-datetime#answer-3143231
var ISO_DATE_REGEXP = /\d{4}-[01]\d-[0-3]\dT[0-2]\d:[0-5]\d:[0-5]\d\.\d+([+-][0-2]\d:[0-5]\d|Z)/;
var URL_REGEXP = /^(ftp|http|https):\/\/(\w+:{0,1}\w*@)?(\S+)(:[0-9]+)?(\/|\/([\w#!:.?+=&%@!\-\/]))?$/;
// See valid URLs in RFC3987 (http://tools.ietf.org/html/rfc3987)
var URL_REGEXP = /^([0-9]+|\w+|\.|\+|\-)+:\/?\/?(\w+:{0,1}\w*@)?(\S+)(:[0-9]+)?(\/|\/([\w#!:.?+=&%@!\-\/]))?$/;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why \w+ (in /^([0-9]+|\w+... instead of [A-Za-z]+ ?
It seems like \w+ will (incorrectly) match characters like _.

If fact, I would expect more something like: /^([A-Za-z][A-Za-z0-9.+-]+):...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or are non-latin letters also allowed ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to KURL's implementation (used by WebKit (used by Chromium)), only A-Z, a-z and 0-9 are allowed: https://code.google.com/p/chromium/codesearch#chromium/src/third_party/WebKit/Source/platform/weborigin/KURL.cpp&q=kurl.&sq=package:chromium&type=cs&l=116

@gkalpak
Copy link
Member

gkalpak commented Oct 12, 2015

A little research:

  • (As expected), Chrome (<= Chromium <= WebKit) does not use a RegExp, they parse the URL instead.
  • They use KURL for that.

I don't think it's worth implementing a URL parser into core.
Let's work on that RegExp :)

@gkalpak
Copy link
Member

gkalpak commented Oct 13, 2015

Based on a quick walk-through the code used in Chromium (and specifically url_parse.cc) and also considering the fact that we don't want to parse/break-up the url (only validate it), I would propose the following RegExp:

/^[A-Za-z][A-Za-z\d.+-]*:\/*(?:\w+(?::\w+)?@)?[^\s/]+(?::\d+)?(?:\/[\w#!:.?+=&%@\-/]*)?$/

This is not strictly following neither the RFC nor KURL's implementation, but I think it works fine for our needs (considering the previous RegExp wasn't following RFC/Chrome either).

I made some minor improvements to the original one (removed duplicate duplicate characters, removed unnecessary escapes, stopped tracking groups etc) and fixed a bug:

By trying to match the host with \S+ it would match everything including the host, port, path, query and fragment (since + matches eagerly). The intention was clearly to match the host only, so I replaced it with [^\s/]+ (which more closely matches KURL's implementation).

Besides the cases fixed by @andrewaustin's version, there are a couple more cases that are recognized incorrectly by his RegExp (which are also fixed with the above):

  • 0scheme://example.com: Should be invalid (because scheme shouldn't start with a digit or .+-).
  • http://example.com:9999/~~```: Should be invalid (because of ~`s ans ```s in the path).

The URL_REGEXP in use to perform validation in ngInput is too restrictive and fails to
follow RFC3987. In particular, it only accepts ftp, http, and https scheme components and
rejects perfectly valid schemes such as "file", "mailto", "chrome-extension",
etc. The regex also requires the scheme to be followed by two "/" but the RFC says
0 to n are acceptable. This change fixes both of these issues to better align to
the standard.

Closes angular#11341
@andrewaustin
Copy link
Contributor Author

@gkalpak I updated the PR with your suggested REGEX. I also added two more tests for the two urls you pointed out that should be invalid.

@gkalpak
Copy link
Member

gkalpak commented Oct 13, 2015

It LGTM (obviously :P).
Considering how the previous RegExp didn't work perfectly either, I would be fine with this getting merged, if it looks good to some other team-member as well :)

petebacondarwin pushed a commit that referenced this pull request Oct 26, 2015
The URL_REGEXP in use to perform validation in ngInput is too restrictive and fails to
follow RFC3987. In particular, it only accepts ftp, http, and https scheme components and
rejects perfectly valid schemes such as "file", "mailto", "chrome-extension",
etc. The regex also requires the scheme to be followed by two "/" but the RFC says
0 to n are acceptable. This change fixes both of these issues to better align to
the standard.

Closes #11341
Closes #11381
kkirsche added a commit to kkirsche/kibana that referenced this pull request Feb 21, 2016
<a name="1.4.9"></a>
# 1.4.9 implicit-superannuation (2016-01-21)

## Bug Fixes

- **Animation**
  - ensure that animate promises resolve when the document is hidden
  ([9a60408c](angular/angular.js@9a60408))
  - do not trigger animations if the document is hidden
  ([09f6061a](angular/angular.js@09f6061),
   [elastic#12842](angular/angular.js#12842), [elastic#13776](angular/angular.js#13776))
  - only copy over the animation options once
  ([2fc954d3](angular/angular.js@2fc954d),
   [elastic#13722](angular/angular.js#13722), [elastic#13578](angular/angular.js#13578))
  - allow event listeners on document in IE
  ([5ba4419e](angular/angular.js@5ba4419),
   [elastic#13548](angular/angular.js#13548), [elastic#13696](angular/angular.js#13696))
  - allow removing classes that are added by a running animation
  ([6c4581fc](angular/angular.js@6c4581f),
   [elastic#13339](angular/angular.js#13339), [elastic#13380](angular/angular.js#13380), [elastic#13414](angular/angular.js#13414), [elastic#13472](angular/angular.js#13472), [elastic#13678](angular/angular.js#13678))
  - do not use `event.timeStamp` anymore for time tracking
  ([620a20d1](angular/angular.js@620a20d),
   [elastic#13494](angular/angular.js#13494), [elastic#13495](angular/angular.js#13495))
  - ignore children without animation data when closing them
  ([be01cebf](angular/angular.js@be01ceb),
   [elastic#11992](angular/angular.js#11992), [elastic#13424](angular/angular.js#13424))
  - do not alter the provided options data
  ([7a81e6fe](angular/angular.js@7a81e6f),
   [elastic#13040](angular/angular.js#13040), [elastic#13175](angular/angular.js#13175))
  - correctly handle `$animate.pin()` host elements
  ([a985adfd](angular/angular.js@a985adf),
   [elastic#13783](angular/angular.js#13783))
  - allow animations when pinned element is parent element
  ([4cb8ac61](angular/angular.js@4cb8ac6),
   [elastic#13466](angular/angular.js#13466))
  - allow enabled children to animate on disabled parents
  ([6d85f24e](angular/angular.js@6d85f24),
   [elastic#13179](angular/angular.js#13179), [elastic#13695](angular/angular.js#13695))
  - correctly access `minErr`
  ([0c1b54f0](angular/angular.js@0c1b54f))
  - ensure animate runner is the same with and without animations
  ([937942f5](angular/angular.js@937942f),
   [elastic#13205](angular/angular.js#13205), [elastic#13347](angular/angular.js#13347))
  - remove animation end event listeners on close
  ([d9157849](angular/angular.js@d915784),
   [elastic#13672](angular/angular.js#13672))
  - consider options.delay value for closing timeout
  ([592bf516](angular/angular.js@592bf51),
   [elastic#13355](angular/angular.js#13355), [elastic#13363](angular/angular.js#13363))
- **$controller:** allow identifiers containing `$`
  ([2563ff7b](angular/angular.js@2563ff7),
   [elastic#13736](angular/angular.js#13736))
- **$http:** throw if url passed is not a string
  ([c5bf9dae](angular/angular.js@c5bf9da),
   [elastic#12925](angular/angular.js#12925), [elastic#13444](angular/angular.js#13444))
- **$parse:** handle interceptors with `undefined` expressions
  ([7bb2414b](angular/angular.js@7bb2414))
- **$resource:** don't allow using promises as `timeout` and log a warning
  ([47486524](angular/angular.js@4748652))
- **formatNumber:** cope with large and small number corner cases
  ([9c49eb13](angular/angular.js@9c49eb1),
   [elastic#13394](angular/angular.js#13394), [elastic#8674](angular/angular.js#8674), [elastic#12709](angular/angular.js#12709), [elastic#8705](angular/angular.js#8705), [elastic#12707](angular/angular.js#12707), [elastic#10246](angular/angular.js#10246), [elastic#10252](angular/angular.js#10252))
- **input:**
  - fix URL validation being too strict
  ([6610ae81](angular/angular.js@6610ae8),
   [elastic#13528](angular/angular.js#13528), [elastic#13544](angular/angular.js#13544))
  - add missing chars to URL validation regex
  ([2995b54a](angular/angular.js@2995b54),
   [elastic#13379](angular/angular.js#13379), [elastic#13460](angular/angular.js#13460))
- **isArrayLike:** recognize empty instances of an Array subclass
  ([323f9ab7](angular/angular.js@323f9ab),
   [elastic#13560](angular/angular.js#13560), [elastic#13708](angular/angular.js#13708))
- **ngInclude:** do not compile template if original scope is destroyed
  ([9590bcf0](angular/angular.js@9590bcf))
- **ngOptions:**
  - don't skip `optgroup` elements with `value === ''`
  ([85e392f3](angular/angular.js@85e392f),
   [elastic#13487](angular/angular.js#13487), [elastic#13489](angular/angular.js#13489))
  - don't `$dirty` multiple select after compilation
  ([f163c905](angular/angular.js@f163c90),
   [elastic#13211](angular/angular.js#13211), [elastic#13326](angular/angular.js#13326))
- **select:** re-define `ngModelCtrl.$render` in the `select` directive's postLink function
  ([529b2507](angular/angular.js@529b250),
   [elastic#13583](angular/angular.js#13583), [elastic#13583](angular/angular.js#13583), [elastic#13663](angular/angular.js#13663))

## Minor Features

- **ngLocale:** add support for standalone months
  ([54c4041e](angular/angular.js@54c4041),
   [elastic#3744](angular/angular.js#3744), [elastic#10247](angular/angular.js#10247), [elastic#12642](angular/angular.js#12642), [elastic#12844](angular/angular.js#12844))
- **ngMock:** add support for `$animate.closeAndFlush()`
  ([512c0811](angular/angular.js@512c081))

## Performance Improvements

- **ngAnimate:** speed up `areAnimationsAllowed` check
  ([2d3303dd](angular/angular.js@2d3303d))

## Breaking Changes

While we do not deem the following to be a real breaking change we are highlighting it here in the
changelog to ensure that it does not surprise anyone.

- **$resource:** due to [47486524](angular/angular.js@4748652),

**Possible breaking change** for users who updated their code to provide a `timeout`
promise for a `$resource` request in version v1.4.8.

Up to v1.4.7 (included), using a promise as a timeout in `$resource`, would silently
fail (i.e. have no effect).

In v1.4.8, using a promise as timeout would have the (buggy) behaviour described
in angular/angular.js#12657 (comment).
(I.e. it will work as expected for the first time you resolve the promise and will
cancel all subsequent requests after that - one has to re-create the resource
class. This was not documented.)

With this change, using a promise as timeout in v1.4.9 onwards is not allowed.
It will log a warning and ignore the timeout value.

If you need support for cancellable `$resource` actions, you should upgrade to
version 1.5 or higher.

<a name="1.4.8"></a>
# 1.4.8 ice-manipulation (2015-11-19)

## Bug Fixes

- **$animate:** ensure leave animation calls `close` callback
  ([6bd6dbff](angular/angular.js@6bd6dbf),
   [elastic#12278](angular/angular.js#12278), [elastic#12096](angular/angular.js#12096), [elastic#13054](angular/angular.js#13054))
- **$cacheFactory:** check key exists before decreasing cache size count
  ([2a5a52a7](angular/angular.js@2a5a52a),
   [elastic#12321](angular/angular.js#12321), [elastic#12329](angular/angular.js#12329))
- **$compile:**
  - bind all directive controllers correctly when using `bindToController`
  ([5d8861fb](angular/angular.js@5d8861f),
   [elastic#11343](angular/angular.js#11343), [elastic#11345](angular/angular.js#11345))
  - evaluate against the correct scope with bindToController on new scope
  ([b9f7c453](angular/angular.js@b9f7c45),
   [elastic#13021](angular/angular.js#13021), [elastic#13025](angular/angular.js#13025))
  - fix scoping of transclusion directives inside replace directive
  ([74da0340](angular/angular.js@74da034),
   [elastic#12975](angular/angular.js#12975), [elastic#12936](angular/angular.js#12936), [elastic#13244](angular/angular.js#13244))
- **$http:** apply `transformResponse` even when `data` is empty
  ([c6909464](angular/angular.js@c690946),
   [elastic#12976](angular/angular.js#12976), [elastic#12979](angular/angular.js#12979))
- **$location:** ensure `$locationChangeSuccess` fires even if URL ends with `#`
  ([6f8ddb6d](angular/angular.js@6f8ddb6),
   [elastic#12175](angular/angular.js#12175), [elastic#13251](angular/angular.js#13251))
- **$parse:** evaluate once simple expressions only once
  ([e4036824](angular/angular.js@e403682),
   [elastic#12983](angular/angular.js#12983), [elastic#13002](angular/angular.js#13002))
- **$resource:** allow XHR request to be cancelled via a timeout promise
  ([7170f9d9](angular/angular.js@7170f9d),
   [elastic#12657](angular/angular.js#12657), [elastic#12675](angular/angular.js#12675), [elastic#10890](angular/angular.js#10890), [elastic#9332](angular/angular.js#9332))
- **$rootScope:** prevent IE9 memory leak when destroying scopes
  ([87b0055c](angular/angular.js@87b0055),
   [elastic#10706](angular/angular.js#10706), [elastic#11786](angular/angular.js#11786))
- **Angular.js:** fix `isArrayLike` for unusual cases
  ([70edec94](angular/angular.js@70edec9),
   [elastic#10186](angular/angular.js#10186), [elastic#8000](angular/angular.js#8000), [elastic#4855](angular/angular.js#4855), [elastic#4751](angular/angular.js#4751), [elastic#10272](angular/angular.js#10272))
- **isArrayLike:** handle jQuery objects of length 0
  ([d3da55c4](angular/angular.js@d3da55c))
- **jqLite:**
  - deregister special `mouseenter` / `mouseleave` events correctly
  ([22f66025](angular/angular.js@22f6602),
   [elastic#12795](angular/angular.js#12795), [elastic#12799](angular/angular.js#12799))
  - ensure mouseenter works with svg elements on IE
  ([c1f34e8e](angular/angular.js@c1f34e8),
   [elastic#10259](angular/angular.js#10259), [elastic#10276](angular/angular.js#10276))
- **limitTo:** start at 0 if `begin` is negative and exceeds input length
  ([4fc40bc9](angular/angular.js@4fc40bc),
   [elastic#12775](angular/angular.js#12775), [elastic#12781](angular/angular.js#12781))
- **merge:**
  - ensure that jqlite->jqlite and DOM->DOM
  ([2f8db1bf](angular/angular.js@2f8db1b))
  - clone elements instead of treating them like simple objects
  ([838cf4be](angular/angular.js@838cf4b),
   [elastic#12286](angular/angular.js#12286))
- **ngAria:** don't add tabindex to radio and checkbox inputs
  ([59f1f4e1](angular/angular.js@59f1f4e),
   [elastic#12492](angular/angular.js#12492), [elastic#13095](angular/angular.js#13095))
- **ngInput:** change URL_REGEXP to better match RFC3987
  ([cb51116d](angular/angular.js@cb51116),
   [elastic#11341](angular/angular.js#11341), [elastic#11381](angular/angular.js#11381))
- **ngMock:** reset cache before every test
  ([91b7cd9b](angular/angular.js@91b7cd9),
   [elastic#13013](angular/angular.js#13013))
- **ngOptions:**
  - skip comments and empty options when looking for options
  ([0f58334b](angular/angular.js@0f58334),
   [elastic#12190](angular/angular.js#12190), [elastic#13029](angular/angular.js#13029), [elastic#13033](angular/angular.js#13033))
  - override select option registration to allow compilation of empty option
  ([7b2ecf42](angular/angular.js@7b2ecf4),
   [elastic#11685](angular/angular.js#11685), [elastic#12972](angular/angular.js#12972), [elastic#12968](angular/angular.js#12968), [elastic#13012](angular/angular.js#13012))

## Performance Improvements

- **$compile:** use static jquery data method to avoid creating new instances
  ([55ad192e](angular/angular.js@55ad192))
- **copy:**
  - avoid regex in `isTypedArray`
  ([19fab4a1](angular/angular.js@19fab4a))
  - only validate/clear if the user specifies a destination
  ([d1293540](angular/angular.js@d129354),
   [elastic#12068](angular/angular.js#12068))
- **merge:** remove unnecessary wrapping of jqLite element
  ([ce6a96b0](angular/angular.js@ce6a96b),
   [elastic#13236](angular/angular.js#13236))

## Breaking Changes

Signed-off-by: Kevin Kirsche <Kev.Kirsche@gmail.com>
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

URL type input does not support file:/// protocol.
6 participants