fix(angular.encodeUriSegment): do not encode semi-colon #5019
RFC 3986 indicates that ; is not encoded as part of the URI, as is the case with other members of sub-delim. Changed encodeUriSegment to match that behaviour, along with the corresponding spec.
This is basically a duplicate of #4067, where this has been discussed in some detail. Can you look at that one and add your thoughts? |
Sorry I missed that in searching. Will do. |
Hi @petebacondarwin - I think this is actually different, despite the similarity. The code here affects location, while the other PR only handles The assessment in there is right in this case too - just doing the substitution means Here is an alternate solution I could propose: This avoids the encode/decode loop. It would need more work to complete (specs for that case, and better handling for setting Would you consider reopening, or a new PR, for such a change? The other alternative would be to parse out the parameters as a new variable. I tried that and had some issues with it being removed, since any code that uses |
@brettporter - so the fix is basically the same but because the same function ( But back to the issue: if you look through the discussion there is a problem as mentioned in this comment: #4067. Semi-colons should be encoded, unless they are being used in a special application-specific sense. This makes it rather difficult for Angular, since how does it know in what sense the semi-colon is being used? @eonlepapillon suggests a solution that is specific to ngResource, but perhaps for your situation there needs to be a higher-level configuration? Or perhaps $location simply needs a similar configuration facility. Either way, I don't think this PR is the right solution, and I suggest that once we flesh out the best solution it should go in a new PR. |
Change location to retain the expected path encoding. The decode/encode cycle for the location path loses data as %3B (a semi-colon in the path) and ; (a sub-delim in the URI for path parameters) are considered to be different.
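The data loss described above can be reproduced with the standard `decodeURIComponent` / `encodeURIComponent` pair. This is only a minimal sketch: `roundTripSegment` is a hypothetical helper written for illustration, not the code $location actually runs.

```javascript
// Hypothetical helper: mimics a decode-then-encode cycle on one path segment.
// It is NOT Angular's implementation, only an illustration of the lossy cycle.
function roundTripSegment(segment) {
  var decoded = decodeURIComponent(segment); // '%3B' and ';' both become ';'
  return encodeURIComponent(decoded);        // ';' always becomes '%3B'
}

// Semicolon that was percent-encoded in the original URL (data).
var literal = roundTripSegment('index.html%3Bjsessionid');
// Semicolon that was a raw sub-delim in the original URL (path parameter).
var delimiter = roundTripSegment('index.html;jsessionid');

console.log(literal);               // index.html%3Bjsessionid
console.log(delimiter);             // index.html%3Bjsessionid
console.log(literal === delimiter); // true
```

Both inputs collapse to the same output, so after one cycle there is no way to tell which form the original URL used.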
@petebacondarwin - thanks for the quick response. We're in agreement, my first two paragraphs were an attempt to say the same thing as your first two. It's certainly possible that there could be a similar configuration facility for those manipulating $location directly, however that's not the issue I'm dealing with here - rather that $location goes through a cycle of decoding and parsing the current URL, then encoding and reconstructing it, that loses information without any intervention from the application. Here's a simple example: https://github.com/brettporter/angular-phonecat/compare/semicolon-path-uri If you access http://localhost:8000/app/index.html;jsessionid.html, you get index.html as you should, but note that the window URL changes to http://localhost:8000/app/index.html%3Bjsessionid.html. If you then refresh, you get the file index.html;jsessionid.html (note the change in page title). What Here's a better link to the fix I propose, which maintains an encoded representation of $$path: https://github.com/brettporter/angular.js/compare/semicolon-path-uri I'm willing to put the extra work in to finish that off for a new PR, or adjust to a different solution - but would like to confirm if that's an acceptable solution first. |
@petebacondarwin any suggestions on how to proceed here? |
I'm going to see if we can get this into the 1.2.5 milestone next week. Then the team can have a chat about what the best solution would be. |
I'm sorry, but I wasn't able to verify your CLA signature. CLA signature is required for any code contributions to AngularJS. Please sign our CLA and ensure that the CLA signature email address and the email address in this PR's commits match. If you signed the CLA as a corporation, please let me know the company's name. Thanks a bunch! PS: If you signed the CLA in the past then most likely the email addresses don't match. Please sign the CLA again or update the email address in the commit of this PR. |
Thanks for considering it! I've signed the CLA again with the email address on the commit in case it wasn't linked up properly before. Is there anything else I should do with the PR at the moment to make it easier for you, or just wait until you've had a chance to discuss the solution? |
I'm sorry, but I wasn't able to verify your CLA signature. CLA signature is required for any code contributions to AngularJS. Please sign our CLA and ensure that the CLA signature email address and the email address in this PR's commits match. If you signed the CLA as a corporation, please let me know the company's name. Thanks a bunch! PS: If you signed the CLA in the past then most likely the email addresses don't match. Please sign the CLA again or update the email address in the commit of this PR. |
@IgorMinar I've signed the CLA twice, most recently with the email in this commit. However, it had me sign in to Google, so I'm not sure if that's using the Google account instead. How can I make sure the CLA gets accepted? |
CLA signature verified! Thank you! Someone from the team will now triage your PR and it will be processed based on the determined priority (doc updates and fixes with tests are prioritized over other changes). |
Here's my current thinking of how
Although significant, I don't think such a breaking change would be too drastic to be included in 1.3. @petebacondarwin & @brettporter what do you think of this approach? |
@jeffbcross - so you are saying that applications should be responsible for pre-encoding such delimiters if they want them to pass through? For example what if you wanted to use semi-colons as a delimiter but then had a semi-colon in the content:
How would we ensure that we didn't double encode this? |
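To make the double-encoding concern concrete, here is a small sketch. `encodePreservingEscapes` is a hypothetical name and is not one of the approaches proposed in this thread; it simply assumes that any existing `%XX` sequence in the input is already an escape and leaves it alone.

```javascript
// Encoding an already-encoded value a second time escapes the '%' itself.
var once = encodeURIComponent('a;b');  // 'a%3Bb'
var twice = encodeURIComponent(once);  // 'a%253Bb'

// Hypothetical guard (assumption, not Angular code): keep existing %XX
// escapes as they are and encode only the text between them.
function encodePreservingEscapes(value) {
  // split() with a capturing group keeps the escapes at the odd indexes.
  return value.split(/(%[0-9A-Fa-f]{2})/).map(function (part, i) {
    return i % 2 === 1 ? part : encodeURIComponent(part);
  }).join('');
}

console.log(twice);                              // a%253Bb
console.log(encodePreservingEscapes('a%3Bb;c')); // a%3Bb%3Bc
```

The guard has the same ambiguity raised in the discussion: it cannot tell a pre-encoded `%3B` from data that literally contains the characters `%3B`, and a raw delimiter still gets encoded unless the caller pre-encodes or whitelists it.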
@petebacondarwin yes that's what I'm saying, but I'm still trying to think of different workable policies. I think my suggestion is the most intuitive and compliant as an "Automatic Encoding Policy". (<-- look, british period placement) Do you have a cohesive idea to put forward for a "Configurable Encoding Policy"? I didn't think what was proposed in #4067 was flexible enough, since it was All Or Nothing ignore of encodable characters within a string. What I don't like about my suggested approach is that it would introduce more work for developers who are marshaling strings from one piece of data into the url, who are assuming that the service will encode more than it actually would. Ensuring against double encoding would be a little expensive, I'm experimenting with some approaches. |
@petebacondarwin I added a proof of concept here (a couple of tests are failing): jeffbcross@f57758c |
Thanks for picking this up again. Not sure I have the cycles to page this back into my brain right now, but happy to test any changes you come up with against my specific use case. |
just a few notes about this issue:
So to wrap up:
|
with regards to 6. - I read #4067 (comment) and in case of $resource and http requests made to the server it does make sense to distinguish between the two scenarios. I don't think it applies to window.location though. Anyone thinks differently? |
Per some IRL discussion with @IgorMinar, here is the proposed approach.
I'm going to write a test to reproduce the infinite digest, and then will implement this approach. |
Here is a failing test. The test is kinda hacky because of limitations of the mock
it('should not infinitely digest when using a semicolon in initial path', function() {
  module(function($locationProvider) {
    $locationProvider.html5Mode(true);
  });
  inject(function($location, $browser, $rootScope) {
    Object.prototype.__defineGetter__.call($browser, '$$url', function() {
      return 'http://server/;jsessionid=foo';
    });
    Object.prototype.__defineSetter__.call($browser, '$$url', angular.noop);
    $location.path('/;jsessionid=foo');
    $rootScope.$digest();
  });
}); |
Here's a better test, using the real
it('should not infinitely digest when using a semicolon in initial path', function() {
  module(function($windowProvider, $locationProvider, $browserProvider) {
    $locationProvider.html5Mode(true);
    // Fake window whose location already contains a raw semicolon.
    $windowProvider.$get = function() {
      var win = {};
      angular.extend(win, window);
      win.location = {
        href: 'http://localhost:9876/;jsessionid=foo'
      };
      return win;
    };
    // Build a Browser instance on top of the fake window.
    $browserProvider.$get = function($document, $window) {
      var sniffer = {history: true, hashchange: true};
      var logs = {log:[], warn:[], info:[], error:[]};
      var fakeLog = {log: function() { logs.log.push(slice.call(arguments)); },
                     warn: function() { logs.warn.push(slice.call(arguments)); },
                     info: function() { logs.info.push(slice.call(arguments)); },
                     error: function() { logs.error.push(slice.call(arguments)); }};
      var b = new Browser($window, $document, fakeLog, sniffer);
      return b;
    };
  });
  inject(function($location, $browser, $rootScope) {
    $rootScope.$digest();
  });
}); |
Some servers require characters within path segments to contain semicolons, such as `/;jsessionid=foo` in order to work correctly. RFC-3986 includes semicolons as acceptable sub-delimiters inside of path and query, but $location currently encodes semicolons. This can cause an infinite digest to occur since $location is comparing the internal semicolon-encoded url with the semicolon-unencoded url returned from window.location.href, causing Angular to believe the url is changing with each digest loop. This fix adds ";" to the list of characters to unencode after encoding queries or path segments. Closes angular#5019
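The mismatch and the fix described in the commit message can be sketched as follows. The function names and the `keepSemicolon` flag are illustrative assumptions, and the replacement list only approximates Angular's `encodeUriSegment`; it is not a copy of the merged change.

```javascript
// Sketch only: encode a path segment, then restore characters that are
// allowed to appear unencoded. `keepSemicolon` is a hypothetical switch
// used to compare the behaviour before and after the fix.
function encodeSegment(val, keepSemicolon) {
  var encoded = encodeURIComponent(val)
    .replace(/%40/gi, '@')
    .replace(/%3A/gi, ':')
    .replace(/%24/g, '$')
    .replace(/%2C/gi, ',')
    .replace(/%26/gi, '&')
    .replace(/%3D/gi, '=')
    .replace(/%2B/gi, '+');
  // The fix: also restore ';' after encoding.
  return keepSemicolon ? encoded.replace(/%3B/gi, ';') : encoded;
}

// Decode each segment of the browser's path, then re-encode it,
// roughly as a location service would when rebuilding its URL.
function rebuildPath(path, keepSemicolon) {
  return path.split('/').map(function (segment) {
    return encodeSegment(decodeURIComponent(segment), keepSemicolon);
  }).join('/');
}

var browserPath = '/;jsessionid=foo'; // what window.location reports

console.log(rebuildPath(browserPath, false)); // /%3Bjsessionid=foo
console.log(rebuildPath(browserPath, true));  // /;jsessionid=foo
```

Without the extra replacement the rebuilt path never equals the browser's path, which is the condition that makes every digest look like a URL change. With it, the two strings match and the comparison settles.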
RFC 3986 indicates that ; is not encoded as part of the URI, as is the case
with other members of sub-delim. Changed encodeUriSegment to match that
behaviour, along with the corresponding spec.
This was causing a practical issue with Java servers that append `;jsessionid=...` to the path. The `;` was encoded, leading to the following error in PhantomJS as it continually saw the current URL as different to the `$location.absUrl()` which had been encoded:
I note that the specs previously called out `;` specifically for being encoded, which was introduced in 9e30baa, and that it has been in place for some time. Is there a reason this might have been needed that I've missed?
I also note that the same code is in `ngResource`, but I'm unfamiliar with the codebase and what impact it would have to change that as well. Should the code in there be updated as well?