Fix compound binding regex when a binding is preceded with text with { or ] #2711
There has been a problem reported in GoogleWebComponents/google-chart#73 (comment) where the regex updated in #2705 matched the
// Regex with comments, compiles to:
// ((?!\{\{|\[\[).+?)?(\{\{|\[\[)((?:!)?(?:(?:[\w\s\-\_\.\:\$]+?)|(?:[\w\s\-\_\.\:\$]+\(.*?\))))(?:\]\]|\}\})
_bindingRegex: (function() {
  // Opening binding options
  var ob = '\\{\\{|\\[\\[';
  // All allowed characters in a name of property or computed function
  var name = '[\\w\\s\\-\\_\\.\\:\\$]+';
  return new RegExp('((?!' + ob + ').+?)?' // Literal text before the binding; may not start with {{ or [[
    + '(' + ob + ')' // The mode (two- or one-way)
    + '(' // Start of contents of the binding
    + '(?:\!)?' // Is the binding a negation
    + '(?:' // Can either be name or computed function
    + '(?:' + name + '?)' // A normal name for a property
    + '|' // Or
    + '(?:' + name // The name of the computed function
    + '\\(.*?\\)' // The arguments of the function within ()
    + ')' // End of computed function match
    + ')' // End of name of property or computed function match
    + ')' // End of the binding contents, returned as m[3]
    + '(?:\\]\\]|\\}\\})' // The binding closing
    , 'g');
})(),

All options have pros and cons. Option 1 requires users to put JSON as the return value of a computed function or property, option 2 requires end-users to update their data binding, and option 3 is significantly more difficult to maintain and understand. Personally, I don't know which way to go. I only wanted to show that some options are possible. I hope this information is useful; please let me know which option would be desirable.
Option 1 would be a breaking change, so I'm not sure we have much of an option other than to get this working again as it was.
For option #1, couldn't you also put a space between the brackets? I believe "[ [1, 2, 3], [3, 4, 5] ]" would pass through unscathed.
@arthurevans Yes, that is correct. This would not match on the regex, but would parse as JSON.
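A minimal sketch of why the extra space helps. The `looseBinding` pattern below is a deliberately simplified, hypothetical stand-in for a one-way binding matcher; it is not the expression Polymer uses and only illustrates the collision between `[[` anchors and nested array literals.

```javascript
// Hypothetical, deliberately loose one-way binding pattern (assumption,
// not Polymer's real regex): "[[", anything, "]]".
var looseBinding = /\[\[(.+?)\]\]/;

var packed = '[[1, 2, 3], [3, 4, 5]]';   // nested array without spaces
var spaced = '[ [1, 2, 3], [3, 4, 5] ]'; // same data, brackets separated

// The packed form starts with "[[" and ends with "]]", so it looks like a binding.
console.log(looseBinding.test(packed)); // true
// With a space between the brackets there is no "[[" anchor to latch onto.
console.log(looseBinding.test(spaced)); // false
// Whitespace is insignificant in JSON, so the spaced form still parses.
console.log(JSON.parse(spaced)[1][0]);  // 3
```

Both strings are the same JSON value, so the workaround changes nothing for the consumer of the attribute.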
Future complexity may be tamed by decoupling the one-way data binding anchors from valid JSON (choosing new anchors). Here are some more cases, but the intent here may be ambiguous between the use of existing features:
I have inserted another option to counter the JSON edge cases pointed out by @jongeho1.
I think the problem with option #3 is not that it is too complex but that it is not tight enough to prevent all false positive matches. The current parser is not that simple to maintain, and it does not recognize things it should:
It also recognizes things it shouldn't - which creates a risk of collision with other syntax and makes typos harder to find:
So it seems to me that there are also maintenance and complexity costs with option #1. I mean, Polymer is great as it is. You sure don't run into these problems every day and there are workarounds in pretty much all cases, but still, trying to match an airtight syntax does not seem that expensive to me:
Improving and testing each individual sub-expression is still easy. If this "one regular expression to rule them all" approach doesn't work (it runs, but I haven't tested the speed or the compatibility), another option is to use the RE as a poor man's lexer that splits the input string into a stream of tokens as soon as something that looks like a binding is detected. The trick would be to bail out and leave the HTML untouched in case of a parser error. Ambiguities between JSON and Polymer syntax are rare and wouldn't take long to resolve anyway.
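The "poor man's lexer" idea can be sketched roughly as follows. This is an illustration under stated assumptions, not code from the thread or from Polymer: `TOKEN`, `EXPRESSION`, `tokenize` and `parseBindings` are hypothetical names, and the accepted expression syntax (optional `!`, a dotted path, an optional call) is a guess at what "airtight" might mean.

```javascript
// Hypothetical sketch; none of these names exist in Polymer.
// Tokens: binding anchors, runs of ordinary text, or a single stray bracket.
var TOKEN = /\{\{|\}\}|\[\[|\]\]|[^{}\[\]]+|[{}\[\]]/g;
// Assumed strict syntax: optional negation, dotted path, optional call.
var EXPRESSION = /^!?\s*[A-Za-z_$][\w$]*(?:\.[A-Za-z_$][\w$]*)*\s*(?:\(.*\))?\s*$/;

function tokenize(text) {
  return text.match(TOKEN) || [];
}

// Returns an array of parts, or null when something that looks like a
// binding cannot be parsed (the caller then leaves the text untouched).
function parseBindings(text) {
  var tokens = tokenize(text);
  var parts = [];
  var literal = '';
  for (var i = 0; i < tokens.length; i++) {
    var open = tokens[i];
    var close = open === '{{' ? '}}' : open === '[[' ? ']]' : null;
    if (!close) { literal += open; continue; }
    // Collect everything up to the matching closing anchor.
    var body = '';
    var j = i + 1;
    while (j < tokens.length && tokens[j] !== close) { body += tokens[j]; j++; }
    if (j === tokens.length || !EXPRESSION.test(body)) {
      return null; // parser error: bail out
    }
    if (literal) { parts.push({ literal: literal }); literal = ''; }
    parts.push({ mode: open[0], value: body.trim() });
    i = j; // skip past the closing anchor
  }
  if (literal) { parts.push({ literal: literal }); }
  return parts;
}

console.log(parseBindings('Hello [[user.name]]!'));
console.log(parseBindings('[[1, 2], [3, 4]]')); // null
```

In this sketch a nested array literal fails the expression check, so the whole string is reported as unparseable and would be passed through as plain text.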
@ebidel Just as a point of order, I want to point out that not disambiguating nested JSON array literals from one-way bindings has been a long-standing issue (#1734) which only temporarily worked for two releases ( That said, I'm reviewing @david-saslawsky's expression for performance impact and whether we can make it work with the existing bespoke string escaping.
I really like the syntax of @david-saslawsky, but keep in mind this is a breaking change regarding the naming of variables. (I think this limitation is the way to go to prevent updating this regex every week.) The consequence would be that the update must become a MAJOR update according to semver.
Sorry if this is not the right place to discuss this. I've made a more complete version of the regex:
Floats and string escaping work. Spaces and newlines are not supported but this shouldn't have much impact on the performance (I would expect that most of the time is spent looking for the This is almost airtight but I can't find a way to handle commas correctly in the argument list.
In the long term, the "Regexp as a lexer" approach seems even more robust and flexible. Maybe something like:
This can parse things impossible to parse with a regexp:
Please let me know if I can be of any help.
@david-saslawsky It might be good to create a PR with your fix so it can be reviewed more easily. I think I have the solution for the argument comma problem.
@david-saslawsky I modified your regex to pass the tests when integrated with Polymer and then ran some rudimentary perf tests against it. While much more correct than the previous one, it highlighted a potential catastrophic backtracking situation due to the repeating quantifiers for In my tests, it now parses faster than the regex on master and pretty close to the parsing speed in v1.1.5 before we added compound bindings, so I'm inclined to go with this solution, despite it being slightly more strict about the identifier characters allowed (I fail to see this as any sort of practical problem). I will put up a PR for this shortly. It also has the happy side effect of once again excluding most nested JSON array literals (due to the stricter expression matching) and makes the google-chart repro work again.
Closing in favor of #3017
As outlined by @david-saslawsky in his comment, another edge case for the regex is:
[YES]/[NO] {{computeCompound('world', 'username ', 'Hello {0} ')}}
Therefore the regex is once again changed (sorry for the multiple PRs) to also cover this edge case. The original test case has been updated to reflect this change.
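As a rough check, the commented expression quoted earlier in this conversation can be run against that input. This is a standalone sketch, not Polymer's test case: `bindingRegex` and `input` are illustrative names, and the character class is written as `[\w\s\-.:$]`, which should be equivalent to the quoted one because `\w` already includes `_`.

```javascript
// Rebuild the expression from the conversation (class simplified, see above).
var ob = '\\{\\{|\\[\\[';
var name = '[\\w\\s\\-.:$]+';
var bindingRegex = new RegExp(
  '((?!' + ob + ').+?)?' +        // m[1]: literal text before the binding
  '(' + ob + ')' +                // m[2]: opening anchor, {{ or [[
  '((?:!)?(?:(?:' + name + '?)' + // m[3]: optional negation, then a name...
  '|(?:' + name + '\\(.*?\\))))' + // ...or a computed function call
  '(?:\\]\\]|\\}\\})',            // closing anchor
  'g');

var input = "[YES]/[NO] {{computeCompound('world', 'username ', 'Hello {0} ')}}";
var m = bindingRegex.exec(input);

console.log(JSON.stringify(m[1])); // "[YES]/[NO] "
console.log(m[2]);                 // {{
console.log(m[3]);                 // computeCompound('world', 'username ', 'Hello {0} ')
```

The single brackets and the `{0}` placeholder end up in the literal prefix and the argument list respectively, so only one binding is reported.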
@kevinpschaaf PTAL