Define the behavior of backslash #22
Comments
Fix projectfluent#12, projectfluent#17, projectfluent#18. With this change, the entire body of a message needs to be indented. This makes error recovery very easy: finding the next message definition is as simple as finding the next identifier with no indentation. It also opens up a number of opportunities: we can remove the `|` syntax for multiline blocks of text and allow line breaks inside of placeables safely. The PR also allows the value to be defined on a new line, making the following examples equivalent: lipsum = Lorem ipsum dolor sit amet, consectetur adipiscing elit. Morbi pellentesque congue metus, non mattis sem faucibus sit amet. lipsum = Lorem ipsum dolor sit amet, consectetur adipiscing elit. Morbi pellentesque congue metus, non mattis sem faucibus sit amet. I hope this will help when attributes are present: lipsum = Lorem ipsum dolor sit amet, consectetur adipiscing elit. Morbi pellentesque congue metus, non mattis sem faucibus sit amet. .attr = Attribute Lastly, quoted patterns are only available inside of placeables and cannot be used directly as values. The exact semantics of \ escapes will be defined in projectfluent#22.
Fix projectfluent#12, projectfluent#17, projectfluent#18. With this change, the entire body of a message must be indented. This makes error recovery very easy: finding the next message definition is as simple as finding the next identifier with no indentation. It also opens up a number of opportunities: we can remove the `|` syntax for multiline blocks of text and allow line breaks inside of placeables safely. The change also allows the value to be defined on a new line, making the following examples equivalent: lipsum = Lorem ipsum dolor sit amet, consectetur adipiscing elit. Morbi pellentesque congue metus, non mattis sem faucibus sit amet. lipsum = Lorem ipsum dolor sit amet, consectetur adipiscing elit. Morbi pellentesque congue metus, non mattis sem faucibus sit amet. Lastly, quoted patterns are only available inside of placeables, cannot contain other placeables and cannot be used directly as values. The exact semantics of \ escapes will be defined in projectfluent#22.
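To illustrate the error-recovery claim in the commit message above, here is a minimal Python sketch of "find the next identifier with no indentation". The identifier pattern and the name `next_entry_line` are assumptions made for illustration; they are not taken from the Fluent grammar or from any existing parser.

```python
import re

# Assumption: an identifier starts with an ASCII letter and may continue
# with letters, digits, "_" or "-". The real grammar may differ.
ENTRY_START = re.compile(r"^[A-Za-z][A-Za-z0-9_-]*")


def next_entry_line(lines, start):
    """Return the index of the first line at or after `start` that begins
    with an unindented identifier, or None if there is no such line."""
    for i in range(start, len(lines)):
        if ENTRY_START.match(lines[i]):
            return i
    return None


source = [
    "lipsum = Lorem ipsum",
    "    dolor sit amet {",    # pretend the parser hit an error here
    "    .attr = Attribute",
    "next = Next message",
]

# Recover from the error on line 1 by skipping every indented line.
print(next_entry_line(source, 1))
```

Because every continuation line is indented, the recovery step never needs to understand the broken message body; it only looks at the first column.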
A draft proposal:
@Pike, @zbraniecki — I'd love to hear your thoughts on this. Thanks! |
sgtm! I'd not do |
I woke up this morning and I had another idea: what if we tried to use the Firstly, let's talk about the backslash. In a more extreme version of the proposal, it can (a) become a regular literal character. Or, it could (b) escape any character to itself, taking it out of the parse flow.
Special characters occurring in So the question boils down to: how much do we want to limit the |
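The two readings of the backslash mentioned in this comment, (a) a regular literal character and (b) an escape that takes the next character out of the parse flow, can be compared with a small Python sketch. The function name `scan`, the `option` parameter and the treatment of a trailing backslash are assumptions for illustration only.

```python
def scan(text, option):
    """Return (char, escaped) pairs under one of the two proposed readings.

    option "a": the backslash is an ordinary literal character.
    option "b": the backslash escapes the next character to itself.
    """
    pairs = []
    i = 0
    while i < len(text):
        ch = text[i]
        if option == "b" and ch == "\\" and i + 1 < len(text):
            # Keep the escaped character but flag it, so a parser would not
            # treat it as syntax (for example "{" opening a placeable).
            pairs.append((text[i + 1], True))
            i += 2
        else:
            # Assumption: a backslash at the very end of the input has
            # nothing to escape and is kept as a literal backslash.
            pairs.append((ch, False))
            i += 1
    return pairs


print(scan(r"a\{b", "a"))
print(scan(r"a\{b", "b"))
```

Under (a) the `{` is still visible to the parser as syntax; under (b) it arrives flagged as plain text and the backslash itself disappears from the value.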
After more thought I'd like to go back to the first proposal and also make it simpler.
@Pike, @zbraniecki - mind taking another look at this, please? |
I'm not sure if doing For unicode escapes, I've just toyed around with the unicode hex keyboard on the mac. Interestingly, you need to enter surrogate pairs to get to 𝌆, 8 keystrokes away. I wonder if @flodolo or @TheoChevalier have opinions on this as people that actually have to type that unicode stuff. Apart from that, the latest proposal sounds fine to me. |
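For reference, the surrogate pair behind those 8 keystrokes can be checked with a few lines of Python. This is only a sketch of the UTF-16 arithmetic; the helper name `utf16_code_units` is made up for this example.

```python
def utf16_code_units(char):
    """Return the UTF-16 code units of `char` as 4-digit hex strings."""
    data = char.encode("utf-16-be")
    return [
        f"{int.from_bytes(data[i:i + 2], 'big'):04X}"
        for i in range(0, len(data), 2)
    ]


# U+1D306 is the character mentioned above; it lies outside the BMP,
# so UTF-16 needs two code units (a surrogate pair) to represent it.
print(utf16_code_units("\U0001D306"))  # ['D834', 'DF06']
print(utf16_code_units(" "))           # ['0020']
```

A `\u` escape limited to four hex digits would therefore need two escapes to reach such a character, unless the syntax also offers a longer form.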
I don’t think not being able to use |
I might have more questions than answers…
Maybe dropping the string altogether with an error is a better option. |
New-lines are supported by the syntax natively. You don't need to escape them, just write them as normal:
That would be only necessary if
What is the advantage of writing |
Uh, forgot that multiline strings have a different syntax. So, that's not a problem.
Aren't we asking too much of localizers working on these files, then? I know we would prefer them to use tools, where this would be automated and transparent, but it seems to add a lot of complexity.
None, but my understanding is that we're considering the case where someone wrote the string assuming |
Some editors define |
Last week we met in person and briefly discussed these issues with @Pike, @zbraniecki and @flodolo. Here are the key take-aways from that conversation:
With that in mind, I'd like to suggest a minimal specification for our current purposes.
|
@stasm is there anything left in this issue? |
No, I forgot to close this issue. And to tag Syntax Spec 0.3 back in April. D'oh. Thanks. |
In #12 (comment) I said we'd need to define the exact behavior of the backslash character `\` for the purposes of escaping. This includes defining:

- the list of known escape sequences (`\ ` (a space), `\t`, `\n`, `\*`, `\[`, `\{`, `\u`, `\\`, others?),
- how the Unicode escapes work: is `\u20` valid and the same as `\u0020`?
- the behavior of unknown sequences, like `\a` (does the backslash take the following character out of the syntax parsing?),
- the behavior for edge-cases, like:
  - Is that a syntax error? If not, what is the name of the identifier?
  - Is that an escaped new-line?
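The `\u20` versus `\u0020` question can be made concrete with a short Python sketch. It compares a strict reading (exactly four hex digits) with a lenient one (one to four digits). Both readings, the function name `decode_unicode_escapes`, and the choice to leave non-matching sequences untouched are assumptions for illustration; the issue does not settle any of them.

```python
import re


def decode_unicode_escapes(text, strict=True):
    # strict: a backslash, "u", then exactly four hex digits.
    # lenient: a backslash, "u", then one to four hex digits (greedy).
    digits = "{4}" if strict else "{1,4}"
    pattern = r"\\u([0-9a-fA-F]" + digits + ")"
    # Assumption: anything that does not match is left as written.
    return re.sub(pattern, lambda m: chr(int(m.group(1), 16)), text)


sample = r"a\u20b"
print(repr(decode_unicode_escapes(r"a\u0020b")))          # 'a b'
print(repr(decode_unicode_escapes(sample, strict=True)))   # unchanged
print(repr(decode_unicode_escapes(sample, strict=False)))  # "20b" read as U+020B
```

The last call shows the cost of the lenient form: the letter `b` after `\u20` is itself a hex digit, so the escape swallows it. A fixed length avoids that ambiguity at the price of more typing.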