Relax the indentation requirement #87

stasm · 2018-01-30T18:26:34Z

This is a follow-up to #42. I'd like to go further and relax the indentation requirement for everything other than text content. This includes attribute and variant keys as well as expressions.

This doesn't mean that serializers would stop indenting message bodies for readability. We'll still need to define the style guidelines for linting FTL files in #27.

Thanks to #63 this change shouldn't hinder parsers' ability to recover at the next nearest message definition.

A few canonical examples of what would become possible:

multiline-text =
    Lorem ipsum dolor sit amet, consectetur adipiscing elit,
    sed do eiusmod tempor incididunt ut labore et dolore
    magna aliqua.

block-expr = { $num ->
   *[one]
        Lorem ipsum dolor sit amet, consectetur adipiscing elit,
        sed do eiusmod tempor incididunt ut labore et dolore
        magna aliqua.
}

And a few extreme ones (please don't write your FTL this way):

key1 = {
$num ->
*[one]
 One
}

key2 =
.attr = Attribute

key3 = {
key1
}

key4 = {
DATETIME
(
$date,
weekday
:
"short"
)
}

The goal is not to encourage people to write FTL like the above but rather, to accept as much input as possible while being as forgiving as possible.

Pike · 2018-08-03T16:13:10Z

I've looked at the implementation in #159, and I like it.

I like that we can relax the acceptance of the parser without reducing the readability of the spec.

I also agree that we should continue to serialize with indention.

I'd also stick to serializing with trailing-} indent for now. Mostly because that allows us to not immediately push all implementations to production ;-) I'm happy to have a separate conversation about that in a quarter or so, though.

zbraniecki · 2018-08-09T22:00:45Z

I like it.

There's one thing I'm wondering about, both from the parser, and from the human eye perspective and it taps to the long standing debate about how the reader spots new message while scanning a file.

One opinion is that it happens when the reader sees a line starting with an identifier followed by = mark. Another is that it happens when the reader sees a line starting with an identifier.

Let's take a (slightly messy) example. Try to quickly count how many Messages are in the following sample:

this-is-my-key = This is my value
  .this-attribute1 = This is an attribute
this-is-another-key =
  This is another value for
  this-is-my-key and friends
this-is-a-third-key =
  This is a multiline message for
  which always has = after the message id
this-is-a-fourth-key = Yet another message
  with another line of text
this-is-a-fifth-key =
  .this-attribute1 = Yet another attribute

If you're in camp 1, you likely quickly scanned for = signs and backpaddled to see if the = is after an identifier which should be at the beginning of a line.
If you're in camp 2, you just tried to see an identifier character at the beginning of the line, and roughly assumed that it should be a beginning of the message.

I was always in camp 2, and it made me struggle to see the value of enforced = because I didn't use it in examples like that, but for FTL lists that had a lot of messages with just attributes and no value (like this-is-a-fifth-key) it brought noise.

Anyway, the reason I'm bringing it up is that in the current proposal, you relax the indentation rules around a different concept. One, that is not aligned with my human reading patterns despite several months of working with = enforced syntax.

From the extreme examples you provided:

key1 = {
$num ->
*[one]
 One
}

key2 =
.attr = Attribute

key3 = {
key1
}

key4 = {
DATETIME
(
$date,
weekday
:
"short"
)
}

key1 works because the line beginnings are $ and * which I know to be sigils, and the only line that could be confusing, conveniently starts with capital letter and a space. While a single space is not a lot, it's enough that combined with capital letter, I take a deeper look to "parse" the message and confirm that One is not a new key.

key2 works in a similar fashion - I see a . which tells me it's not a new message.

key3 is the one that really gets me. It feels like there's a broken message inside of a message. I understand that lack of = should help me, but it doesn't work at all.

key4 has mostly sigils to start lines except of all-caps DATETIME, which is not confusing mostly because it is all-caps and my brain don't associate message ids to be all-caps. The weekday line gets me again.

So, what did I learn about my processing? I learned that in order to mimic it in the syntax, we could say that a a-z after new line is a new message and forbid it anywhere else.
This means that:

error recovery is trivial. If it's a-z at the beginning of new line, it's a new message.
readability is high for all in camp2, but I assume that camp1 wouldn't have a problem with simpler recognition either
in syntax all we have to forbid is a-z showing up in patterns without indent.

While the logic of this proposal would change, the outcome would change only in a couple extreme cases:

The key3 example would require the variable to be indented
message indentation would only be required if the second+ line of the message starts with a-z.

As for caps A-Z we could either limit Identifiers to not start with them, or add them to the limitation. I haven't seen anyone trying to use caps in identifiers in a long time in any programming language, and since we want to use them in built-ins, I think it would be ok to not use them in ids.

I know my comment is long, but the proposal alteration is not that significant. Just instead of relax the indentation requirement for everything other than text content we'd use relax the indentation requirement for everything except of identifier-start.

wdyt?

stasm · 2018-08-10T08:05:55Z

Thanks for sharing your thoughts, @zbraniecki!

Why do you think that this should be enforced by the grammar, rather than by linting and style guides? I honestly don't suppose anyone will write their FTL the way key4 is written. It's an extreme example of what's possible but definitely not of what's recommended. It's similar to how other languages operate: if you write obfuscated and minified JS by hand, you don't expect people to instantly get it.

We can't protect users from their sense of aesthetics (or the lack thereof). If, as a localizer or developer, you want readable translation files, Fluent should give you every tool you need to do so. But you can be a bit sloppy, too, and things should mostly work, too. That's the Tolerance design principle at work.

we'd use relax the indentation requirement for everything except of identifier-start.

This is much harder to implement in practice. Or at least I'm not sure how to go about it. The way I'm thinking about the relaxation allows the parser to switch into a completely whitespace-insensitive mode as soon as it enters a placeable. The way you propose it, there would still be restrictions that apply depending on what's inside of the placeable.

Also, would you allow {\n-3.14} or require {\n -3.14}?

zbraniecki · 2018-08-10T16:15:55Z

My only point is that error recovery (find the next start of the message) which partially informs the decision here, can (and I argue that it should) be aligned with human readability - because both, the parser and the human are tasked with the same task - "find the start of the message".

And what we're trying to do here is find what we have to forbid in order for that task to be easy/possible.

As I said, I'm not interested in blocking your proposal - just provided an alternative. :) I'm happy to go with your approach.

stasm · 2018-08-10T16:50:21Z

Thanks, @zbraniecki. I went ahead and merged the PR.

stasm added this to the Syntax 0.7 milestone Jan 30, 2018

stasm added the syntax label Mar 26, 2018

Pike mentioned this issue May 14, 2018

Variants parsed as junk if closing brace not indented projectfluent/python-fluent#66

Closed

stasm added syntax and removed syntax labels May 15, 2018

stasm mentioned this issue May 18, 2018

Allowing full-message variants to end without indention #42

Closed

stasm removed this from the Syntax 0.7 milestone May 23, 2018

stasm added the forwards incompatible Old parsers won't parse newer files. label May 23, 2018

This was referenced Aug 1, 2018

Allow indention of SelectorExpression #106

Closed

Only require indentation before TextElements #159

Merged

stasm mentioned this issue Aug 9, 2018

Preserve content indent in multiline Patterns #162

Closed

stasm closed this as completed in #159 Aug 10, 2018

stasm mentioned this issue Aug 22, 2018

Require indent before placeables #170

Closed

This was referenced Oct 19, 2018

Allow selectors to be lists #4

Open

Change named arguments syntax to $name = value #192

Open

snyk-bot mentioned this issue Oct 26, 2019

[Snyk] Upgrade fluent from 0.6.4 to 0.12.0 ajesse11x/send#4

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Relax the indentation requirement #87

Relax the indentation requirement #87

stasm commented Jan 30, 2018

Pike commented Aug 3, 2018

zbraniecki commented Aug 9, 2018

stasm commented Aug 10, 2018

zbraniecki commented Aug 10, 2018

stasm commented Aug 10, 2018

Relax the indentation requirement #87

Relax the indentation requirement #87

Comments

stasm commented Jan 30, 2018

Pike commented Aug 3, 2018

zbraniecki commented Aug 9, 2018

stasm commented Aug 10, 2018

zbraniecki commented Aug 10, 2018

stasm commented Aug 10, 2018