Document valid identifiers for patterns #247

Open
gijsk opened this issue Mar 2, 2019 · 13 comments

@gijsk

gijsk commented Mar 2, 2019

Apparently 1-foo is not a valid identifier, but foo-1 is. I have no idea why, or what the restrictions are, because somehow https://projectfluent.org/fluent/guide/hello.html and https://projectfluent.org/fluent/guide/text.html do not mention what the syntax is for an identifier, and skip straight to patterns, which are always the values associated with those identifiers...

@flodolo
Contributor

flodolo commented Mar 2, 2019

I'm also surprised that identifiers cannot start with a number (I don't think I've ever seen it mentioned in docs?)
https://github.com/projectfluent/fluent/blob/master/spec/fluent.ebnf#L86

@zbraniecki
Collaborator

The reason is of course math, and the fact that we handle numbers. The answer is in https://github.com/projectfluent/fluent/blob/master/spec/fluent.ebnf but we can probably document it more in the guide.
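
For readers who do not want to open the grammar file, here is a minimal sketch of the identifier check in Python. It assumes the rule in the linked EBNF is `Identifier ::= [a-zA-Z] [a-zA-Z0-9_-]*`; the function name `is_valid_identifier` is hypothetical and not part of any Fluent library.

```python
import re

# Assumption: the Identifier rule in fluent.ebnf is [a-zA-Z] [a-zA-Z0-9_-]*
IDENTIFIER = re.compile(r"[a-zA-Z][a-zA-Z0-9_-]*")


def is_valid_identifier(name: str) -> bool:
    """Return True if the whole string matches the assumed Identifier rule."""
    return IDENTIFIER.fullmatch(name) is not None


for candidate in ["foo-1", "1-foo", "_foo", "cancel-button"]:
    print(candidate, is_valid_identifier(candidate))
```

Under that assumption, `foo-1` and `cancel-button` pass, while `1-foo` and `_foo` are rejected because the first character is not an ASCII letter.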

@flodolo
Contributor

flodolo commented Mar 2, 2019

> The reason is of course math, and the fact that we handle numbers.

Sorry but that's not an explanation. Is there a practical reason to forbid identifiers from starting with a number? Also, what does it mean "we handle numbers"? Examples would go a long way.

@zbraniecki
Collaborator

Yes, math. Math is the reason: if you want to support math, you have to support a subtraction operator, which in most cases, including Fluent, is going to be "-". Once you're there, any a-z will be an identifier start, and "[0-9]-[a-z]" will be a subtraction operation between a number and an identifier. We could, technically, try to forbid that and limit ourselves, but I doubt it is the right tradeoff.
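
The ambiguity described here can be sketched with a toy tokenizer in Python. This is only an illustration: Fluent has no subtraction operator in its current syntax, so the `minus` token is hypothetical, and the identifier rule is assumed to be `[a-zA-Z][a-zA-Z0-9_-]*`.

```python
import re

# Hypothetical token rules; the "minus" operator does not exist in Fluent today.
TOKEN = re.compile(
    r"""
    (?P<number>[0-9]+(?:\.[0-9]+)?)          # number literal such as 1 or 3.14
  | (?P<identifier>[a-zA-Z][a-zA-Z0-9_-]*)   # assumed Identifier rule
  | (?P<minus>-)                             # hypothetical subtraction operator
    """,
    re.VERBOSE,
)


def tokenize(source: str) -> list[tuple[str, str]]:
    """Split source into (kind, text) pairs, failing on unknown characters."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = TOKEN.match(source, pos)
        if match is None:
            raise ValueError(f"unexpected character at {pos}: {source[pos]!r}")
        tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


print(tokenize("foo-1"))
print(tokenize("1-foo"))
```

With these rules `foo-1` is consumed as one identifier, whereas `1-foo` splits into a number, a minus sign, and the identifier `foo`. Allowing identifiers to start with a digit would make that second reading ambiguous if subtraction were ever added.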

@flodolo
Contributor

flodolo commented Mar 2, 2019

> if you want to support math, you have to support a subtraction operator

Where and how do we support math in Fluent?

I find it concerning that I'm totally unable to follow your reasoning, and yet I'm not exactly new to Fluent (at least regarding the documentation part, and writing FTL).

@zbraniecki
Collaborator

Umm, I'm not as concerned as you. We can reason and explain our positions and I don't find it concerning.

Math operations and operators are frequently made available for the purpose I provided.

If there's a strong reason for number-initiated identifiers we could, as far as I'm aware, make them possible, but that would go against every programming language and DSL that I'm aware of, which forbid that model for exactly the reason I provided. We don't support math operations in the current syntax but we did in the past and it's reasonable to protect the syntax assuming we may want to support it in the future.

Do you have any strong reason to bring number-initiated identifiers?

@flodolo
Contributor

flodolo commented Mar 2, 2019

> We don't support math operations in the current syntax but we did in the past and it's reasonable to protect the syntax assuming we may want to support it in the future.

As I said, please provide an example, otherwise we're not going anywhere. I can't imagine how math operators fit into Fluent, and I need an example to understand it.

As for "other programming languages don't", I'm not sure Fluent is a programming language, but that could be a reason good enough, as long as it's conscious and documented.

@zbraniecki
Collaborator

key = { SELECTOR(1-key2) -> *[if] Foo [else] faa }

The reason is not just compliance. It's to protect our ability to extend syntax by adding likely operations. We may never need it, but I'd assume one day we will.
And I don't think we've seen a reason to give up that protection.

I agree documenting it better should happen :)

@gijsk
Author

gijsk commented Mar 2, 2019

> Do you have any strong reason to bring number-initiated identifiers?

Compatibility with .properties (and maybe .dtds, I haven't tried).

@gijsk
Author

gijsk commented Mar 2, 2019

Anyway, the documentation here is just non-existent in the fluent guide. It doesn't say what is and isn't supported (doesn't even define an identifier, as far as I can tell, and even https://projectfluent.org/fluent/guide/references.html only gives examples of uses of placeables but doesn't actually specify what things are and aren't allowed (so it's not obvious that I can use { foo.label }, for instance)).

Note also that the set of supported characters based on that EBNF is much smaller than in almost every programming language I know. JS, Python (>=3), Perl, Lisp, C++, and Rust (feature-gated on recent versions) all support Unicode characters for identifiers/variables, as well as (in some cases) other ASCII characters (notably "$"). Prolog doesn't support _ as a variable name because it's special, so its set of allowed variable names is clearly not a superset of the Fluent one... but that's the exception that proves the rule, as it were.

Of course the current arbitrary restriction is the subject of #117, but the fact that it's not documented should be addressed irrespective of all the other discussion.

@gijsk
Author

gijsk commented Mar 2, 2019

> Prolog doesn't support _ as a variable name because it's special, so its set of allowed variable names is clearly not a superset of the Fluent one... but that's the exception that proves the rule, as it were.

Oh, it turns out even this is wrong because Fluent doesn't support identifiers that start with _ either, it seems.

@Pike
Contributor

Pike commented Mar 3, 2019

A few thoughts:

Yes, documentation is lacking, and it is the biggest item of work we have planned for Fluent 1.0. The target of 1.0 will be tooling developers, so even then, usage docs may or may not be in scope. For Mozilla, an overhaul of firefox-source-doc is in order, IMHO; the l10n/intl/l10n docs are saying somewhat conflicting and incomplete things on https://firefox-source-docs.mozilla.org/tools/compare-locales/index.html vs https://firefox-source-docs.mozilla.org/intl/localization.html. They both pretty clearly show their roots.

Also, yes the restrictions on Identifiers are semi-random. They're based on the idea that we can extend the namespace of identifiers if needed. But also on the idea that we might add things like math.

Why is 1234 not an identifier? Because message references and number literals.

1234 = Hello, World
msg = {1234}

msg is actually 1234 because that's a number literal,

fluent/spec/fluent.ebnf

Lines 53 to 56 in 88587eb

InlineExpression ::= StringLiteral
| NumberLiteral
| ReferenceExpression
| inline_placeable
.

So, the widest net we could cast for Identifiers would require that there's at least one non-number character in there, and that . and - as initial characters stay reserved for Attributes and Terms. Otherwise we'd have messages that can be referenced with message references and ones that can't.

Just enforcing a leading char is one way to make it easy to spot what's an identifier and what's not (if it's documented, sic ;-) ).
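
The number-literal point above can be sketched as a small Python classifier for the text inside a placeable. It is a simplification under stated assumptions: only two of the `InlineExpression` alternatives are modelled, the number rule is assumed to be an optional minus, digits, and an optional fraction, and `classify_inline` is a hypothetical helper rather than a real Fluent API.

```python
import re

# Assumed grammar fragments; string literals, terms and variables are omitted.
NUMBER_LITERAL = re.compile(r"-?[0-9]+(?:\.[0-9]+)?")
IDENTIFIER = re.compile(r"[a-zA-Z][a-zA-Z0-9_-]*")


def classify_inline(expr: str) -> str:
    """Classify the text inside a placeable, e.g. the 1234 in {1234}."""
    expr = expr.strip()
    if NUMBER_LITERAL.fullmatch(expr):
        return "NumberLiteral"
    if IDENTIFIER.fullmatch(expr):
        return "MessageReference"
    return "other"


print(classify_inline("1234"))
print(classify_inline("msg"))
```

Here `1234` is classified as a number literal and `msg` as a message reference, so a message that was itself named `1234` could be defined but never referenced from another pattern.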

@zbraniecki
Collaborator

zbraniecki commented Mar 3, 2019

One other item to consider is that identifiers in Fluent have a somewhat special role compared to any other DSL: they play a role in error recovery (as the last resort).
I'm not super excited about how western-centric the selection of characters is (Latin alphabet), but I do like the idea that the identifier is meant to be meaningful for the reader, assuming they are able to recognize the word in the Latin alphabet.
For that reason "cancel-button" is an encouraged identifier while "1-foo" is not. From that perspective, limiting the scope of allowed patterns in an identifier and enforcing a leading character (so, disallowing identifiers such as ___ or 2_-_-_-1) helps the recovery strategy.

Of course such efforts are not complete and one could use the argumentum ad extremum that Latin characters can be used to create meaningless identifiers. The only thing I'm saying is that extending the scope of allowed characters is unlikely to help with that goal, while it will chip into our ability to add features to the language later.

So, from my perspective the arguments for extending the scope of characters are compatibility with .properties and removal of a potential paper cut.
Arguments against are maximizing future language extension options and nudging the culture of simple, recognizable, consistent and meaningful identifiers.
It's a subjective balance of course and I can see how everyone can see it being in a different place.
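
The error-recovery role mentioned above can be sketched roughly as follows. This is not how any particular Fluent implementation is written: `format_message` and the `translations` dictionary are hypothetical, and the sketch only shows the last-resort step of displaying the identifier when no translation is available.

```python
def format_message(messages: dict[str, str], message_id: str) -> str:
    """Return the translation, or the identifier itself as a last resort."""
    return messages.get(message_id, message_id)


translations = {"save-button": "Save"}

print(format_message(translations, "save-button"))    # Save
print(format_message(translations, "cancel-button"))  # cancel-button
```

A fallback such as `cancel-button` still tells the user something about the control, which an identifier like `2_-_-_-1` would not.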

@stasm stasm added the docs label Mar 5, 2019