Reconsider using text for private-use and reserved #446

Closed
aphillips opened this issue Jul 31, 2023 · 3 comments
Labels
syntax Issues related with MF Syntax

Comments

@aphillips
Member

Is your feature request related to a problem? Please describe.

I think we should reconsider using text instead of reserved-body so that we can ensure the opacity of private-use and reserved. This would also require modifying quoted literals to escape { and }. This comes from a discussion in #444 about the data model, and also from earlier discussion.

Here are my comments from 444:

I think this describes a bug in the syntax, where we were "clever" to smuggle in the quoted production and the whitespace production. I think the argument was something like: "if we unreserved a sigil, then it would just parse correctly". But I think we're better off if reserved and private are opaque to the ABNF. An unreserved sigil is subject to whatever ABNF is applied to it (either the existing annotation syntax or new syntax). Pre-unreserved implementations still won't parse into the sigil's space, so won't be affected.

In other words, I think we should modify the ABNF thusly:

private-use    = private-start text
private-start  = "^" / "&"

; reserve additional sigils for use by 
; future versions of this specification
reserved       = reserved-start text
reserved-start = "!" / "@" / "#" / "%" / "*" / "<" / ">" / "/" / "?" / "~"

The only escapes in text are around { and } (and \ in case one needs \} as a character sequence). Brackets retain syntactic meaning everywhere. The current reserved-body approach means that implementations parse reserved and private-use into word tokens and literals, even though the implementation is not allowed to interpret them. I think true opacity is the right approach. (In that case (1), (2), (3), and (4) are different character sequences).
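As a rough illustration of what opacity would mean for an implementation, here is a minimal Python sketch. It is not taken from the spec or from any existing implementation; the function name `read_opaque_body` and its error messages are hypothetical. It assumes the proposed rule that a private-use or reserved body is `text`, where the only escapes are `\\`, `\{` and `\}`, and it returns the body as a raw, uninterpreted string.

```python
from typing import Tuple

# Characters that may follow a backslash under the proposed `text` rule.
ESCAPABLE = {"\\", "{", "}"}


def read_opaque_body(source: str, pos: int) -> Tuple[str, int]:
    """Scan from `pos` (just after the sigil) to the closing '}'.

    Returns the raw body and the index of the closing brace. The body is
    never tokenized into words or literals; it stays one opaque string.
    """
    start = pos
    while pos < len(source):
        ch = source[pos]
        if ch == "\\":
            # An escape must be followed by one of the escapable characters.
            if pos + 1 >= len(source) or source[pos + 1] not in ESCAPABLE:
                raise ValueError(f"bad escape at index {pos}")
            pos += 2
        elif ch == "{":
            # Brackets retain syntactic meaning everywhere, so this is an error.
            raise ValueError(f"unescaped '{{' at index {pos}")
        elif ch == "}":
            return source[start:pos], pos
        else:
            pos += 1
    raise ValueError("unterminated expression")
```

Because nothing inside the body is parsed, two bodies that differ only in quoting or whitespace stay distinct character sequences, which is the behavior argued for above.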

(and, yeah, we've had this discussion before in #374)

Your argument there was:

Currently this is a valid expression:

{:foo key=|{bar}|}

If we were to unreserve @ and try to give it the same semantics as with :, this would be a parse error due to the unescaped inner {:

{@foo key=|{bar}|}

The problem then is: quoted allows { and } unescaped (because they are "quoted"). Adding these characters to the escaped list for quoted simplifies the ABNF a tiny amount (quoted-escape and reserve-escape are the same and we lose some productions) and lets implementations parse out expressions by matching (unescaped) {/} pairs. The cost is that {/} must be escaped in a literal.
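To make the "matching (unescaped) {/} pairs" point concrete, here is a small Python sketch under stated assumptions. It models only a flat run of text and expressions (the enclosing pattern braces and declarations are out of scope), it assumes { and } are always escaped inside quoted literals as proposed, and the name `split_spans` is hypothetical rather than part of any implementation.

```python
from typing import List, Tuple


def split_spans(source: str) -> List[Tuple[str, str]]:
    """Split `source` into ("text", ...) and ("expression", ...) spans
    by looking only at unescaped braces, never at expression contents."""
    parts: List[Tuple[str, str]] = []
    buf: List[str] = []  # characters of the span being collected
    in_expr = False
    i = 0
    while i < len(source):
        ch = source[i]
        if ch == "\\":
            # Keep the escape sequence verbatim; only \\, \{ and \} are valid.
            if i + 1 >= len(source) or source[i + 1] not in "\\{}":
                raise ValueError(f"bad escape at index {i}")
            buf.append(source[i:i + 2])
            i += 2
            continue
        if ch == "{":
            if in_expr:
                raise ValueError(f"unescaped '{{' inside expression at index {i}")
            if buf:
                parts.append(("text", "".join(buf)))
            buf = []
            in_expr = True
        elif ch == "}":
            if not in_expr:
                raise ValueError(f"unmatched '}}' at index {i}")
            parts.append(("expression", "".join(buf)))
            buf = []
            in_expr = False
        else:
            buf.append(ch)
        i += 1
    if in_expr:
        raise ValueError("unterminated expression")
    if buf:
        parts.append(("text", "".join(buf)))
    return parts
```

With this rule, `{@foo key=|\{bar\}|}` is found as a single expression without knowing anything about the @ sigil, while the unescaped form `{@foo key=|{bar}|}` is rejected at the inner {.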

While fewer escapes is better than more escapes in quoted literals, I don't think that is very onerous compared to having private-use (reserved scares me less, as we might never use it) be "semi-parsed". I was convinced before that the ABNF would be okay using what we have now because the character sequences could be squeezed in (that I wouldn't actually have to parse the contents). But the data model shows why I was shy in the first place: private-use and reserved turn out to have parsed structure where we want opacity.

@aphillips aphillips added the syntax Issues related with MF Syntax label Jul 31, 2023
@aphillips aphillips added the Agenda+ Requested for upcoming teleconference label Aug 8, 2023
@aphillips
Member Author

Discussed in 2023-08-21 call. Next step is PR.

@aphillips
Member Author

Note this discussion called out by @eemeli in the duplicate issue: #421 (comment)

@aphillips aphillips removed the Agenda+ Requested for upcoming teleconference label Sep 24, 2023
@aphillips
Member Author

No longer relevant
