Agda-style Lexing #203

Ericson2314 · 2022-01-23T22:15:46Z

I have been mulling this for a while, but the difficulties in fixing #197 made it feel more urgent.

As a (rare) user of Agda, I have been very fond of its lexing, which seems very simple and more concerned with the boundaries between tokens than with the contents of the tokens themselves. (You can see me singing its praises in, e.g., ghc-proposals/ghc-proposals#444 (comment).)

I have a few questions on this.

1. Do the people implementing Agda agree with this premise, that lexing in Agda is significantly different from and/or simpler than lexing in other languages? Or am I reading too much into it as a user guessing how it works?
2. If the premise is valid (per question 1), is there anything Alex might do to make this style easier, or a more obvious way to do things? I suppose I should study https://github.com/agda/agda/blob/master/src/full/Agda/Syntax/Parser/Lexer.x
3. Should we transition Alex itself to lex more in this style, basically requiring more things to be space-separated?

CC @andreasabel who conveniently works on both Alex and Agda, and @int-index who spearheaded the similar left right lexing context rules for Haskell.

andreasabel · 2022-01-25T09:01:44Z

In Agda, identifiers and operators need to be white-space separated.
The only tokens that need not be white-space separated are (, ), {, }, ; (maybe I am forgetting one).
However, if you are suggesting that Agda is doing some post-processing on tokens to e.g. split 2+3 into 2 + 3, this is not the case, so 2+3 is simply an identifier and has nothing to do with numbers or summation whatsoever.
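
To make the rule above concrete, here is a minimal sketch in Python. It is not Agda's actual lexer (that is the Alex file linked earlier); the names `SPECIAL` and `tokenize` are hypothetical, and the delimiter set is only the five characters listed above, so tokens such as `{{` are not handled.

```python
# Hypothetical sketch of whitespace-delimited lexing; not Agda's real lexer.
# Only these single characters end a token without surrounding whitespace.
SPECIAL = set("(){};")


def tokenize(src: str) -> list[str]:
    """Split src at whitespace and at the special delimiter characters."""
    tokens: list[str] = []
    current: list[str] = []  # characters of the token being accumulated

    def flush() -> None:
        # Emit the accumulated token, if any, and start a new one.
        if current:
            tokens.append("".join(current))
            current.clear()

    for ch in src:
        if ch.isspace():
            flush()              # whitespace ends a token and is dropped
        elif ch in SPECIAL:
            flush()              # a delimiter ends the token before it...
            tokens.append(ch)    # ...and is a token on its own
        else:
            current.append(ch)   # anything else extends the current token
    flush()
    return tokens
```

With this sketch, `tokenize("2+3")` yields the single token `2+3`, while `tokenize("2 + 3")` yields three tokens. No post-processing step ever splits `2+3`.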

Frankly, I do not understand what you are intending here, or how Alex should be changed. At its core Alex implements traditional 1960s style lexing (classic "formal languages and automata" stuff).

Ericson2314 · 2022-01-28T23:44:47Z

@andreasabel Well, for example, does the Agda lexer use copious right contexts to find those whitespace boundaries? The current Alex docs warn that right contexts can make things slow, but I suspect either the warning is overly pessimistic, or the situation can be improved.

andreasabel · 2022-01-30T16:01:56Z

Dunno. Right contexts are used in several places: For comments:
https://github.com/agda/agda/blob/798be60d51a56a8c74cfd309c1498b070240e686/src/full/Agda/Syntax/Parser/Lexer.x#L110-L125
For layout:
https://github.com/agda/agda/blob/798be60d51a56a8c74cfd309c1498b070240e686/src/full/Agda/Syntax/Parser/Lexer.x#L135-L140
Also for: {{
https://github.com/agda/agda/blob/798be60d51a56a8c74cfd309c1498b070240e686/src/full/Agda/Syntax/Parser/Lexer.x#L224-L226

andreasabel added the discussion label Jan 25, 2022
