Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

basically a full rewrite #29

Merged
merged 27 commits into from
Sep 8, 2021
Merged

basically a full rewrite #29

merged 27 commits into from
Sep 8, 2021

Conversation

otommod
Copy link
Contributor

@otommod otommod commented Aug 20, 2021

I think the field Regex can also be used in identifier.

We can now support escape_sequence highlighting.

I've wanted to add destructuring support but I think there's an actual conflict in each. Since it can take any number of bindings before the iterator, tree-sitter cannot distinguish between a multi-value assignment/destructuring and a function call. The same happens with tables, though they cannot be iterators. For example:

(each [i (a b) (my-iterator)] (print a b))

An escaped backspace just before the closing quote would be recognized
as an escaped quote and the rest of your program becomes a giant string.

Fennel on its own doesn't really deal with escape sequences, but the
underlying Lua platform does.  We can't really know its version so we
just support the latest, 5.4.
Numbers can start or end on just a dot (.003, 100.)

Fixing just that would be boring, so here's some new features:
* hexadecimal literals: 0xdeadbeef
* exponents, decimal and hexadecimal: 1e10, 0x1p10
* underscores as numerical separators: 1_000.000_03

The first two are Lua features.  Hexadecimal integer literals were
introduced in Lua 5.1 and hexadecimal float literals in Lua 5.2.  There
have been no further additions up to Lua 5.4.

The last feature is all Fennel and it's pretty insane.  For example,
here are some ways to represent the number 32:
* 0x20
* 0x10p1
* +_0_x_1_0_p_+_1

The last one doesn't even look like a number.  For my own sanity, I
only allow underscores between digits and not directly after the dot,
the exponent "character" (the 'e', 'E', 'p', 'P') or any sign.
Not really broken but there's definitely more characters we can allow.
In fact, we allow everything!  With some exceptions.
I only noticied this by chance: `(foo. bar . baz)` would get parsed as a
single `field_expression`, because of how `seq()`s work.
We can use the same technique as with `field_expression`s to correctly
parse methods and only methods.  We need to set a higher precedence on
the colon specifically to "overpower" `field`
There is no reason for it to be a separate rule.
@otommod
Copy link
Contributor Author

otommod commented Aug 23, 2021

I pushed some more commits. The last 3 are mostly cosmetic and require changes to queries as well (because of removed or renamed rules)

It's maybe getting a bit too long. I could try to split it up into several PRs if you want.

Any quoted special form loses its special meaning, recursively.
Our first conflicts!  The alternative would be to add precendece in all
the binding rules and I don't really want to do that.
@otommod otommod changed the title better number and string parsing and other fixes basically a full rewrite Aug 28, 2021
@otommod otommod force-pushed the fixes branch 2 times, most recently from fb677ad to 63eaf12 Compare August 28, 2021 20:27
@sogaiu
Copy link

sogaiu commented Aug 28, 2021

Have you tried running npx tree-sitter test?

I might have done something wrong but when I try locally I see errors with this PR.

I also tried testing against a collection of fennel code and I get more errors with this PR than I do with the original.

If you're interested in testing against a collection, there is some information about that here: #5

@otommod
Copy link
Contributor Author

otommod commented Aug 28, 2021

I have indeed been running the tests constantly.

I did also just notice that ., : and .. are not handled correctly

";" would get parsed as a string with only its starting quote and then a
comment.  The ending quote would not be found and all the rest of your
program is a string.

I though `token.immediate` was supposed to not allow...
It is still supported and used in the Fennel compiler
They're just symbols.

However, they are symbols with dots in them (at least `.` and `..` are).
Yet, they're not multisyms either.  Indeed, a symbol that begins or ends
on a dot is not a multisym.

I managed to get the starting dots working.  Unfortunately, I can't get
a symbol to only end on a dot without completely breaking
`multi_symbol`.  So, no `?.` for you. :(
If they just take statements they're not really worth their own rule.
@otommod
Copy link
Contributor Author

otommod commented Aug 29, 2021

I tested against aniseed and the Fennel compiler source and I did find some more bugs.

Unfortunately, I couldn't quite get symbols that end on a dot to work (like ?.)

@otommod
Copy link
Contributor Author

otommod commented Sep 3, 2021

@TravonteD ping

@TravonteD
Copy link
Owner

Thanks for the contribution! I'll take some time to read through it

Copy link
Owner

@TravonteD TravonteD left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent work on this! I'd been meaning to do something similar (I had a branch much like this one), but you beat me to it! 👏

(function_call
name: (string))))
(quote
(quoted_sequential_table))
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What was the thinking behind making quoted_<value_type> within the the quote.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So that '[{:foo (fn)} {:bar (each)}] works. The (fn) and the (each) should not be parsed as their respective rules.

test/corpus/regression.txt Show resolved Hide resolved
@TravonteD TravonteD merged commit 4282344 into TravonteD:master Sep 8, 2021
@TravonteD
Copy link
Owner

TravonteD commented Sep 8, 2021

Merged! One thing we may want to do now is document the nodes so that query writers will have a reference

@otommod
Copy link
Contributor Author

otommod commented Sep 8, 2021

Yup. Every query is broken now unfortunately. :(

@otommod
Copy link
Contributor Author

otommod commented Sep 8, 2021

I believe we can now also close #12 and #30

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants