Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AST is too noisy #39

Closed
calebdw opened this issue Nov 30, 2023 · 5 comments
Closed

AST is too noisy #39

calebdw opened this issue Nov 30, 2023 · 5 comments
Assignees
Labels
enhancement New feature or request
Milestone

Comments

@calebdw
Copy link
Contributor

calebdw commented Nov 30, 2023

Hello!

Please see below screencast---the ast is too noisy in that there's too many php_only nodes. This is likely an artifact of the regex parsing, but there should only be one php_node for that entire block not 16.

Screencast.from.11-30-2023.09.38.29.AM.webm

Thanks!

@EmranMR
Copy link
Owner

EmranMR commented Nov 30, 2023

Hi @calebdw ! Yes I am aware, unfortunately that was the only way to get the grammar working.
I genuinely don't know any work around to fix this. As you correctly pointed it is due to the Regex parsing, but more specifically the token(prec()) functions. But I presumed the noise should actually be hidden away when we turn on your split_parser. Please have a look below. So for example when you turn on the php_only injection, all those should be replaced with correct php injection AST?
If anyone can come up with a better solution more than happy to discuss, I am not even sure if the external scanners could help? Spent genuinely months to come up with a better solution and this was the best I could come up with. 😬

So the reason behind this

  • Because of the complex nature of the blade, the endless injections, and of course the need to correctly distinguish characters, I had to come up with lexical precedence mixed with Regex
  • So all the caught php_only you see are the odd brackets or punctuations that the grammar knows for certain have no special meaning but to be as a text.

Here is an example:

  • So the parser hits an orphan (, the problem arises that, it needs to know whether it is part of a directive @vite( or php or just random html text. This gets very complicated when you have a function call inside a blade directive for example. Same goes for ", ', { and so forth
@if ($attributes->hasAny(['href', ':href', 'v-bind:href']))
    <div>One of the attributes is present</div>
@endif

or

<button {{ $attributes->merge(['type' => 'button']) }}>
    {{ $slot }}
</button>

or even worse, you have a function call inside as a directive parameter that calls another function. recursively! 😬


Now, I have a quick question, I have been using your old split_php parser, so that the editor injects php, and because everything is correctly picked up by the tree-sitter-blade and injected as php I do not see any noise in Nova. Basically let us say that you turn on the injection in Nvim, does that sort your AST? Nova does not have AST but here is how it looks there 🤔 Basically as soon as the php_only injection is turned on, all those php_only get replaced by their correct php definition.

Screen.Recording.2023-11-30.at.16.42.51.mov

@calebdw
Copy link
Contributor Author

calebdw commented Nov 30, 2023

@EmranMR,

You might try playing around with left/right associativity---that might combine the nodes into each other. This can be solved with an external parser, but it would be nice to come up with a solution that doesn't require it.

As for the php_only injection both sets of nodes are in the AST (the php_only nodes and the new nodes from the injection).

While Nova might not show you the AST, I can see from your video that you have many php_only nodes as well.

@amaanq is a treesitter guru, perhaps he would be kind enough to take a look and give some suggested improvements?

@EmranMR EmranMR added the bug Something isn't working label Dec 2, 2023
@EmranMR EmranMR self-assigned this Dec 2, 2023
@EmranMR EmranMR added this to the v0.9.0 milestone Dec 2, 2023
@EmranMR EmranMR added enhancement New feature or request and removed bug Something isn't working labels Dec 2, 2023
EmranMR added a commit that referenced this issue Dec 2, 2023
@EmranMR
Copy link
Owner

EmranMR commented Dec 2, 2023

Hi @calebdw, I had a bit of an epiphany and I think I managed to fix the noise 🤞, with minimal collateral damage, could you please test the branch #39.

The php-only you saw in my Nova inspector is the injected, scrambled syntax that I compiled using your split_parser for Nova. So yes you are correct Nova does not provide AST, it just shows the end result!

However I parsed your code excerpt using the tree-sitter cli and here are the results, which I presume should be the same as AST?

Before

Screenshot 2023-12-02 at 19 59 54

After

Screenshot 2023-12-02 at 20 01 04

If all good on your side, please do let me know and I will go ahead and release v.0.9.0! 🤞

@calebdw
Copy link
Contributor Author

calebdw commented Dec 2, 2023

@EmranMR,

The php_only node looks much better, good job! I noticed though that the text node could use the same improvements:

image

@EmranMR EmranMR closed this as completed in 8353cbb Dec 3, 2023
@EmranMR
Copy link
Owner

EmranMR commented Dec 3, 2023

@calebdw All done, the AST should now be as clean as it can be!

Thanks for bringing it all to my attention, esp the text inside the <script>! (that was an interesting scenario I did not take into account surprisingly)

They should all now be parsed as one node be it $.text or $.php_only. For once the solutions were simpler than I anticipated thankfully! 😅

Before

Screenshot 2023-12-03 at 10 56 43

After

Screenshot 2023-12-03 at 11 00 21

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants