Use AST #109

pherrymason · 2025-01-26T06:33:59Z

Motivation

Treesitter is used to obtain a structured representation of source code, which has been utilized to extract symbol declarations.

So far, Treesitter has been used to parse the source code into a structured representation: the Concrete Syntax Tree (CST). The CST, also known as the parse tree, contains every element that is part of the text (spaces, brackets, etc.).
Although the CST can be navigated to analyze the code, it carries too much detail about the language's syntax, which makes certain tasks harder.

On the other hand, using an AST (Abstract Syntax Tree) would allow working with a simpler structure that more accurately reflects the program's logic.
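To make the CST/AST difference concrete, here is a minimal sketch in Go. All type and function names (`CSTNode`, `Assign`, `ToAssign`) are hypothetical and only illustrate the idea; they are not the types used in this PR, and Treesitter's real Go bindings expose a different API.

```go
package main

// CSTNode mimics the shape of a concrete syntax tree node: every token,
// including punctuation, shows up as a child. Hypothetical type.
type CSTNode struct {
	Kind     string
	Text     string
	Children []*CSTNode
}

// Assign is an AST node for `left = right`. The "=" and ";" tokens that
// exist in the CST are not represented here.
type Assign struct {
	Left  string
	Right string
}

// ToAssign converts an "assignment" CST node into an Assign. It keeps only
// the identifier children and skips punctuation. It returns false when the
// node does not have the expected shape.
func ToAssign(n *CSTNode) (Assign, bool) {
	if n == nil || n.Kind != "assignment" {
		return Assign{}, false
	}
	var idents []string
	for _, c := range n.Children {
		if c.Kind == "ident" {
			idents = append(idents, c.Text)
		}
	}
	if len(idents) != 2 {
		return Assign{}, false
	}
	return Assign{Left: idents[0], Right: idents[1]}, true
}
```

A CST for `a = b;` has four children (two identifiers, `=` and `;`), while the resulting `Assign` keeps only the two names an analyzer cares about.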

Strategy

Continuing to use Treesitter as a parser simplifies a large part of the work, as there is currently no other ready-to-use parser available.
One option is to convert the CST generated by Treesitter into an AST.
Subsequently, by using the Visitor pattern, it would be possible to implement different analyzers:

Symbol table generator
Diagnostics
Etc.
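The Visitor-based analyzers listed above could look roughly like the following sketch. It assumes a tiny, hypothetical AST (`File`, `FuncDecl`, `VarDecl`) and a flat symbol table without scopes; the node types and visitor interface in this PR may well differ.

```go
package main

// Node is a hypothetical AST node interface.
type Node interface {
	Accept(v Visitor)
}

// Visitor receives one callback per node kind. Each analyzer (symbol table
// generator, diagnostics, ...) would be a separate implementation.
type Visitor interface {
	VisitFile(f *File)
	VisitFuncDecl(d *FuncDecl)
	VisitVarDecl(d *VarDecl)
}

type File struct {
	Decls []Node
}

type FuncDecl struct {
	Name string
	Body []Node
}

type VarDecl struct {
	Name string
	Type string
}

func (f *File) Accept(v Visitor)     { v.VisitFile(f) }
func (d *FuncDecl) Accept(v Visitor) { v.VisitFuncDecl(d) }
func (d *VarDecl) Accept(v Visitor)  { v.VisitVarDecl(d) }

// SymbolCollector is one possible analyzer: it records every declared name
// together with its kind. Scoping is ignored to keep the sketch short.
type SymbolCollector struct {
	Symbols map[string]string // name -> kind
}

func NewSymbolCollector() *SymbolCollector {
	return &SymbolCollector{Symbols: map[string]string{}}
}

func (c *SymbolCollector) VisitFile(f *File) {
	for _, d := range f.Decls {
		d.Accept(c)
	}
}

func (c *SymbolCollector) VisitFuncDecl(d *FuncDecl) {
	c.Symbols[d.Name] = "function"
	for _, n := range d.Body {
		n.Accept(c)
	}
}

func (c *SymbolCollector) VisitVarDecl(d *VarDecl) {
	c.Symbols[d.Name] = "variable"
}
```

Calling `file.Accept(NewSymbolCollector())` walks the whole tree. A diagnostics pass would implement the same `Visitor` interface without touching the node types.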

TODO


Do not ignore ERROR treesitter nodes. This reduces scenarios where incomplete source code cannot be located by cursor position while still being able to get that text content. Refactor analysis code + tests. Extract astContext to hold information about node under cursor being analyzed. Protect against variable GenDecl not having type information.


…1b4b. Grab position context by reading string under cursor instead of finding the node, as this is not always reliable due to ERROR nodes. Fix completion by announcing methods kind as methods instead of plain functions. Fix parsing access_ident.

… are no parse errors. Use string analysis when cursor is inside parse error. Completion: Improve tests on enum values and methods. Completion: solveXAtSelectorExpr does not resolve iden type so completion can decide if child symbols can be included in result or not. Drop fromPosition and use Location instead.


pherrymason · 2025-02-11T17:06:36Z

Some notes @PgBiel
I'm finding some of your changes difficult to port.
For example, these commits: #aadb305 and #aaa629.

These assume that the cursor context is retrieved by reading the string under the cursor.
However, this new version tries to get the AST node under the cursor instead.

The issue here is that these cases are actually invalid, as they are not allowed by the grammar, and the AST generated is undefined (and might be flagged with Error).

Notes to discuss:

Should we fall back to the old string analysis if the node under the cursor is flagged as Error?
Should we disable the functionality if the node under the cursor is flagged as Error?
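As an illustration of the first option, the sketch below falls back to scanning the raw text when the node under the cursor is an ERROR node. `NodeInfo`, `AccessPathAt` and `CursorContext` are hypothetical names invented for this example, and the scan only understands ASCII identifiers and dots.

```go
package main

// NodeInfo is a hypothetical summary of the AST node found under the cursor.
type NodeInfo struct {
	Kind string // e.g. "ident", "ERROR"
	Text string
}

// isIdentByte reports whether b can be part of an ASCII identifier.
func isIdentByte(b byte) bool {
	return b == '_' ||
		(b >= 'a' && b <= 'z') ||
		(b >= 'A' && b <= 'Z') ||
		(b >= '0' && b <= '9')
}

// AccessPathAt returns the dotted access path that ends at byte offset
// cursor, found by scanning backwards over identifier bytes and dots.
func AccessPathAt(src string, cursor int) string {
	if cursor > len(src) {
		cursor = len(src)
	}
	if cursor < 0 {
		cursor = 0
	}
	start := cursor
	for start > 0 && (isIdentByte(src[start-1]) || src[start-1] == '.') {
		start--
	}
	return src[start:cursor]
}

// CursorContext trusts the AST node when it is usable and falls back to
// string analysis when there is no node or the node is flagged as ERROR.
func CursorContext(node *NodeInfo, src string, cursor int) string {
	if node == nil || node.Kind == "ERROR" {
		return AccessPathAt(src, cursor)
	}
	return node.Text
}
```

With the source `status = WindowStatus.BACKGROUND.` and the cursor at the end, the fallback yields `WindowStatus.BACKGROUND.`, which still carries the chain needed to resolve completions.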

PgBiel · 2025-02-11T18:48:19Z

I haven't had the time to look at your code yet, but do you mean that Abc.VALUE. would be invalid? I'm pretty sure not, since associated values are accessible. Or perhaps you mean that you can't run completion because the extra dot at the end breaks the parser / AST node generation?

Assuming it's not the last option, my code, at those commits, basically just had some extra logic to not suggest enum members / fault constants after variables. Code like MyEnum.VALUE.VALUE is obviously invalid (and probably doesn't parse) but we don't know that until it is typed out, so being smart with the completions before suggesting them is necessary.
With the AST you could still keep some state indicating that you're searching for completions from a variable and not from the top-level enum, unless I'm missing something.
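One way to keep that state, sketched in Go under assumed names (`ParentKind`, `Candidate`, `FilterCompletions` are all hypothetical): record whether the left side of the selector resolved to the enum type itself or to a value or variable of that type, and drop enum members and fault constants in the second case.

```go
package main

// SymbolKind classifies a completion candidate. Hypothetical enumeration.
type SymbolKind int

const (
	EnumValue SymbolKind = iota
	FaultConstant
	Method
	AssociatedValue
)

// ParentKind records what the selector's left side resolved to.
type ParentKind int

const (
	ParentType     ParentKind = iota // e.g. `MyEnum.`
	ParentInstance                   // e.g. `MyEnum.VALUE.` or `myVar.`
)

type Candidate struct {
	Name string
	Kind SymbolKind
}

// FilterCompletions drops enum members and fault constants when completing
// after an instance, so chains such as MyEnum.VALUE.VALUE are never offered.
// When completing after the type itself, everything is kept (a simplifying
// assumption of this sketch).
func FilterCompletions(parent ParentKind, all []Candidate) []Candidate {
	var out []Candidate
	for _, c := range all {
		if parent == ParentInstance && (c.Kind == EnumValue || c.Kind == FaultConstant) {
			continue
		}
		out = append(out, c)
	}
	return out
}
```

The filter itself is trivial; the useful part is that the analyzer carries `ParentKind` along while resolving the selector chain, so the decision is made before suggestions are shown.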

pherrymason · 2025-02-12T07:18:53Z

> I haven't had the time to look at your code yet,
Don't worry, there is no commitment. This comment was more of a note for future analysis and just a discussion.

Yes, Abc.VALUE. is semi-valid, but some other scenarios are not. For example:

fn void main() {
    status = WindowStatus.BACKGROUND.MINIMIZED;
}

This is not valid, but Treesitter still tries to parse it. In the resulting tree, an ERROR node contains the source WindowStatus.BACKGROUND. and the right node contains MINIMIZED.
This makes it hard to know what MINIMIZED relates to: reading the right node yields an incomplete ident, and the ERROR node just stays there with no clear indication of what it actually is.

This treesitter tree translates to my AST C3 Tree:

AssignmentExpression{
    Left:  {Ident: "status"},
    Right: {Ident: "MINIMIZED"},
}

This affects the analysis done for GoTo and Completion operations: when analyzing the Right node, I find an incomplete identifier and cannot tell that it is a descendant of a SelectorExpression.

I could obviously store the ERROR node inside AssignmentExpression, but I should also store if it comes after or before the = to help the analyzer to interpret it, but this is just an example and the combinations where the ERROR can appear are infinite.

I'm just exposing the difficulties I'm encountering :D

PgBiel · 2025-02-12T19:33:01Z

> Don't worry, there is no commitment. This comment was more of a note for future analysis and just a discussion.
> ...
> I'm just exposing the difficulties I'm encountering :D

Oh don't worry, I was just trying to contextualize my message, i.e. it's possible that the answer to my question was obvious from the code (and sorry if so), I just didn't have the time to take a proper look at it yet. Didn't mean to imply that you were applying too much pressure, or something similar - expressing tone in online text is hard. Sorry! :)

Feel free to send further questions, I'll always try to answer them.

> I could obviously store the ERROR node inside AssignmentExpression, but I should also store if it comes after or before the = to help the analyzer to interpret it, but this is just an example and the combinations where the ERROR can appear are infinite.

Ah I see the problem.

If I'm going to be entirely fair, that particular case is evidently pathological since the syntax is invalid, but it still would be nice to be able to properly recover from those weirdnesses.

I think we could blame the tree-sitter grammar here. Maybe it would be nice if they could allow any invalid ident casing anywhere, using a specific node for this, e.g. invalid_ident.

pherrymason · 2025-02-13T08:06:09Z

I guess I will still need to do string analysis at the cursor position just to avoid these kinds of issues...

PgBiel · 2025-02-13T14:23:55Z

Not sure if that's strictly necessary... I see two options:

Request a change in the original treesitter grammar for extra flexibility;
Keep a fork of it with the changes we need. This might be necessary anyway in the near future.


pherrymason marked this pull request as draft January 26, 2025 17:36

pherrymason and others added 29 commits January 28, 2025 07:13

converting var declaration initialization

6f7ed6f

WIP CST to AST converter: VarDecl support initializing to more expres…

09c3aa0

…sions.

Simplify test compile_time_call

174abd5

Fix Import tests.

c0a5594

Build job depends on test.

bbe9bca

Fix import test.

261f945

testing running job manually

5d33331

Tweak

0926972

Fix compile_time_call test.

0e9e317

Convert compile time functions.

0e1647a

Convert lambda declaration.

Tweak Lambda body definition.

d479668

Extract convert_function_parameter_list

5e3da93

Convert lambda parameters.

70cc733

Convert assignment expressions

bc97b09

test for assignment expression with ct_type_ident

3d241b6

Fix tests after updating Treesitter.

3e4af51

Complete ternary_expression test

WIP Node rule system to define more easily those nodes that allow dif…

013f66d

…ferent choices and/or a sequence of nodes.

Complete lambda_expr test.

affcd75

Convert elvis or else.

0f217d1

Implement convert optional expr. Still missing test

cafce0d

Fix convert_optional_expr and add a test for it.

26b564c

Fix parser.typeNodeToType

918e4c7

Fix convert_optional_expr when single ? is used.

fad60c5

Implement convert_unary_expression

cabb9d7

Fix test

858265c

Implement convert_cast_expression

9ce6227

WIP convert_call_expr

9ced021

Finish convert_rethrow_expr

563eca3

store language keywords in a set instead of a slice

c6d427c

Signed-off-by: Nikita Pivkin <nikita.pivkin@smartforce.io>

pherrymason added 14 commits February 6, 2025 19:32

Add tests for autocompleting deep structs chain

f966af8

Fix doc comment being copied to following declarations.

e367f09

Port add completion test for macros (#a2f3a99eba416edb7a841a1860049db…

a603437

…ef27f9d80)

Try to extract more detailed information from an ErrorNode

e91110e

Differentiate struct members from enum values in symbol table.

e1f84b1

WIP autocompleting Enums

6099a2a

Implement autocompleting enumerables.

38aec56

Implement completing defs.

d5495a7

Remove log print.

c5c1cdf

Flag fault constants as ast.FAULT_CONSTANT

53f29a0

Add tests for autocompleting fault constants and its methods.

Port fix parsing enums with associated values but not backing type (#…

401602f

…14dc06aaec0ca109f9cc7a77484345a0e64118a6)

pherrymason added 10 commits February 23, 2025 17:47

Fix access rules on enum and faults

2b4bbc6

Fix when parent symbol type is Enum.

bd8cfbf

Fix trying to read value from optional when we know there's no value.

83975f4

Refactor solveSelAtSelectorExpr.

143a89a

Simplify solve****SelectorExpr functions

a2b760c

Simplify them and cover all tested cases.

Fix macro description string build.

a531564

Fix parsing enums without param list.

03b3654

Port "add some def-alias" tests (#7cdfbb2148ff453156d6432dfe0862a77d0…

f084b21

…d5a04)

Porting distinct parsing

5b2d9e0

Move AST builders to different module. Do not walk TypeSpec.TypeDescription as an AST node.

Move Symbol to its own module

001c84f
