Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Split SyntaxNode into TreeNode & SyntaxData #193

Merged
merged 5 commits into from
Feb 19, 2023

Conversation

timholy
Copy link
Member

@timholy timholy commented Feb 11, 2023

Closes #192

Pleasantly straightforward. Note that this object is slightly bigger since val has basically become two fields now, one for children and one for the actual value. I hope that's a tolerable increase.

I haven't timed anything to see if there are any regressions. (If constprop works, I think there won't be any.) Is there a task I should use?

Copy link
Member

@c42f c42f left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems pretty reasonable. But I'm curious how much this helps you downstream? It looks like you don't get much functionality with TreeNode{SomeOtherNodeData} as the methods are still specialized to SyntaxNode here.

In terms of benchmarks, there's not really any formal benchmarks at this point. As a rough guide, converting all the way to Expr should be about 5x faster than the existing parser. You could try the following code before and after the changes? (I'll PR this to test/benchmark.jl shortly. I'm not sure what reliable infrastructure we have for benchmarking automatically...)

using BenchmarkTools
using JuliaSyntax

include("test_utils.jl")

function concat_base()
    basedir = joinpath(Sys.BINDIR, "..", "share", "julia", "base")
    io = IOBuffer()
    for f in find_source_in_path(basedir)
        write(io, read(f, String))
        println(io)
    end
    return String(take!(io))
end

all_base_code = concat_base()

b_ParseStream = @benchmark JuliaSyntax.parse!(JuliaSyntax.ParseStream(all_base_code), rule=:toplevel)
b_GreenNode   = @benchmark JuliaSyntax.parseall(JuliaSyntax.GreenNode, all_base_code)
b_SyntaxNode  = @benchmark JuliaSyntax.parseall(JuliaSyntax.SyntaxNode, all_base_code)
b_Expr        = @benchmark JuliaSyntax.parseall(Expr, all_base_code)

@info "Benchmarks" ParseStream=b_ParseStream GreenNode=b_GreenNode SyntaxNode=b_SyntaxNode Expr=b_Expr

@timholy
Copy link
Member Author

timholy commented Feb 12, 2023

But I'm curious how much this helps you downstream?

We might still need to broaden some other methods but I was planning to do that focally as I build out functionality. For now this is all I need, and it seems to clearly pave the way for elegantly handling both typed and untyped syntax trees.

We can wait to merge this if you prefer that I have JuliaDebug/Cthulhu.jl#345 re-implemented. The only thing I really need to know is whether this approach is acceptable, otherwise I'll quit on this approach and start working on a different one. But I'm guessing from your comment above is that you're fine with this, and might simply prefer to take it when there won't be a lot of further changes.

@timholy
Copy link
Member Author

timholy commented Feb 12, 2023

Thanks for the benchmarks; the last three are unchanged, but surprisingly there's a ~30% regression on the first one, which drops to ~15% if I add Base.@constprop :aggressive in front of getproperty. Personally I wouldn't worry about that, but you might, so I thought I'd better raise it.

@c42f
Copy link
Member

c42f commented Feb 14, 2023

there's a ~30% regression on the first one

That would be troubling, but it's very confusing because SyntaxNode shouldn't be involved in that benchmark!

Copy link
Member

@c42f c42f left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you should go ahead and merge this if you think it's the right way forward.

It's surprisingly non-breaking ❤️

@c42f
Copy link
Member

c42f commented Feb 14, 2023

(I would like to understand whether the performance regression is real, it does bother me and it should be impossible. But I don't know if I have space in my head to dig deep into it right now.)

@timholy
Copy link
Member Author

timholy commented Feb 17, 2023

Before I merge:

  • understand the performance regression
  • check test coverage and see if it needs to be improved before it's safe to merge this refactor (the most recent commit has some val -> children changes that arguably should have caused test failures in previous iterations of this PR).

timholy added a commit that referenced this pull request Feb 18, 2023
These two files are particularly affected by recent and upcoming
changes (e.g., #193). This adds a bit more coverage as a guard
against breakage.
timholy added a commit that referenced this pull request Feb 19, 2023
These two files are particularly affected by recent and upcoming
changes (e.g., #193). This adds a bit more coverage as a guard
against breakage.
@timholy
Copy link
Member Author

timholy commented Feb 19, 2023

I'm not quite sure what was happening before, but today when I run the benchmarks there are no signs of trouble.

@codecov
Copy link

codecov bot commented Feb 19, 2023

Codecov Report

Merging #193 (24843d3) into main (9fa5661) will increase coverage by 0.01%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main     #193      +/-   ##
==========================================
+ Coverage   94.86%   94.87%   +0.01%     
==========================================
  Files          15       15              
  Lines        3776     3789      +13     
==========================================
+ Hits         3582     3595      +13     
  Misses        194      194              
Impacted Files Coverage Δ
src/syntax_tree.jl 95.10% <100.00%> (+0.37%) ⬆️

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

@timholy timholy merged commit 0b1aa97 into JuliaLang:main Feb 19, 2023
@timholy timholy deleted the teh/nodes branch February 19, 2023 12:08
@c42f
Copy link
Member

c42f commented Feb 20, 2023

today when I run the benchmarks there are no signs of trouble

Cool this makes sense! The only thing I could think of was that perhaps the getproperty overloading was causing some weird inference limit to be hit somewhere in supposedly unrelated code. But that seemed like a long shot.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Location of TypedSyntaxNode
2 participants