Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace error Vec with a linked list of slab Vecs #75

Closed
wants to merge 1 commit into from

Conversation

domenicquirl
Copy link

You mentioned that you wanted to take some stabs at performance. While I am not familiar enough with the codebase to suggest any structural changes (and sadly don't have the time to change that, as interesting as chumsky is), I had a look at the benchmarks and also perfed them. The flamegraph has the usual problem of recursive and/or deeply layered parsing function that any parser expectedly consists mostly of its sub-parsers and it's hard to make out problematic areas. One part that showed up repeatedly was Vec::push and Vec::append.

If I saw correctly, some parsers like Seq use a Vec as output as well, but on top-level I noticed all of the parsing errors are returned in a Vec, so many of the parsers copy them from outputs of their constituents into an aggregated result Vec. I replaced that with a wrapper around LinkedList<Vec<T>> to obtain the O(1) characteristics of its append. It doesn't have everything that Vec has (in particular, no methods to remove elements), but everything that you use in chumsky plus iter and into_iter, so it's possible to get back to a Vec after parsing . On the Linux machine I tested this, making this change through all the parsers and also the recovery Strategys made about a 10% difference. This depends heavily on the presence of errors, obviously.

Notables:

  • Breaking: the container type is exposed through PResult
  • Right now this PR increases the MSRV to current stable. That's only because I used unwrap_unchecked though, so if you want to support earlier versions that's an easy change
  • I added tests for the data structure, and also documented it briefly. I put the thing in its own new module, which is missing the quote in its module comment because I didn't feel qualified ^^

@zesterer
Copy link
Owner

Hey, thanks for this! I'm probably not going to merge this now because I'm working on a pretty big overhaul in #82, but I'll definitely be using this PR as a reference!

@domenicquirl
Copy link
Author

Very welcome! I somewhat expected that after hearing about the (very cool!) zero-copy rewrite. If you may be looking at this later, let me quickly note down three results of trying out some different structures and perfing them:

  • in general, obviously the less code the better and in particular the less additional branching the better
  • that said, one of the most impactful things regarding performance was to make sure to keep the "no allocations until some element is actually added" that all of the std data structures have (in particular the vec and list used). Pre-allocation basically always yielded worse results, presumably due to the fact that - barring highly malformed input - most parser invocations will not return any errors, so you don't want to add unnecessary allocation work to all these cases. Having an emptiness check on insertion therefore consistently performs better than always having space available.
  • I did experiment with having a separate Vec for individual elements instead of pushing them onto the last current buffer in the list. In theory, this makes the data structure more independent from how big Vecs being pushed on are - if you always make the separate Vec have the same capacity that the one before ended up needing, that probably re-allocates less than potentially re-sizing every time because you add elements to incoming Vecs. In practice, there probably again aren't enough errors for that to be worth it (since iirc the first allocation of a Vec over-allocates, so if there are only a few elements in there there is probably space left over). At the very least with the additional logic required to do this it turned out to not be much of an improvement in the benches.

Do I remember correctly that there was also someone mentioning the possibility of chumsky being generic over these data structures? If so, my flat list could just be one of these probably quite easily.

@zesterer
Copy link
Owner

Thanks, this is very valuable information!

@zesterer
Copy link
Owner

zesterer commented Mar 8, 2023

I'm going to close this because it's now out of date. However, we did end up using a very similar approach (and more besides), which was a big factor on the rewrite's very competitive performance. Thanks for the ideas!

@zesterer zesterer closed this Mar 8, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants