Performance degradation at v2 #561

Closed
dalance opened this issue Jan 22, 2025 · 6 comments

@dalance

dalance commented Jan 22, 2025

In my usage, parol v2 performs 30% worse than v1.
Please refer to the CodSpeed report in the following PR.

veryl-lang/veryl#1169

@jsinger67
Owner

I was aware of the performance drop.
Performance is still an ongoing topic.
I achieved a lot on the scanner creation side, but reducing scanner creation time comes at the cost of decreased parse throughput. My next step will be to minimize the automata for the scanner modes.
I hope that will result in a significant improvement.
I am also keeping the option open to provide a way to use the original regex crate if scnr can't meet the performance expectations. In this context I should mention that scnr is not the only one to blame: I also introduced new features such as lossless parse trees in parol_runtime v2.

@jsinger67 jsinger67 self-assigned this Jan 24, 2025
@jsinger67
Owner

There is no need to switch to parol v2 soon.
Staying on v1 is safe because this version will receive bugfixes and other updates regularly.
For me, it was important to prove that veryl can be ported to parol v2 in principle and to see where the weak points are.
I will provide status updates here from time to time.

@dalance
Author

dalance commented Jan 27, 2025

Thank you for your explanation!
I'll migrate to v2 when its performance matches v1, when a new v2 feature becomes necessary, or when you decide to stop v1 maintenance.

@jsinger67
Owner

jsinger67 commented Jan 27, 2025

Status update

I implemented DFA minimization in scnr this weekend. It is not merged into main yet and is still being tested.
For each scanner mode, scnr creates a separate DFA that can recognize all patterns valid in that mode.

Here is how the sizes of the DFAs changed:

| Scanner mode | Node count | Edge count |
|---|---|---|
| INITIAL | 582 -> 503 | 665 -> 576 |
| Embed | 5 -> 5 | 5 -> 5 |
| Generic | 485 -> 462 | 570 -> 504 |
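
For readers unfamiliar with the technique, here is a minimal sketch of DFA minimization by partition refinement (Moore's algorithm). It is not scnr's implementation: the `Dfa` struct, its fields, and `equivalence_classes` are hypothetical names chosen for illustration, and the sketch assumes a complete transition table with every state reachable. States are merged only if they accept the same pattern and their successors fall into the same classes.

```rust
use std::collections::HashMap;

/// Hypothetical complete DFA: `trans[s][c]` is the successor of state `s`
/// on symbol class `c`; `accept[s]` is the id of the pattern accepted in `s`.
struct Dfa {
    trans: Vec<Vec<usize>>,
    accept: Vec<Option<usize>>,
}

/// Moore-style partition refinement.
/// Returns, for every state, the index of its equivalence class.
fn equivalence_classes(dfa: &Dfa) -> Vec<usize> {
    let n = dfa.trans.len();
    // Start with all states in one class; the first pass splits by `accept`.
    let mut block = vec![0usize; n];
    let mut num_blocks = 0;
    loop {
        // A state's signature: accepted pattern, current class, successor classes.
        let mut ids: HashMap<(Option<usize>, usize, Vec<usize>), usize> = HashMap::new();
        let mut next = Vec::with_capacity(n);
        for s in 0..n {
            let succ: Vec<usize> = dfa.trans[s].iter().map(|&t| block[t]).collect();
            let fresh = ids.len();
            next.push(*ids.entry((dfa.accept[s], block[s], succ)).or_insert(fresh));
        }
        let count = ids.len();
        block = next;
        // Each pass can only split classes, so an unchanged count means stable.
        if count == num_blocks {
            return block;
        }
        num_blocks = count;
    }
}

fn main() {
    // Four states over two symbol classes; states 0/2 and 1/3 are redundant.
    let dfa = Dfa {
        trans: vec![vec![0, 1], vec![2, 3], vec![2, 3], vec![0, 1]],
        accept: vec![None, Some(0), None, Some(0)],
    };
    let classes = equivalence_classes(&dfa);
    let minimized = classes.iter().max().map_or(0, |m| m + 1);
    println!("{} states -> {} states", dfa.trans.len(), minimized);
}
```

Keeping the accepted pattern id in the signature matters for a scanner: two accepting states that report different token types must never be merged, even if their outgoing transitions look identical.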

Unfortunately, the performance gain is not yet sufficient.
In my test, throughput changed from 5.502 MiB/second to 6.491 MiB/second and the token rate from 2.037 million tokens/second to 2.403 million tokens/second on my Windows machine. These are the changes compared to version 0.7.0 of scnr.
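
To put these figures in proportion, the small sketch below computes the relative changes from the numbers quoted in this update. The helper `percent_change` is a hypothetical name used only for this illustration. Both the throughput and the token rate work out to a gain of roughly 18%.

```rust
/// Relative change from `before` to `after`, in percent.
fn percent_change(before: f64, after: f64) -> f64 {
    (after - before) / before * 100.0
}

fn main() {
    // Figures quoted in the status update (scnr 0.7.0 vs. the minimization branch).
    println!("throughput (MiB/s):     {:+.1}%", percent_change(5.502, 6.491));
    println!("token rate (M tok/s):   {:+.1}%", percent_change(2.037, 2.403));
    // DFA sizes for the INITIAL scanner mode.
    println!("INITIAL node count:     {:+.1}%", percent_change(582.0, 503.0));
    println!("INITIAL edge count:     {:+.1}%", percent_change(665.0, 576.0));
}
```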

I will continue to optimize the scan performance.

@jsinger67
Owner

jsinger67 commented Feb 14, 2025

Status update

@dalance,
parol_runtime 2.2 now provides a crate feature called regex_automata. It instructs scnr to use regex-automata as its regex engine instead of scnr's own regex engine.

With this feature enabled, you can use the regex engine of parol v1 and get all the features and properties of parol v1.
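
As a minimal sketch, enabling the feature in a consuming crate's Cargo.toml might look like the fragment below. The feature name regex_automata and the version 2.2 come from this comment; whether any default features also need adjusting is not stated here, so treat the fragment as an assumption and check the parol_runtime documentation.

```toml
[dependencies]
# Assumption: a plain feature opt-in is sufficient; verify against the crate docs.
parol_runtime = { version = "2.2", features = ["regex_automata"] }
```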

In fact, thanks to the performance optimization work in scnr, the results are even slightly better than those of parol v1.

All migration changes on the branch parol_v2_migration are therefore obsolete and can be replaced by the much smaller set of changes I provided on the branch parol_v2_migration_2. That branch is up to date with the master branch at the time of writing.

Use these changes as you wish.

@dalance
Author

dalance commented Feb 16, 2025

Thank you for your work!
I updated parol to v2 in veryl-lang/veryl#1247.

@dalance dalance closed this as completed Feb 16, 2025