This PR cleans up the boundary character checking by using classification
techniques similar to the ones we already use for other classification
problems.
First, this moves the boundary-related items to their own file.
Next, we set up the classification enum.
Last but not least, we removed `}` as an _after_ boundary character, and
instead handle that situation in the Ruby preprocessor, where we need
it. This means that `%w{flex}` will still work in Ruby files.
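
As a rough illustration of the classification approach described above, here is a minimal sketch. The `Class` variants, the helper names, and the exact byte sets are assumptions made for this example and are not taken from the actual change. The only detail reflected from the description is that `}` no longer counts as an _after_ boundary.

```rust
/// Hypothetical classes for boundary bytes (names are illustrative only).
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum Class {
    /// Valid on either side of a candidate (whitespace and quotes here).
    Both,
    /// Valid only directly before a candidate.
    Before,
    /// Valid only directly after a candidate.
    After,
    /// Not a boundary character.
    Other,
}

/// Classify a single byte. The byte sets are assumptions for this sketch.
const fn classify(byte: u8) -> Class {
    match byte {
        b' ' | b'\t' | b'\n' | b'\r' | b'"' | b'\'' | b'`' => Class::Both,
        // `}` is deliberately not an After boundary; treating it (and `.`)
        // as a Before boundary is an assumption.
        b'.' | b'}' => Class::Before,
        // Assumed After-only characters, purely for illustration.
        b'<' | b'>' => Class::After,
        _ => Class::Other,
    }
}

/// Build a 256-entry lookup table at compile time, so a boundary check
/// at runtime is a single array index.
const fn build_table() -> [Class; 256] {
    let mut table = [Class::Other; 256];
    let mut i = 0;
    while i < 256 {
        table[i] = classify(i as u8);
        i += 1;
    }
    table
}

static TABLE: [Class; 256] = build_table();

fn is_valid_before_boundary(byte: u8) -> bool {
    matches!(TABLE[byte as usize], Class::Before | Class::Both)
}

fn is_valid_after_boundary(byte: u8) -> bool {
    matches!(TABLE[byte as usize], Class::After | Class::Both)
}
```

With a table like this, each check is one indexed load plus a comparison, rather than a chain of per-character comparisons.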
---
This PR is a follow-up to
#17001. The main goal is
to clean up some of the boundary character checking code. The other big
improvement is performance. Changing the boundary character checking to
use a classification instead results in the following throughput
(best score of 10 runs each):
```diff
- CandidateMachine: Throughput: 311.96 MB/s
+ CandidateMachine: Throughput: 333.52 MB/s
```
That is an improvement of roughly 21.5 MB/s (about 7%).
# Test plan
1. Existing tests should pass. Because `}` was removed as an after
boundary character, some tests were updated.
2. Added new tests to ensure the Ruby preprocessor still works as
expected.
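
To illustrate why handling `}` in the Ruby preprocessor keeps `%w{flex}` working, here is a minimal sketch of one way such a preprocessing step could look. The function name `blank_percent_w_braces` and the strategy of overwriting the delimiters with spaces are assumptions for this example, not the actual implementation. It also ignores nested braces and other `%w` delimiters.

```rust
/// Hypothetical helper: replace the `%w{` opener and its closing `}` with
/// spaces so the words inside end up surrounded by whitespace boundaries.
/// The output has the same length as the input, so byte offsets stay valid.
fn blank_percent_w_braces(input: &[u8]) -> Vec<u8> {
    let mut out = input.to_vec();
    let mut i = 0;
    while i < out.len() {
        if out[i..].starts_with(b"%w{") {
            // Blank out the three-byte opener.
            out[i] = b' ';
            out[i + 1] = b' ';
            out[i + 2] = b' ';
            i += 3;
            // Scan forward to the first closing brace (no nesting support).
            while i < out.len() && out[i] != b'}' {
                i += 1;
            }
            // Blank out the closer if one was found.
            if i < out.len() {
                out[i] = b' ';
            }
        }
        i += 1;
    }
    out
}
```

After this step, `flex` in `%w{flex}` is followed by a space instead of `}`, so the extractor no longer needs `}` as an after boundary.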
---------
Co-authored-by: Jordan Pittman <jordan@cryptica.me>