The tokenizer uses the dissect library to find match strings. dissect uses a binary search to minimize the number of comparisons it has to make. That only works if the input sequence is sorted (or if the predicate function can determine whether the candidate is greater than or less than the current match). I'm sure a variety of cases break because of this, but the one I found was a quoted string that is not short. If you put a console.log inside the code, you'll see that the algorithm around dissect keeps cutting the match in half, shortening the string it tries to match, and ultimately (incorrectly) determines that there is no match.
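The failure mode can be illustrated without the library itself: whether a prefix matches a rule is not monotonic in its length, so probing the half-length prefix (as a binary search does) gives no information. The `QUOTED` rule below is a hypothetical stand-in for the tokenizer's real quoted-string rule, not code from this repo:

```javascript
// Hypothetical rule: a candidate matches only if it is a complete
// double-quoted string.
const QUOTED = /^"[^"]*"$/;

const input = '"a fairly long quoted string"';

// The full input matches the rule...
console.log(QUOTED.test(input)); // true

// ...but the half-length prefix that a binary search would probe does not
// (it has no closing quote), so the search wrongly halves again and again
// and concludes that no match exists.
const half = input.slice(0, Math.ceil(input.length / 2));
console.log(QUOTED.test(half)); // false
```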
I changed the code to a for loop that walks from the end of the string back to the start, looking for the longest substring with a matching rule. This works, and all existing tests continue to pass.
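The shape of that fix is roughly the following sketch. The `rules` format and the `longestMatch` helper are assumptions for illustration, not the PR's actual code; the point is only the longest-to-shortest linear scan replacing the binary search:

```javascript
// Scan candidate lengths from longest to shortest and stop at the first
// length for which some rule matches. O(n) probes instead of O(log n),
// but correct even though matchability is not monotonic in length.
function longestMatch(input, rules) {
  for (let end = input.length; end > 0; end--) {
    const candidate = input.slice(0, end);
    const rule = rules.find((r) => r.pattern.test(candidate));
    if (rule) return { rule, text: candidate };
  }
  return null; // no rule matches any prefix
}

// Usage with the same hypothetical quoted-string rule:
const rules = [{ name: "string", pattern: /^"[^"]*"$/ }];
console.log(longestMatch('"not a short quoted string"', rules));
```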