fix matching for quoted strings #11

Open
wants to merge 1 commit into
base: master

kevinludwig

Tokenizer uses the disect library to find matching strings. disect uses a binary search to minimize the number of comparisons it has to make. That only works if the input sequence is sorted (or if the predicate function can determine whether the match is greater than or less than the current candidate). I'm sure a variety of cases break because of this, but the one I found was a quoted string that is not short. If you put console.log inside the code, you'll see that the algorithm around disect cuts the match in half and keeps shortening the string to look for a match, ultimately concluding, incorrectly, that there is no match.
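To illustrate the failure mode, here is a minimal sketch of bisecting over prefix lengths with a predicate that is not monotonic. It is not disect's actual API or the tokenizer's code: `longestPrefixByBisection` and the `quoted` rule are hypothetical names made up for this example.

```javascript
// Hypothetical rule: a complete double-quoted string.
const quoted = /^"[^"]*"$/;

// Bisect over prefix lengths. This is only correct when the predicate is
// monotonic: true for every length up to some point, false afterwards.
function longestPrefixByBisection(input, matches) {
  let lo = 0;            // longest length assumed to match so far
  let hi = input.length; // upper bound on candidate lengths
  while (lo < hi) {
    const mid = Math.ceil((lo + hi) / 2);
    if (matches(input.slice(0, mid))) {
      lo = mid;     // assume longer prefixes may still match
    } else {
      hi = mid - 1; // assume no longer prefix can match
    }
  }
  return lo;
}

const input = '"a quoted string" rest';
// Only the 17-character prefix matches, but bisection probes lengths
// 11, 5, 2, 1, never tests 17, and reports no match.
console.log(longestPrefixByBisection(input, (s) => quoted.test(s))); // prints 0
```

A quoted-string rule only matches once the closing quote is included, so every shorter prefix fails and the search discards the half that contains the real match.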

I changed the code to loop from the end of the string back toward the start, looking for the longest string with a matching rule. This fixes the quoted-string case, and all existing tests continue to pass.
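A rough sketch of that backward scan, assuming rules are simple `{ type, regex }` pairs. The rule format, the `rules` list, and the `longestMatch` name are assumptions for illustration and may not match the code in this pull request.

```javascript
// Hypothetical rules; the real tokenizer's rule format may differ.
const rules = [
  { type: 'string', regex: /^"[^"]*"$/ },
  { type: 'word', regex: /^\w+$/ },
];

// Try the longest prefix first and shrink by one character at a time,
// so the first hit is the longest prefix that any rule accepts.
function longestMatch(input, rules) {
  for (let end = input.length; end > 0; end--) {
    const candidate = input.slice(0, end);
    const rule = rules.find((r) => r.regex.test(candidate));
    if (rule) {
      return { type: rule.type, text: candidate };
    }
  }
  return null; // no prefix matched any rule
}

console.log(longestMatch('"a quoted string" rest', rules));
// { type: 'string', text: '"a quoted string"' }
```

The trade-off is cost: the scan tests every prefix length instead of a logarithmic number of them, but it makes no assumption about how the rules behave on shorter or longer prefixes.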

@Floby
Owner

Floby commented Nov 26, 2014

Hello,
Just to tell you that I've seen this, and #9 also looks interesting.
I should be reviewing these two pretty soon.

@farskipper

Hey, I also had problems due to bisection. It seems that bisection is definitely the right call for some tokenizing problems, but not for all of them. For that reason I made tokenizer2, which doesn't use bisection.
