Commit
This fixes an issue where prefix literals could contain alternates such that an Aho-Corasick (AC) match could be ambiguous. More specifically, AC returns matches in the order in which it finds them while scanning the text, rather than in "leftmost first" order. If one alternate is a substring of another, AC will therefore report the shorter alternate first, even when the longer alternate starts to the left of it. In effect, the prefix scan skips too far ahead in the text.

We fix this by making the prefix literals unambiguous: all literals are truncated until none of them is a substring of another. The implementation is somewhat expensive (quadratic in the number of literals), but the number of literals is tightly constrained, so the cost shouldn't be too noticeable.
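As an illustration of the truncation idea, here is a minimal sketch. It is not the code from this commit: the function names `find` and `unambiguous`, the `Vec<Vec<u8>>` representation, and the exact truncation rule are assumptions made for the example. When a shorter literal occurs inside a longer one, the longer one is cut just before the end of that occurrence; when the shorter literal is a prefix of the longer one, the longer one is dropped, since the shorter prefix already covers it.

```rust
/// Hypothetical helper: position of the first occurrence of `needle` in `haystack`.
fn find(haystack: &[u8], needle: &[u8]) -> Option<usize> {
    if needle.is_empty() {
        return Some(0);
    }
    haystack.windows(needle.len()).position(|w| w == needle)
}

/// Hypothetical sketch: shrink a set of prefix literals until no literal
/// is a substring of another. Repeated pairwise scans, which is only
/// acceptable because the set is assumed to be small.
fn unambiguous(mut lits: Vec<Vec<u8>>) -> Vec<Vec<u8>> {
    loop {
        // Sorting plus dedup guarantees that distinct indices hold distinct literals.
        lits.sort();
        lits.dedup();
        let mut changed = false;
        'scan: for i in 0..lits.len() {
            for j in 0..lits.len() {
                if i == j || lits[i].len() > lits[j].len() {
                    continue;
                }
                let pos = find(&lits[j], &lits[i]);
                if let Some(p) = pos {
                    if p == 0 {
                        // lits[i] is a prefix of lits[j]: the longer literal is redundant.
                        lits.remove(j);
                    } else {
                        // Cut lits[j] one byte short of the end of the first occurrence.
                        let keep = p + lits[i].len() - 1;
                        lits[j].truncate(keep);
                    }
                    changed = true;
                    break 'scan; // indices may be stale; restart the scan
                }
            }
        }
        if !changed {
            return lits;
        }
    }
}
```

With the literals `abcd` and `bc`, this sketch truncates `abcd` to `ab`, so a scan over `abcd` now reports the literal starting at offset 0 before the one starting at offset 1. Each pass either removes a literal or shortens one, so the loop terminates.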
1 parent 277926f, commit 55a1fc9
Showing 3 changed files with 187 additions and 89 deletions.