-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Documentation/regex #18
Comments
Where's the regex engine used in this project? (I have kind of forgotten, been literally years since I poked around here) |
Arrash, I thought that regular expressions are used in this project because of the regex.lisp file in the sources. A brief look at its contents shows that it deals with converting regular expressions (in some form) into DFA and NFA using McNaughton-Yamada-Thompson algorithm. However, documentation is really rigid and in order to better understand what is going on, one must look a bit deeper. If you still can recall, could you please provide some more information what is this code about? Thanks, |
We do regex stuff in the parser generator implementations. Doing sorts of "Dragoon bookie" stuff. To generate a LL(1) or LL(*) parser, it's useful to go back and forth between finite [non]deterministic state machines and the equivalent regex representation. Does this give you a slightly better picture? I have no idea what @ndantam does nowadays though. |
The regex and FA implementations here are not really aimed at text processing or pattern matching, but at general symbolic analysis -- i.e., we do not assume that the terminal symbol alphabet consists of characters, and we implement various set operations on the regular languages. You could build a text pattern matcher on top of this work using the resulting minimized DFA. Compared to CL-PPCRE -- which I believe is not doing DFA minimization -- this may produce a more compact and efficient matcher. However, some features of typical regex/pattern engines such as capture groups may be more difficult to implement. If you are interested in improving the state of Common Lisp text pattern matching, I personally would like to see a pattern matcher that operates on ropes (https://en.wikipedia.org/wiki/Rope_(data_structure)) instead of just array-based strings. |
Hi Neil, @Tarrasch, thanks for your explanation. My goal is not so bold as to improve the state of regular expressions engines in Common Lisp, just to learn a bit of different algorithms and programming techniques. Though I do have a very practical interest in Boyer-Moore-Horspool and Commentz-Walter implementations. BTW, I've created a ticket for creating a BMH matcher for ropes and it looks like it will be not that difficult (for the ropes implementation by @Ramarren). Thanks, |
Hi,
I am collecting information for my pet project and would like to know more about the regex implementation in this library.
Could you please provide some background/explanations if this regex engine is intended to be used to process strings/sequences/vectors, etc. how is it used (API) what are its features?
Thanks,
Victor
The text was updated successfully, but these errors were encountered: