Peek(1) affects next token. #28
It would be great if peek did not change state at all. At other times I also have problems with consecutive peek(1) calls: using peek(1), peek(1) is the same as using peek(2). Maybe a reject() call to revert the effects of peek(), or a peek that did not affect the state at all, would be perfect: just return the token at a certain position. Do you think that's possible?
Yes, this has been a problem ever since rulesets were added. Originally the problem was that the lexer was based on a data representation which did not support rewinding. I think we can fix this now by resetting the position upon changing rulesets, as well as clearing the token cache in the parser. This might require some architectural thought though. We have to be careful not to make a mess of the Lexer/Parser/LexerTokenSource inter-dependencies.
Would be great, thanks; meanwhile I'll see if I can work around it for this parser. Btw, if peek would just fetch the token without changing the state, things would work great imo (even at some performance cost). I don't know if that's feasible or what's involved compared to resetting the position when changing rulesets, but it sounds more robust: peek(1), peek(1) would even return the same token. Just a suggestion.
This would mean disabling the token cache completely, which I don't think is viable performance-wise. The lexer shouldn't have to tokenize the same thing twice if not necessary.
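To illustrate the trade-off being discussed: a token cache can make repeated peeks return the same token while still tokenizing each input position only once. Here is a simplified Python model of that idea; TokenSource, peek, junk and the lexed counter are illustrative names for a sketch, not hxparse's actual implementation:

```python
class TokenSource:
    """Simplified model of a peek cache in front of a lexer."""
    def __init__(self, tokens):
        self._lexer = iter(tokens)  # stands in for repeated Lexer.token() calls
        self._cache = []            # tokens read ahead but not yet consumed
        self.lexed = 0              # counts how often the "lexer" actually ran

    def peek(self, n):
        # Fill the cache up to position n; each input position is lexed once.
        while len(self._cache) <= n:
            self._cache.append(next(self._lexer))
            self.lexed += 1
        return self._cache[n]

    def junk(self):
        # Consume one token, preferring the cache over the lexer.
        if self._cache:
            return self._cache.pop(0)
        self.lexed += 1
        return next(self._lexer)

src = TokenSource(["a", "b", "c"])
assert src.peek(1) == "b"
assert src.peek(1) == "b"  # idempotent peek: same token...
assert src.lexed == 2      # ...and nothing was tokenized twice
assert src.junk() == "a"   # consuming still starts at the front
```

In this model the cache is exactly what avoids double tokenization, so "peek without state change" and "never lex twice" are not in conflict; the hard part, as discussed below, is invalidating the cache correctly when rulesets change.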
Well, peek(0) is used internally to retrieve the current token sometimes, but if peek() was only called by the user and used a different method than Lexer.token() to retrieve the token, it might be possible to skip the cache and let the user be responsible for the performance lost by using peek() too much. I'm not sure how the state transitions are generated or how the cache works.
Ok, what I did to avoid unwanted tokens parsed by peek(1) was to wrap the case in a try/catch and have no rule/match for the specific case I want another function/rule to process. For now it works; I guess I was lucky I did not rely on peek(1) too much, and in most cases the resulting token could be junked (junk()) after reading with peek(1). Anyway, it would be nice to have a rewind, or cache reverting after ruleset changes. Thanks for this lib; I did some benchmarks and the performance is about 2x faster than the previous parser using string comparison, not to mention it's much easier to modify and read. With some luck it will be ready in a couple of days.
I'll see what I can do about the caching issue soon. Out of curiosity, what are you trying to parse?
https://github.com/ciscoheat/erazor/blob/master/src/demo/Main.hx You can see the templating language there; it uses string comparison to process the templates. Trying to extend the templating language, I ended up rewriting the parser to make it easier to extend and to add some nice features; now I'm tweaking it to pass all the tests. The tests are probably more readable:
I don't know if it's relevant to the discussion, but for stipple I'm using a forward matching regular expression and then rewinding the lexer position.
The problem is this:
This shouldn't be hard to solve because it's just a matter of resetting the state upon ruleset changes. The problem is finding a good architecture between lexer, parser and token source. |
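One way to picture "resetting the state upon ruleset changes": when a new ruleset is assigned, discard any peeked-ahead tokens and rewind the lexer to just after the last consumed token, so the new ruleset re-lexes that input itself. The following is a toy Python model of that fix (one character per token, all names illustrative; not the actual hxparse code):

```python
class Lexer:
    """Toy lexer: one character per token, chosen by the active ruleset."""
    def __init__(self, text):
        self.text = text
        self.pos = 0

    def token(self, ruleset):
        ch = self.text[self.pos]
        self.pos += 1
        return ruleset[ch]

class TokenSource:
    def __init__(self, lexer, ruleset):
        self.lexer = lexer
        self._ruleset = ruleset
        self._cache = []  # tokens peeked ahead but not yet consumed

    @property
    def ruleset(self):
        return self._ruleset

    @ruleset.setter
    def ruleset(self, rs):
        # The proposed fix: drop peeked-ahead tokens and rewind the lexer
        # so the new ruleset re-lexes from the last consumed position.
        self.lexer.pos -= len(self._cache)  # valid here: 1 char per token
        self._cache.clear()
        self._ruleset = rs

    def peek(self, n):
        while len(self._cache) <= n:
            self._cache.append(self.lexer.token(self._ruleset))
        return self._cache[n]

rules_a = {"a": 1}
rules_b = {"a": 2}
src = TokenSource(Lexer("a"), rules_a)
assert src.peek(0) == 1
src.ruleset = rules_b
assert src.peek(0) == 2  # cached token discarded, input re-lexed under the new rules
```

In a real lexer the rewind target would have to come from recorded token positions rather than a token count, which is where the Lexer/Parser/LexerTokenSource architecture question comes in.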
This stipple looks awesome, I'll have to spend some time looking at it. Funny, I was inspired by handlebars and was trying to implement some of its features into erazor (that's how I got here). With some luck it may fit perfectly with what I was trying to do, which is to use a good templating engine with ufront. Thanks a lot!
Hey @fussybeaver, stipple looks really great, when will it be released as a haxelib? I guess it will have to wait until new hxparse and Haxe versions are released? Anyway, this makes my work unnecessary, thanks a lot. Looking at the tests it seems perfect, so I can go on making my site, perhaps using this beta version of stipple. I'll let you know when I manage to add it to ufront.
As you say, I am waiting for the new Haxe version to be released, and I hope @Simn will release a haxelib version of hxparse too. Regarding hxparse, I think I would like to see improvements in a non-macro based API, as I am using macros to improve runtime performance, and without build macros this doesn't seem to be quite as idiomatic a way of using hxparse.
I'm not sure what that would look like unless you want to call peek and junk manually all the time, which I think you already can anyway. One thing we could do is add a flag which writes the parser output to a .hx file so it can be used as a preprocessor. |
You can call peek and junk manually; it works well, though it feels like the API could be made nicer. Would a preprocessor require compiling twice? Maybe simply having non-build macros would also work.
I don't really trust macro-in-macro anyway, but that should be possible. It would look a bit weird because you would have to pass the switch expression as an argument to a macro call, but that should be fine too. |
Yeh, perhaps it's better to just leave it as is.. just wanted to make you aware of that use case.
It's an easy change because all the code is already there, so I have added hxparse.Parser.parse which expects a switch expression as argument. Please check if that works for you. |
It looks good: the code is more maintainable, and the compile-time performance impact seems to be minimal on the test suite (at least after several runs), so I'll use this solution for the time being. Is there some issue I need to be aware of with macro-in-macro calls?
Hey @fussybeaver, added support for stipple into ufront, it's working like a charm so far. ufront/ufront-mvc#4 Thanks again for this lib, a good templating engine is essential, can't bootlick more than this!
Looks nice! Let me know if I can help you with any issues.. |
@fussybeaver: We'll have to move that parse method somewhere else. I forgot that Parser is supposed to be |
Actually I might fix it in core Haxe: HaxeFoundation/haxe#3766 |
Back to the original issue, here's a small example to reproduce it:

```haxe
import byte.ByteData;
import hxparse.*;

class MyLexer extends hxparse.Lexer implements RuleBuilder {
    static public var tok = @:rule [
        "a" => {
            1;
        }
    ];
    static public var tok2 = @:rule [
        "a" => {
            2;
        }
    ];
}

class MyParser extends Parser<LexerTokenSource<Int>, Int> implements ParserBuilder {
    public function new(input:ByteData) {
        var source = new LexerTokenSource(new MyLexer(input), MyLexer.tok);
        super(source);
    }

    public function changeRuleset() {
        stream.ruleset = MyLexer.tok2;
    }
}

class Main {
    @:access(hxparse.Parser.peek)
    static function main() {
        var input = ByteData.ofString("a");
        var parser = new MyParser(input);
        hxparse.Utils.catchErrors(input, function() {
            trace(parser.peek(0)); // 1
            parser.changeRuleset();
            trace(parser.peek(0)); // 1
        });
    }
}
```

To solve this, assignment to

We could do 2. by turning

Any ideas?
I'm using different rulesets in a custom parser for the erazor lib.
Frequently I use peek(1) on a ruleset to verify whether it should process what comes next or leave it to another ruleset.
The problem is that if a function using ruleset A decides not to process what comes next after peek(1), and another function with ruleset B processes what comes next, the first token will be the one defined by function A's peek(1).
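That interaction can be modeled in a few lines. This is a hypothetical Python sketch with made-up ruleset names (not the actual erazor/hxparse code): tokens peeked under ruleset A stay cached, so the next read under ruleset B returns a token that was lexed with A's rules.

```python
# Toy model of the reported behavior: tokens peeked with one ruleset
# remain cached when the ruleset changes (all names are illustrative).
class BuggySource:
    def __init__(self, text, ruleset):
        self.text, self.pos = text, 0
        self.ruleset = ruleset   # dict: one char -> one token
        self.cache = []          # look-ahead cache, never invalidated

    def peek(self, n):
        # Lex forward with the *current* ruleset until position n is cached.
        while len(self.cache) <= n:
            ch = self.text[self.pos]
            self.pos += 1
            self.cache.append(self.ruleset[ch])
        return self.cache[n]

    def next_token(self):
        if not self.cache:
            self.peek(0)
        return self.cache.pop(0)

rules_a = {"x": "A-token"}
rules_b = {"x": "B-token"}
src = BuggySource("xx", rules_a)
src.peek(1)                    # function using ruleset A looks ahead
src.ruleset = rules_b          # hand over to a function using ruleset B
assert src.next_token() == "A-token"  # B still receives A's cached token
```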
example: