Lexer.has does not find error token. #76

JoshuaGrams · 2017-09-28T11:54:18Z

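var moo = require('moo');

// Minimal lexer: two ordinary rules plus an error token.
var lexer = moo.compile({
	word: /\w+/,
	ws: { match: /\s+/, lineBreaks: true },
	somethingElse: moo.error
});

// has() should report whether the lexer can emit a token of the given type.
console.log('has word?', lexer.has('word'));
console.log('has ws?', lexer.has('ws'));
// Reported bug: the error token is not found, although it is defined above.
console.log('has somethingElse?', lexer.has('somethingElse'));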

Does this mean that you can't define an error token and use it with Nearley? AFAICT Nearley calls Lexer.has on every token that you use. At any rate, if you're going to claim that you can define an error token instead of throwing an error, then that token should behave just like any other token in all respects.


JoshuaGrams · 2017-09-28T12:16:31Z

Ah, shoot. I was expecting that an error token would have no contents and let you continue parsing, instead of taking the rest of the input. I'm trying to do a thing with indentation and markdown-style lists. So I thought I could lex with newlines pushing a line-marker state which would recognize whitespace as indentation, and then * or + or - would give list marker tokens which would pop the state, and an error would return an unmarked token and pop the state. Is there a better way to do this?

tjvr · 2017-09-28T13:36:59Z

Yes, Nearley uses Lexer.has to work out whether a %token is exposed by Moo or is a custom token matcher. You're right, has() should return true for error tokens.
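To illustrate the lookup described here, the following is a minimal sketch, not Moo's or Nearley's actual code. `makeToyLexer`, `resolveTokenRef`, `ERROR_RULE` and the `digit` matcher are hypothetical names invented for the example. The toy lexer only models has(), and it counts the error rule as a known token type, which is the behaviour agreed on above.

```javascript
// Hypothetical stand-in for a compiled lexer: it only models has().
// ERROR_RULE plays the role that moo.error plays in a real rule table.
const ERROR_RULE = Symbol('error');

function makeToyLexer(rules) {
  // Collect every rule name, including the one bound to the error sentinel.
  const names = new Set(Object.keys(rules));
  return {
    has(name) {
      return names.has(name);
    },
  };
}

// Hypothetical resolver: decide how a parser should treat a %name reference.
function resolveTokenRef(lexer, name, customMatchers) {
  if (lexer.has(name)) {
    // The lexer can emit this type, so match on the token's type.
    return { kind: 'lexer', type: name };
  }
  if (Object.prototype.hasOwnProperty.call(customMatchers, name)) {
    // Otherwise fall back to a user-supplied matcher, if there is one.
    return { kind: 'custom', test: customMatchers[name] };
  }
  throw new Error('Unknown token reference: %' + name);
}

const toyLexer = makeToyLexer({
  word: /\w+/,
  ws: /\s+/,
  somethingElse: ERROR_RULE,
});

const customMatchers = { digit: (x) => /^[0-9]$/.test(x) };

const errorRef = resolveTokenRef(toyLexer, 'somethingElse', customMatchers);
const digitRef = resolveTokenRef(toyLexer, 'digit', customMatchers);
```

If has() skipped the error rule, `somethingElse` would fall through to the custom matchers and, finding none, be rejected as unknown, which is the kind of failure the original report describes.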

> I was expecting that an error token would have no contents and let you continue parsing, instead of taking the rest of the input

When none of your rules match, Moo doesn't know what to do. So you can either have it throw an error, or return an error token with the whole of the rest of the input. (I've updated the README to clarify this.)

I think error tokens are the wrong thing here. Generally tokenizers work best when your tokens are small atomic units: so I would separate your newline rule from your rule for leading whitespace, for example. You probably want something like Nathan's transformer to turn indentation into INDENT and DEDENT tokens.
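As a rough illustration of the INDENT/DEDENT idea, here is a minimal sketch of a pass that runs over an already-lexed token list. It is not Nathan's transformer and not part of Moo. The function name `addIndentation` and the token types `nl`, `ws` and `word` are assumptions made for the example, and it expects separate newline and whitespace tokens, as suggested above.

```javascript
// Hypothetical post-lexer pass: turns leading whitespace into INDENT/DEDENT.
// Assumes tokens shaped like { type, value }, with 'nl' for a newline and
// 'ws' for a run of spaces. These type names are illustrative only.
function addIndentation(tokens) {
  const out = [];
  const stack = [0];       // indentation widths of the currently open blocks
  let atLineStart = true;

  for (let i = 0; i < tokens.length; i++) {
    let tok = tokens[i];

    if (atLineStart) {
      // Leading whitespace sets this line's width; it is consumed here.
      let width = 0;
      if (tok.type === 'ws') {
        width = tok.value.length;
        i++;
        tok = tokens[i];
      }
      if (tok === undefined) break;   // trailing whitespace at end of input

      // Blank lines do not change the indentation level.
      if (tok.type !== 'nl') {
        if (width > stack[stack.length - 1]) {
          stack.push(width);
          out.push({ type: 'INDENT', value: '' });
        }
        while (width < stack[stack.length - 1]) {
          stack.pop();
          out.push({ type: 'DEDENT', value: '' });
        }
        if (width !== stack[stack.length - 1]) {
          throw new Error('Inconsistent indentation width: ' + width);
        }
      }
    }

    out.push(tok);
    atLineStart = tok.type === 'nl';
  }

  // Close any blocks still open at end of input.
  while (stack.length > 1) {
    stack.pop();
    out.push({ type: 'DEDENT', value: '' });
  }
  return out;
}

// Token list for the text "a\n  b\n  c\nd", written out by hand.
const sample = [
  { type: 'word', value: 'a' }, { type: 'nl', value: '\n' },
  { type: 'ws', value: '  ' }, { type: 'word', value: 'b' }, { type: 'nl', value: '\n' },
  { type: 'ws', value: '  ' }, { type: 'word', value: 'c' }, { type: 'nl', value: '\n' },
  { type: 'word', value: 'd' },
];
const sampleTypes = addIndentation(sample).map((t) => t.type).join(' ');
// sampleTypes is 'word nl INDENT word nl word nl DEDENT word'
```

Keeping the indentation logic in a separate pass means the lexer rules stay small and atomic, and a grammar can then treat INDENT and DEDENT like any other token type.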

EDIT: note that if you want this behaviour (error tokens having no contents), you can always implement it yourself on top of Moo's existing API. :-)

Lexer.has: return true for error token, if any (fixes #76).

tjvr added the question label Sep 28, 2017

tjvr closed this as completed in d3fbaff Oct 5, 2017

tjvr added a commit that referenced this issue Oct 5, 2017

Merge pull request #77 from JoshuaGrams/master

f5572c9

Lexer.has: return true for error token, if any (fixes #76).
