-
I think the problem here is that

You probably want to add a

As an aside: counting the indentation levels might not necessarily be what you want. Consider something like this instead, where the indentation string itself gets passed back as context:

let multi_line = {
    let indent = just('\n')
        .ignore_then(just("").configure(
            |cfg, parent_indent| {
                eprintln!(">> multi parent: {:?}", parent_indent);
                cfg.seq(*parent_indent)
            },
        ))
        .labelled("INDENT");
    text::whitespace()
        .ignore_with_ctx(
            line.clone().separated_by(indent).collect::<Vec<String>>(),
        )
        .map(|lines| lines.join("\n"))
        .labelled("Multi line")
};

That way,
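
One way to read the aside about counting: a count cannot tell a tab from a space, while the indentation string can. The following is a plain-Rust sketch of that difference only; it does not use chumsky, and `leading_ws` and the sample lines are made up for illustration.

```rust
/// Returns the leading run of spaces/tabs of a line.
/// (Hypothetical helper, not part of chumsky.)
fn leading_ws(line: &str) -> &str {
    let end = line
        .find(|c: char| c != ' ' && c != '\t')
        .unwrap_or(line.len());
    &line[..end]
}

fn main() {
    let parent = "\t\tbox::"; // indented with two tabs
    let child = "  text"; // indented with two spaces

    // Comparing counts: both lines have two indentation characters.
    let same_count =
        leading_ws(parent).chars().count() == leading_ws(child).chars().count();

    // Comparing the strings themselves: the indentation differs.
    let same_string = leading_ws(parent) == leading_ws(child);

    println!("same count: {same_count}, same string: {same_string}");
}
```

Passing the string as context means a nested block has to repeat the parent's exact indentation rather than merely match its length.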
-
First of all, thanks again for such a great library and for the help figuring this out. I've fixed the issue and I can parse my little markup language with no (real) issues, woot! 🎉

Unfortunately, my code is still very repetitive, as every time I made it more generic I ran into issues that were hard to debug. I ended up having to create a second helper parser function

I wouldn't mind the helper function as much if it didn't force me to redefine parsers like

Maybe it's obvious to others why I need that

Finally, I ended up not using tokens at all for now because they were (a) getting in the way and (b) not really adding much clarity/conciseness. I will try to use the code now in the actual application for which I developed this and will work on optimizing the parser after that.

As promised, the code ended up looking like the below, with some helper functions and

I've placed the full code in a repo here for now but might take it offline in a few weeks.

fn block_parse<'src>(
) -> impl Parser<'src, &'src str, Spanned<Expr>, extra::Err<Rich<'src, char, Span>>>
{
    let ident = text::ascii::ident()
        .then_ignore(just("::").ignore_then(just(' ').repeated()))
        .map(|z: &str| z.to_string());
    let title_ = just("title::").ignore_then(just(' ').repeated());
    let subtitle_ = just("subtitle::").ignore_then(just(' ').repeated());
    let footer_ = just("footer::").ignore_then(just(' ').repeated());
    let single_ident = just("box::").then_ignore(text::inline_whitespace());
    let col_ = just("column::").then_ignore(text::inline_whitespace());
    let row_ = just("row::").then_ignore(text::inline_whitespace());
    let many_ident = choice((col_, row_)).boxed();
    let lines = any()
        .and_is(ident.not())
        .and_is(just('\n').not())
        .repeated()
        .at_least(1)
        .collect::<String>()
        .repeated()
        .collect::<Vec<String>>()
        .map(|lines| lines.join("\n"))
        .map(|s| Expr::Text(s.trim().to_string()))
        .labelled("Line")
        .then_ignore(text::newline())
        .map_with(|x, e| (x, e.span()));
    let raw_lines = any()
        .and_is(ident.not())
        .and_is(just('\n').not())
        .repeated()
        .at_least(1)
        .collect::<String>()
        .map(|s| match s.trim().to_string() {
            s if s.is_empty() => "\n".to_string(),
            s => s,
        })
        .separated_by(text::newline())
        .allow_trailing()
        .collect::<Vec<String>>()
        .map(|lines| lines.join(" "))
        .map(|s| s.trim().to_string())
        .labelled("Line")
        .map_with(|x, e| (Expr::Text(x), e.span()));
    let title = title_
        .ignore_then(text::newline().not())
        .ignore_then(lines.padded())
        .map(|content| Expr::Title(Box::new(content)))
        .map_with(|tok, e| (tok, e.span()));
    let subtitle = subtitle_
        .ignore_then(text::newline().not())
        .ignore_then(lines.padded())
        .map(|content| Expr::Subtitle(Box::new(content)))
        .map_with(|tok, e| (tok, e.span()));
    let footer = footer_
        .ignore_then(text::newline().not())
        .ignore_then(lines.padded())
        .map(|content| Expr::Footer(Box::new(content)))
        .map_with(|tok, e| (tok, e.span()));
    let single = single_ident
        .clone()
        .ignore_then(text::newline())
        .ignore_then(title.padded().or_not())
        .then(subtitle.padded().or_not())
        .then(footer.padded().or_not())
        .then(raw_lines.clone())
        .boxed()
        .map_with(|(((title, subtitle), footer), lines), e| {
            (
                Expr::Single(Box::new(Component {
                    title: match title {
                        Some(title) => Some(Box::new(title)),
                        None => None,
                    },
                    subtitle: match subtitle {
                        Some(subtitle) => Some(Box::new(subtitle)),
                        None => None,
                    },
                    footer: match footer {
                        Some(footer) => Some(Box::new(footer)),
                        None => None,
                    },
                    body: lines,
                })),
                e.span(),
            )
        });
    let block = recursive(|block| {
        let indent = just(' ')
            .repeated()
            .configure(|cfg, parent_indent| cfg.exactly(*parent_indent));
        let many = many_ident
            .clone()
            .then_ignore(text::newline())
            .then(title.padded().or_not())
            .then(subtitle.padded().or_not())
            .then(footer.padded().or_not())
            .then(block.clone())
            .boxed()
            .map(|((((id, title), subtitle), footer), body)| match id {
                "column::" => Expr::Column(Component {
                    title: match title {
                        Some(title) => Some(Box::new(title)),
                        None => None,
                    },
                    subtitle: match subtitle {
                        Some(subtitle) => Some(Box::new(subtitle)),
                        None => None,
                    },
                    footer: match footer {
                        Some(footer) => Some(Box::new(footer)),
                        None => None,
                    },
                    body,
                }),
                "row::" => Expr::Row(Component {
                    title: match title {
                        Some(title) => Some(Box::new(title)),
                        None => None,
                    },
                    subtitle: match subtitle {
                        Some(subtitle) => Some(Box::new(subtitle)),
                        None => None,
                    },
                    footer: match footer {
                        Some(footer) => Some(Box::new(footer)),
                        None => None,
                    },
                    body,
                }),
                _ => {
                    eprintln!("> Unreachable: {id:?}");
                    unreachable!()
                }
            })
            .map_with(|x, e| (x, e.span()));
        let stmt = single
            .clone()
            .or(many)
            .boxed()
            .repeated()
            .at_least(1)
            .collect::<Vec<_>>()
            .map_with(|body, e| (Expr::List(body), e.span()));
        text::whitespace()
            .count()
            .ignore_with_ctx(stmt.separated_by(indent).collect::<Vec<_>>())
    });
    let many = many_ident
        .then_ignore(text::newline())
        .then(title.padded().or_not())
        .then(subtitle.padded().or_not())
        .then(footer.padded().or_not())
        .then(block)
        .boxed()
        .map_with(|((((id, title), subtitle), footer), body), e| {
            (
                match id {
                    "column::" => Expr::Column(Component {
                        title: match title {
                            Some(title) => Some(Box::new(title)),
                            None => None,
                        },
                        subtitle: match subtitle {
                            Some(subtitle) => Some(Box::new(subtitle)),
                            None => None,
                        },
                        footer: match footer {
                            Some(footer) => Some(Box::new(footer)),
                            None => None,
                        },
                        body,
                    }),
                    "row::" => Expr::Row(Component {
                        title: match title {
                            Some(title) => Some(Box::new(title)),
                            None => None,
                        },
                        subtitle: match subtitle {
                            Some(subtitle) => Some(Box::new(subtitle)),
                            None => None,
                        },
                        footer: match footer {
                            Some(footer) => Some(Box::new(footer)),
                            None => None,
                        },
                        body,
                    }),
                    _ => {
                        eprintln!("> Unreachable: {id:?}");
                        unreachable!()
                    }
                },
                e.span(),
            )
        });
    many.or(single).with_ctx(0)
}

fn parse<'src>(
) -> impl Parser<'src, &'src str, Spanned<Expr>, extra::Err<Rich<'src, char, Span>>>
{
    let ident = text::ascii::ident()
        .then_ignore(just("::").ignore_then(just(' ').or_not()))
        .map(|z: &str| z.to_string());
    let title_ = just("title::").then_ignore(just(' ').or_not()).ignored();
    let subtitle_ =
        just("subtitle::").then_ignore(just(' ').or_not()).ignored();
    let footer_ = just("footer::").then_ignore(just(' ').or_not()).ignored();
    let lines = any()
        .and_is(ident.not())
        .and_is(just('\n').not())
        .repeated()
        .at_least(1)
        .collect::<String>()
        .repeated()
        .collect::<Vec<String>>()
        .map(|lines| lines.join("\n"))
        .map(|s| Expr::Text(s.trim().to_string()))
        .labelled("Line")
        .then_ignore(text::newline())
        .map_with(|x, e| (x, e.span()));
    let title = title_
        .ignore_then(text::newline().not())
        .ignore_then(lines.padded())
        .map(|content| Expr::Title(Box::new(content)))
        .map_with(|tok, e| (tok, e.span()));
    let subtitle = subtitle_
        .ignore_then(text::newline().not())
        .ignore_then(lines.padded())
        .map(|content| Expr::Subtitle(Box::new(content)))
        .map_with(|tok, e| (tok, e.span()));
    let footer = footer_
        .ignore_then(text::newline().not())
        .ignore_then(lines.padded())
        .map(|content| Expr::Footer(Box::new(content)))
        .map_with(|tok, e| (tok, e.span()));
    let page = just("[page]")
        .padded()
        .labelled("Page")
        .ignore_then(text::newline().repeated())
        .ignore_then(title.or_not())
        .then(subtitle.or_not())
        .then(footer.or_not())
        .boxed()
        .then(block_parse())
        .boxed()
        .map(|(((title, subtitle), footer), body)| Component {
            title: match title {
                Some(title) => Some(Box::new(title)),
                None => None,
            },
            subtitle: match subtitle {
                Some(subtitle) => Some(Box::new(subtitle)),
                None => None,
            },
            footer: match footer {
                Some(footer) => Some(Box::new(footer)),
                None => None,
            },
            body,
        })
        .map_with(|c, e| (Expr::Page(Box::new(c)), e.span()));
    page.separated_by(text::newline().repeated().at_least(1))
        .collect()
        .map_with(|pages, e| (Expr::Document(pages), e.span()))
}
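
A fair share of the repetition above comes from the `match x { Some(x) => Some(Box::new(x)), None => None }` blocks, which are equivalent to `x.map(Box::new)`. Below is a minimal std-only sketch of that simplification. The `Component` here is a generic stand-in (the real type lives in the repo and may differ), and the `component` helper is hypothetical.

```rust
// Stand-in for the post's `Component`; the real field types are assumptions.
#[derive(Debug, PartialEq)]
struct Component<T, B> {
    title: Option<Box<T>>,
    subtitle: Option<Box<T>>,
    footer: Option<Box<T>>,
    body: B,
}

// Hypothetical helper: boxes each optional header with `Option::map`
// instead of spelling out a `match` for every field.
fn component<T, B>(
    title: Option<T>,
    subtitle: Option<T>,
    footer: Option<T>,
    body: B,
) -> Component<T, B> {
    Component {
        title: title.map(Box::new),
        subtitle: subtitle.map(Box::new),
        footer: footer.map(Box::new),
        body,
    }
}

fn main() {
    let c = component(Some("Title"), None, Some("Footer"), vec!["line"]);
    println!("{c:?}");
}
```

With something like this in place, each `.map(...)` closure could call `component(title, subtitle, footer, body)` and wrap the result in `Expr::Column`, `Expr::Row`, and so on, which might also make the duplicated `many` definitions easier to keep in sync.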
-
Hey all,
I'd love to figure this out on my own but it's been days and my head is sore from banging it against the monitor... I'm trying to write a parser for a "simple" markup language which describes nested containers in a layout, such as the following (abbreviated) example:
I'll probably rename `vsplit` and `hsplit` into `column` and `row`, but for now that's what they're called.

I can speak to the motivation behind the markup if anyone's curious, but onto the actual issue: I can't seem to find a way to consistently parse multiline strings, inline strings, and indents without them getting in the way of each other or running into `Collect combinator making no progress`... I've gone through the tutorial and examples and don't know what else to try.

MRE: Given the following input:
I'm lexing/parsing in a way similar to the nano_rust example:
But I get stuck on the collect combinator, with the output below after littering my lexer with print statements. I would immensely appreciate any help figuring out this lexer.
(I realize this current iteration isn't even parsing Text as a multiline string joined by `\n`. Earlier attempts managed that but broke other things, and I feel like this is closer to the answer than what I had before.)

where line 85 is at the end of
I have also defined a parser which compiles, but I have been unable to test it since I can't get the lexer to finish... Below are the current AST and the remainder of `parser.rs`, although they are not really relevant right now unless they give you an allergic reaction and you tell me I should change them too.

Thank you once again in advance for any help.
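
For context on the "making no progress" message: as far as I understand, it shows up when the parser inside a repetition succeeds without consuming any input, so the loop could never end. The following is a plain-Rust sketch of that kind of guard, not chumsky's implementation; `repeat_collect`, `spaces0` and `spaces1` are hypothetical names.

```rust
/// Applies `item` repeatedly starting at `pos`, collecting its outputs.
/// `item` returns its output and the new position, or `None` on failure.
/// Returns an error if `item` succeeds without advancing.
fn repeat_collect<T>(
    input: &str,
    mut pos: usize,
    item: impl Fn(&str, usize) -> Option<(T, usize)>,
) -> Result<(Vec<T>, usize), String> {
    let mut out = Vec::new();
    while let Some((value, next)) = item(input, pos) {
        if next == pos {
            return Err(format!("repeated parser made no progress at {pos}"));
        }
        out.push(value);
        pos = next;
    }
    Ok((out, pos))
}

/// Zero or more spaces: always succeeds, possibly consuming nothing.
fn spaces0(input: &str, pos: usize) -> Option<(usize, usize)> {
    let n = input[pos..].chars().take_while(|c| *c == ' ').count();
    Some((n, pos + n))
}

/// One or more spaces: fails instead of matching the empty string.
fn spaces1(input: &str, pos: usize) -> Option<(usize, usize)> {
    match spaces0(input, pos) {
        Some((0, _)) => None,
        other => other,
    }
}

fn main() {
    // The zero-or-more item eventually matches nothing and trips the guard.
    println!("{:?}", repeat_collect("  x", 0, spaces0));
    // The one-or-more item fails cleanly, so the repetition stops.
    println!("{:?}", repeat_collect("  x", 0, spaces1));
}
```

If that reading is right, the usual remedy is to make sure whatever sits inside `repeated()` or `separated_by(...)` consumes at least one character on success, for example by adding `at_least(1)` to an inner repetition.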