take_until but parses the input it takes #199
Comments
I'd use |
I just realized the content doesn't need to be parsed; I just need it as a string. This makes things a lot easier. But in general, what if I wanted to parse the content further without the content parser knowing anything about when to stop? Say the parser parses a list of lines, for example. Would a parser that does the following be possible:
It is possible using flat_map (https://docs.rs/combine/3.5.2/combine/trait.Parser.html#method.flat_map). The only real problem is that the reported error position will point into the sub-input, so it would need to be fixed if that is an issue. That is probably possible by using position (https://docs.rs/combine/3.5.2/combine/fn.position.html) to get the position before the sub-input.
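To make the idea concrete without depending on combine, here is a minimal standard-library sketch of the same two-stage approach: slice out everything before the end marker, hand that slice to a content parser that knows nothing about the marker, and shift any error position by the offset at which the slice starts. The names `ParseError`, `parse_lines` and `parse_until` are hypothetical and are not part of combine's API; the "reject empty lines" rule is only there so the inner parser has something to fail on.

```rust
/// Error carrying a byte offset and a message.
#[derive(Debug, PartialEq)]
struct ParseError {
    position: usize,
    message: String,
}

/// Hypothetical content parser: returns the lines of `content` and rejects
/// empty lines. It knows nothing about the end marker, and the positions it
/// reports are relative to the start of `content`.
fn parse_lines(content: &str) -> Result<Vec<&str>, ParseError> {
    let mut lines = Vec::new();
    let mut offset = 0;
    for line in content.split_inclusive('\n') {
        let text = line.trim_end_matches('\n');
        if text.is_empty() {
            return Err(ParseError {
                position: offset,
                message: "unexpected empty line".to_string(),
            });
        }
        lines.push(text);
        offset += line.len();
    }
    Ok(lines)
}

/// Outer step: take everything from `start` up to `end_marker`, run the
/// content parser on that slice only, and translate error positions back
/// into offsets in the original `input`.
/// On success, returns the parsed lines and the input after the marker.
fn parse_until<'a>(
    input: &'a str,
    start: usize,
    end_marker: &str,
) -> Result<(Vec<&'a str>, &'a str), ParseError> {
    let tail = &input[start..];
    let end = tail.find(end_marker).ok_or_else(|| ParseError {
        position: input.len(),
        message: format!("expected {:?}", end_marker),
    })?;
    let content = &tail[..end];
    let lines = parse_lines(content).map_err(|e| ParseError {
        // The sub-parser counted from 0, so add where the slice began.
        position: start + e.position,
        message: e.message,
    })?;
    Ok((lines, &tail[end + end_marker.len()..]))
}
```

Calling `parse_until(input, start, "#+END_SRC\n")` with `start` pointing just past the begin line yields the content lines plus the remaining input. The position shift in `map_err` plays the role that capturing `position()` before the sub-input would play in combine.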
This took me a lot of fiddling around, but it works now:
    captures(&*RE_START)
        .map(|vec: Vec<&str>| vec[2].to_string())
        .then(|name| {
            let re =
                Regex::new(&format!(r"([ \t]*)#\+END_{}\n?", regex::escape(&name))).unwrap();
            (
                value(name),
                position(),
                recognize(skip_until(find(re.clone()))),
            )
                .flat_map(|(name, position, content_str): (String, usize, &str)| {
                    use combine::stream::state::{IndexPositioner, State};
                    let input = State::with_positioner(
                        content_str,
                        IndexPositioner::new_with_position(position),
                    );
                    content_data()
                        .easy_parse(input)
                        .map(|(content_data, _rest)| (name, content_data))
                })
                .skip(find(re))
        })
I even got the correct position to work. The only thing wrong with the error is that it contains both "end of input" and "unexpected token #". But I think that is OK, since it is an unexpected end of the content.
By the way, do you think it would be faster to use the regex above, or to use something like (spaces(), range(format!("#+BEGIN_{}", name)))?
I'd expect that to be faster. Compiling a regex is fairly expensive (compared to matching against a single string), so regexes should generally be compiled once and used many times for them to be efficient.
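As a rough illustration of what matching against a single string looks like, here is a standard-library sketch that recognises the end line without building a regex for every block. The function name `strip_end_line` is hypothetical. It mirrors the regex from the earlier comment (optional spaces or tabs, then `#+END_` plus the name, then an optional newline), except that it only matches at the very start of the input rather than searching for the first occurrence.

```rust
/// Hypothetical helper: if `input` starts with optional spaces/tabs followed
/// by `#+END_<name>` and an optional newline, return the input that follows.
/// Otherwise return `None`. No regex is compiled, so the cost per call is
/// just a few prefix comparisons.
fn strip_end_line<'a>(input: &'a str, name: &str) -> Option<&'a str> {
    // Skip leading indentation (spaces and tabs only, like `[ \t]*`).
    let rest = input.trim_start_matches(|c: char| c == ' ' || c == '\t');
    // Match the fixed prefix and then the dynamic block name.
    let rest = rest.strip_prefix("#+END_")?;
    let rest = rest.strip_prefix(name)?;
    // The trailing newline is optional (like `\n?`).
    Some(rest.strip_prefix('\n').unwrap_or(rest))
}
```

One difference worth keeping in mind: combine's `spaces()` skips any whitespace, including newlines, whereas this sketch, like the regex, only skips spaces and tabs.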
Since you are explicitly using |
Yes this is fine. Thanks for your help. |
Is it possible to parse the input consumed by take with another parser instead of just accumulating it in a String? Specifically, I want to parse something that looks like:
Since name is dynamic and has to be the same in the start and end line, I can't use something like between. Also, I don't really want the parser that's responsible for the content to know about name.
Currently I'm thinking about using and_then to just take_until the end line, collect the content in a String, create a new stream from the collected string, and parse it.
But I'm wondering if there is a better way of doing this.