Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

many0 returns Incomplete in weird cases #790

Closed
djc opened this issue Jun 6, 2018 · 13 comments
Closed

many0 returns Incomplete in weird cases #790

djc opened this issue Jun 6, 2018 · 13 comments
Milestone

Comments

@djc
Copy link
Contributor

djc commented Jun 6, 2018

Prerequisites

Here are a few things you should provide to help me understand the issue:

  • Rust version : 1.26.2
  • nom version : 4.0.0
  • nom compilation features used: default

Test case

Example test case:

named!(expr, do_parse!(
    tag_s!("{{") >>
    inner: take_until!("}}") >>
    tag_s!("}}") >>
    (inner)
));

fn rest(i: &[u8]) -> Result<(&[u8], &[u8]), nom::Err<&[u8]>> {
    let (mut seen, mut count) = (0, 0);
    for c in i {
        seen += 1;
        if *c == b'{' {
            count += 1;
            if count == 2 {
                break;
            }
        } else {
            count = 0;
        }
    }
    if seen < 1 {
        Err(nom::Err::Error(error_position!(i, nom::ErrorKind::Custom(0))))
    } else {
        Ok((&i[seen..], &i[..seen]))
    }
}

named!(parse<Vec<&[u8]>>, many0!(alt!(expr | rest)));

This results in Err(Incomplete(Size(2))), which doesn't make any sense to me. I looked at converting my actual code (in the Askama crate) to CompleteByteSlice, but that seems like a very deep change.

@Hittherhod
Copy link

I can second this with another example error case:

Test Case

named!(parse_sample<&[u8], Sample>,
    do_parse!(
        tag!(&[HDR][..]) >>
        id: le_u16 >>
        a: le_u32 >>
        b: le_u32 >>
        c: le_u32 >>
        time: le_u32 >>
        (Sample {
            id,
            a,
            b,
            c,
            time,
        })
));

named!(parse_samples<&[u8], Vec<Sample>>,
    many0!(parse_sample)
);

parse_samples returns Err(Incomplete(Size(1)) when the provided buffer ends with this structure (i.e., the structure perfectly fills the buffer). If there is remaining data, it returns Ok([rem], [parsed]). However, parse_sample successfully completes.

Upon more testing it seems like many1!() also has this issue of erroneous Incomplete reporting.

@RalfJung
Copy link

I ran into the same issue when porting from nom 3 to nom 4. Essentially, many0 is useless unless one uses the Complete* types: I tried various variants of

named!(test<&[u8],Vec<&[u8]>>, many0!(take_until_and_consume!("TEST")));

including complete!(many0!(...)) and separated_list_complete!(tag!(""), ...), to no avail.

@RalfJung
Copy link

I looked at converting my actual code (in the Askama crate) to CompleteByteSlice, but that seems like a very deep change.

Also see #795

@dignifiedquire
Copy link

Same issue here. I can't use many0, many1 or many_till to work around it. I always get incomplete(1) errors on my code even though the whole input should simply parse exactly.

@djc
Copy link
Contributor Author

djc commented Jul 2, 2018

So the reason this happens was explained to me on IRC: if your parser ends with a many* construction, the streaming parser cannot know whether more instances are coming. That is actually the reason you have to annotate your parser with CompleteByteSlice or similar.

Since I now at least understand the behavior, maybe it would be better to close this bug, although I still stand by my assertion that the new behavior is not very ergonomic to use (but that is better represented by #795).

@Smibu
Copy link

Smibu commented Jul 17, 2018

I worked around this by using the complete macro. So in your case it would be something like:

named!(parse<Vec<&[u8]>>, many0!(complete!(alt!(expr | rest))));

Note that the complete macro needs to be inside many0, not outside, like @RalfJung tried.

@RalfJung
Copy link

RalfJung commented Jul 18, 2018

Okay, that's really surprising:

    fn test_complete() {
        named!(test<&str,Vec<&str>>, many0!(complete!(take_until_and_consume!("TEST"))));
        let r = test("aTESTbTEST");
        assert_eq!(r, Ok(("", vec!["a", "b"])));
    }

works, even though the first time around the many0, things have NOT been completely parsed. How does that make sense? (It doesn't work on my full lib yet, still investigating why.)

@RalfJung
Copy link

RalfJung commented Jul 18, 2018

Here's a test that still fails: I want to allow optional additional space between my items, so I do

    fn test_complete() {
        named!(test<&str,Vec<&str>>, many0!(complete!(
            do_parse!(s: take_until_and_consume!("TEST") >> multispace0 >> (s))
        )));
        let r = test("aTEST  bTEST\n");
        assert_eq!(r, Ok(("", vec!["a", "b"])));
    }

This says

assertion failed: `(left == right)`
  left: `Ok(("bTEST\n", ["a"]))`,
 right: `Ok(("", ["a", "b"]))`

On the other hand, this works:

    fn test_complete() {
        use nom::types::CompleteStr;

        named!(test<CompleteStr,Vec<&str>>, many0!(
            do_parse!(s: take_until_and_consume!("TEST") >> multispace0 >> (s.0))
        ));
        let r = test(CompleteStr("aTEST  bTEST\n"));
        assert_eq!(r, Ok((CompleteStr(""), vec!["a", "b"])));
    }

@Smibu
Copy link

Smibu commented Jul 18, 2018

works, even though the first time around the many0, things have NOT been completely parsed. How does that make sense?

If I'm reading the docs right, complete does not require the child parser to consume all input. It just maps potential Incomplete error to Error.

Here's a test that still fails

I got that to work like this:

fn test_complete() {
    named!(test<&str,Vec<&str>>, many0!(complete!(
            do_parse!(s: take_until_and_consume!("TEST") >> many0!(complete!(one_of!(" \r\n\t"))) >> (s))
    )));
    let r = test("aTEST  bTEST\n");
    assert_eq!(r, Ok(("", vec!["a", "b"])));
}

but yeah, couldn't get it to work with multispace0 (without using CompleteStr).

akshayknarayan added a commit to ccp-project/portus that referenced this issue Sep 4, 2018
As a workaround to rust-bakery/nom#544, migrate to
nom 4 to ensure that the verbose-errors feature becomes additive and
therefore portus compiles when used as a dependency.

There were two classes of changes to the parser structure:
1. Error handling, as predicted and outlined here:
https://github.com/Geal/nom/blob/master/doc/upgrading_to_nom_4.md#replacing-parser-result-matchers

2. Migration to "CompleteByteSlice".
This was predicted in the migration notes:
https://github.com/Geal/nom/blob/master/doc/upgrading_to_nom_4.md#dealing-with-incomplete-usage
but caused underlying errors as reported here:
rust-bakery/nom#790 (comment)

To address this, shadow nom's `named!` macro with our own,
`named_complete!`, which replaces the types appropriately. This is the
solution proposed here:
rust-bakery/nom#795 (comment)
akshayknarayan added a commit to ccp-project/portus that referenced this issue Sep 6, 2018
As a workaround to rust-bakery/nom#544, migrate to
nom 4 to ensure that the verbose-errors feature becomes additive and
therefore portus compiles when used as a dependency.

There were two classes of changes to the parser structure:
1. Error handling, as predicted and outlined here:
https://github.com/Geal/nom/blob/master/doc/upgrading_to_nom_4.md#replacing-parser-result-matchers

2. Migration to "CompleteByteSlice".
This was predicted in the migration notes:
https://github.com/Geal/nom/blob/master/doc/upgrading_to_nom_4.md#dealing-with-incomplete-usage
but caused underlying errors as reported here:
rust-bakery/nom#790 (comment)

To address this, shadow nom's `named!` macro with our own,
`named_complete!`, which replaces the types appropriately. This is the
solution proposed here:
rust-bakery/nom#795 (comment)
akshayknarayan added a commit to ccp-project/portus that referenced this issue Sep 7, 2018
As a workaround to rust-bakery/nom#544, migrate to
nom 4 to ensure that the verbose-errors feature becomes additive and
therefore portus compiles when used as a dependency.

There were two classes of changes to the parser structure:
1. Error handling, as predicted and outlined here:
https://github.com/Geal/nom/blob/master/doc/upgrading_to_nom_4.md#replacing-parser-result-matchers

2. Migration to "CompleteByteSlice".
This was predicted in the migration notes:
https://github.com/Geal/nom/blob/master/doc/upgrading_to_nom_4.md#dealing-with-incomplete-usage
but caused underlying errors as reported here:
rust-bakery/nom#790 (comment)

To address this, shadow nom's `named!` macro with our own,
`named_complete!`, which replaces the types appropriately. This is the
solution proposed here:
rust-bakery/nom#795 (comment)
@mkeeter
Copy link

mkeeter commented Oct 1, 2018

In case anyone stumbles on this thread in the future, here's another workaround:

alt_complete!(many0!(...) |
              value!(vec!(), tag!("")))

The many0! operation returns Incomplete, so alt_complete! falls back to accepting an empty string and returns an empty Vec from the value! operation.

EDIT: On second thought, this doesn't work – it fixes the empty case, but then falls back to the vec! default even when there are items to be parsed, because many0 also returns Incomplete when it's parsed a few items.

@Geal
Copy link
Collaborator

Geal commented Jun 17, 2019

fixed in nom 5 where we can get specialized versions of some parsers for streaming or complete inputs

@Geal Geal closed this as completed Jun 17, 2019
@Geal Geal added this to the 5.0 milestone Jun 17, 2019
@kvakvs
Copy link

kvakvs commented Feb 20, 2022

This still happens on nom7. Trailing many0(...) returns incomplete error. Wrapping with complete(many0(...)) parses correctly but does not consume input and fails with Incomplete(Size(1)). Just found this comment thread with a solution which fixed it for me. Thanks for many0(complete(...)) suggestion.

P.S. Please don't use IRC or chats in general as a knowledge base, because they are not one.

@Xiretza
Copy link
Contributor

Xiretza commented Feb 20, 2022

many0 can not produce an Incomplete on its own: https://github.com/Geal/nom/blob/7.0.0/src/multi/mod.rs#L40-L65

If you pass a streaming parser to many0, many0 will propagate the Incomplete, which is by design. If instead you just want to consume as much input as is available and then stop, you either have to pass a complete parser to many0 or turn a streaming parser into a complete parser using the complete combinator (as you already found out).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

9 participants