Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix untyped terms generation #438

Closed
effectfully opened this issue Dec 26, 2018 · 15 comments · Fixed by #460
Closed

Fix untyped terms generation #438

effectfully opened this issue Dec 26, 2018 · 15 comments · Fixed by #460
Assignees

Comments

@effectfully
Copy link
Contributor

effectfully commented Dec 26, 2018

I noticed that this property

propParser :: Property
propParser = property $ do
    prog <- forAll genProgram
    let reprint = BSL.fromStrict . encodeUtf8 . prettyPlcDefText
        proc = void <$> parse (reprint prog)
        compared = and (compareProgram (void prog) <$> proc)
    Hedgehog.assert compared

passes even if what is printed can't be parsed back, because parse returns an Either and and called over Left returns True. This kinda defeats the entire purpose of the test.

If I change the last two lines to

        compared = compareProgram (void prog) <$> proc
    Hedgehog.assert (fromRight False compared)

I immediately get parsing errors which are all due to poor terms generation:

  1. since type-level and term-level lambdas are denoted the same way in our syntax, a type-level lambda generated at the term level (which should not happen) gets printed and parsed as a term-level lambda
  2. constants bounds are not checked, so we generate stuff like this:
Program (Version () 0 0 0) (Constant () (BuiltinInt () 1 (-9859937)))

which doesn't parse due to the out-of-bounds constant (maybe we really shouldn't do bounds checking during parsing).

There are also problems in

genName :: MonadGen m => m (Name ())
genName = Name () <$> name' <*> int'
    where int' = Unique <$> Gen.int (Range.linear 0 3000)
          name' = BSL.fromStrict <$> Gen.utf8 (Range.linear 1 20) Gen.lower

because we can generate names that have same uniques, but distinct string representations which violates one of our invariants. Besides, with such generation, terms where the same variable appearing twice are rather rare.

@michaelpj
Copy link
Contributor

Ah, looks like the same issue I hit in #301, that explains why it doesn't trigger on master.

@effectfully
Copy link
Contributor Author

Ah, looks like the same issue I hit in #301, that explains why it doesn't trigger on master.

Oh, I forgot about this one. Yes, indeed looks like an explanation.

since type-level and term-level lambdas are denoted the same way in our syntax, a type-level lambda generated at the term level (which should not happen) gets printed and parsed as a term-level lambda

Actually, no, I guess I misinterpreted test failures, because I do not think terms like that can be generated as the generators are well-typed.

@vmchale
Copy link
Contributor

vmchale commented Jan 5, 2019

We'd decided in a meeting early on that bounds checking should happen during parsing - though I suppose you could bring that particular point up again next week.

@vmchale
Copy link
Contributor

vmchale commented Jan 5, 2019

@effectfully do you have a seed for the second issue? I've tried running property tests several times, but I've yet to trigger

since type-level and term-level lambdas are denoted the same way in our syntax, a type-level lambda generated at the term level (which should not happen) gets printed and parsed as a term-level lambda

@effectfully
Copy link
Contributor Author

@effectfully do you have a seed for the second issue? I've tried running property tests several times, but I've yet to trigger

I wrote above:

Actually, no, I guess I misinterpreted test failures, because I do not think terms like that can be generated as the generators are well-typed.

I'm not sure if there are any other failures that are not related to out-of-bounds constants, but the second issue does seem like I misunderstood something and there is no issue there actually.

@vmchale
Copy link
Contributor

vmchale commented Jan 6, 2019

Ah okay.

@effectfully
Copy link
Contributor Author

effectfully commented Jan 11, 2019

Reopening, because still the case:

There are also problems in

genName :: MonadGen m => m (Name ())
genName = Name () <$> name' <*> int'
    where int' = Unique <$> Gen.int (Range.linear 0 3000)
          name' = BSL.fromStrict <$> Gen.utf8 (Range.linear 1 20) Gen.lower

because we can generate names that have same uniques, but distinct string representations which violates one of our invariants. Besides, with such generation, terms where the same variable appearing twice are rather rare.

@effectfully effectfully reopened this Jan 11, 2019
@vmchale
Copy link
Contributor

vmchale commented Jan 11, 2019

But... why is this a problem? We are just looking at the lexical structure, yes? This is a test primarily for the parser.

@effectfully
Copy link
Contributor Author

effectfully commented Jan 11, 2019

I played with this stuff and it's a funny situation: we generate silly terms that violate our invariants, but we never ever do anything with uniques: pretty-printing ignores them and compareTerm ignores them as well. So my original conjecture (taken from a slack chat)

So why does it work? Simply because we generate super boring terms

is false.

I'd say this is a mess: we violate invariants, generate uniques and ignore them and use custom equality checking functions. There should be something better than that.

@vmchale
Copy link
Contributor

vmchale commented Jan 11, 2019

Does it matter for parsing?

@effectfully
Copy link
Contributor Author

Does it matter for parsing?

Yes, because we generate uniques during parsing. By ignoring them we weaken our tests.

Besides, we use these generators in other tests as well. It's just a law of property-based testing that default generators have to produce non-trivial test cases. Our generators rarely produce terms with some variable appearing twice and this just makes our tests weaker. Regardless of the procedures we actually test, default generators must cover a variety of cases, because that's the whole point of property-based testing: cover as many cases as you can.

@vmchale
Copy link
Contributor

vmchale commented Jan 11, 2019

Yes, because we generate uniques during parsing.

The way the parser handles uniques is somewhat idiosyncratic and I can't think of a way to test its uniques aside from duplicating the function of some of the code test-side... which is not much of a test.

I'm open to other suggestions if you see a better way.

Besides, we use these generators in other tests as well.

Could we not use something else there? We don't use compareProgram in other places.

@effectfully
Copy link
Contributor Author

effectfully commented Jan 13, 2019

Just got this on master:

  parser round-trip:                            FAIL (0.47s)
      ✗ parser round-trip failed after 100 tests and 36 shrinks.

           ┃     │ -                                      (TyVar
           ┃     │ -                                         ()
           ┃     │ -                                         TyName
           ┃     │ -                                           { unTyName =
           ┃     │ -                                               Name
           ┃     │ -                                                 { nameAttribute = ()
           ┃     │ -                                                 , nameString = "all"
           ┃     │ -                                                 , nameUnique = Unique { unUnique = 0 }
           ┃     │ -                                                 }
           ┃     │ -                                           })

      <...>

           ┃     │ "(program 10.5.6\n  (abs\n    a\n    (type)\n    (error\n      (fun (lam a (type) (lam a (type) (fun (fun a (lam a (type) (lam a (type) all))) a))) a)\n    )\n  )\n)"

      <...>

           ┃     │ + Left
           ┃     │ +   (Unexpected
           ┃     │ +      LexKeyword { loc = AlexPn 130 6 80 , tkKeyword = KwAll })

        This failure can be reproduced by running:
        > recheck (Size 99) (Seed 16003949935550134140 13650782525337027413) parser round-trip
      
    Use '--hedgehog-replay "Size 99 Seed 16003949935550134140 13650782525337027413"' to reproduce.

I.e. it's an instance of #463.

@vmchale
Copy link
Contributor

vmchale commented Jan 14, 2019

Yes, those should definitely be filtered.

@effectfully
Copy link
Contributor Author

effectfully commented Nov 10, 2019

genName :: MonadGen m => m (Name ())
genName = Name () <$> name' <*> int'
    where int' = Unique <$> Gen.int (Range.linear 0 3000)
          name' = BSL.fromStrict <$> Gen.utf8 (Range.linear 1 20) Gen.lower

Besides, we use these generators in other tests as well. It's just a law of property-based testing that default generators have to produce non-trivial test cases. Our generators rarely produce terms with some variable appearing twice and this just makes our tests weaker. Regardless of the procedures we actually test, default generators must cover a variety of cases, because that's the whole point of property-based testing: cover as many cases as you can.

This nonsense is still there. @kwxm you did an amazing job in the compiler in the other project fixing issues with generators, do you happen to have time to devote it to making Plutus Core property tests sensible?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants