Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Case-of-case for UPLC where the inner case is an ifThenElse application #5337

Merged
merged 11 commits into from
Jun 15, 2023

Conversation

zliu41
Copy link
Member

@zliu41 zliu41 commented May 19, 2023

This achieves the same effect as simplifying Foo_match (ifThenElse x (Foo_Con_1...) (Foo_Con_2...)) b1 b2. It is implemented in UPLC because it is easier - we simply need to do the following:

case ((force ifThenElse) b (constr t) (constr f)) alts
==>
(force ifThenElse) b (case (constr t) alts) (case (constr f) alts)

while doing it in PIR involves the fuss of identifying constructors and destructors, dealing with type instantiations etc.

I went and did this, without first implementing the general case-of-case optimisation and making it a special case. The reason is that this is trivial to implement, while the general case-of-case is potentially much more complex as it may increase the program size if not done carefully.

@zliu41 zliu41 requested a review from a team May 19, 2023 15:56
into

@
force ifThenElse b (case (constr t) branches) (case (constr f) branches)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need force here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think so. ifThenElse is a polymorphic builtin needing a type argument.

Copy link
Contributor

@effectfully effectfully left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, but underspecified to me.

import PlutusCore.Quote

type Compiling m uni fun name =
(ToBuiltinMeaning uni fun, MonadQuote m, HasUnique name TermUnique, Ord name)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ord name is weird, do you know where it comes from?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some optimizations put names in sets and maps, e.g., FloatDelay.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those are supposed to use UniqueMap and we could add UniqueSet too. Hard for me to imagine why we'd prefer Map over a properly newtyped IntMap.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point.

where
go acc = \case
Apply ann fun arg -> go ((arg, ann) : acc) fun
t -> (t, acc)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@michaelpj seems like you're right and we should resurrect the View module.


-- | Make an Apply node from the given function and arguments.
mkApplication :: Term name uni fun a -> [Arg name uni fun a] -> Term name uni fun a
mkApplication = foldl' $ \acc (arg, ann) -> Apply ann acc arg
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is like

mkIterApp
    :: TermLike term tyname name uni fun
    => ann
    -> term ann -- ^ @f@
    -> [term ann] -- ^@[ x0 ... xn ]@
    -> term ann -- ^ @[f x0 ... xn ]@
mkIterApp ann = foldl' (apply ann)

except doesn't share a single annotation for application nodes. Maybe we should keep this function in MkPlc too? And maybe we should generally do something about annotations being shared/not shared. Like provide some kind of convenient API and allow the user to choose or something idk.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PIR counterpart is PlutusIR.Contexts.fillAppContext. Perhaps they should all use the TermLike API in MkPLC?

I don't think sharing annotation makes sense. splitApplication and mkApplication should round-trip.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think sharing annotation makes sense.

I think it does sometimes, but I agree that not in this case.


import Data.List (foldl')

type Arg name uni fun a = (Term name uni fun a, a)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather make it a data with two fields.

Comment on lines +74 to +76
caseOfCase' = case eqT @fun @DefaultFun of
Just Refl -> caseOfCase
Nothing -> id
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure how I feel about this. @michaelpj do you have an opinion?

, [cond, (true@Constr{}, trueAnn), (false@Constr{}, falseAnn)]
) <-
splitApplication scrut ->
mkApplication ite [cond, (Case ann true alts, trueAnn), (Case ann false alts, falseAnn)]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Such weird formatting. Is it fourmolu doing that?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. I can probably format it a bit better if I spend 30 seconds doing it but I'd really rather not.

caseOfCase = transformOf termSubterms $ \case
Case ann scrut alts
| ( ite@(Force _ (Builtin _ PLC.IfThenElse))
, [cond, (true@Constr{}, trueAnn), (false@Constr{}, falseAnn)]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait, so it works for ifThenElse cond nil (cons x xs) too (with nil and cons being SOP-based)? I guess it's even better, but is worth explicit discussion and the naming probably shouldn't be limited to true and false, as well as the specification above.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, as long as they are known constructors. The name true/false is because those are the true and false branches of ifThenElse.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name true/false is because those are the true and false branches of ifThenElse.

I just realized that they are in fact not, in the general and perfectly realistic case. Think of how not (if b then True else False) && c or whatever would look like after not is optimized, but before && is optimized (assuming both get inlined).

Also could you please add a test showing how not (not (if b then True else False)) gets optimized (with not inlined)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR is not about optimizing not . not. I think that requires the actual case-of-case optimization. This PR implements a special case of case-of-case, where the inner case is (builtin ifThenElse). This special case occurs very frequently, namely every time one writes something similar to if x < y then ... else .... not . not is not covered by this special case.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a test case involving not . not (compiledNotNot), and it currently isn't fully optimized (but that's not the goal of this PR). See not-not.uplc-readable.golden.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you only this optimization and case-of-known-constructor? Here's how interleaving those two would transform the not-not expression:

   not (not (if b then True else False))
~> not (if b then not True else not False)
~> not (if b then False else True)
~> if b then not False else not True
~> if b then True else False

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(that wasn't a response to your last message, which I hadn't seen up until after I wrote my comment)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it currently isn't fully optimized (but that's not the goal of this PR). See not-not.uplc-readable.golden.

Look like we should propagate force the same way this PR propagates case?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Look like we should propagate force the same way this PR propagates case?

I think we can do that - as long as all the delay branches are pure. That is a separate optimization though.

splitApplication = go []
where
go acc = \case
Apply ann fun arg -> go ((arg, ann) : acc) fun
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know it's horrible nitpicking, but it's counter-intuitive to me that in constructor arg comes after ann, but in the tuple it's the other way around.

true = Constr () 0 []
false = Constr () 1 []
alts = [mkConstant @Integer () 1, mkConstant @Integer () 2]
pure $ Case () (mkApplication ite [(Var () b, ()), (true, ()), (false, ())]) alts
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And we should test nil & cons too, plus maybe some other stuff.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I think they all look very similar in UPLC though, just the number of constr arguments (constr 0 x xs vs contr 0.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, but it's better to test different cases than only 0-ary one (in general tests should assume at little as possible about the concrete thing that they're supposed to test).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added caseOfCase3.

[ [ [ (force (builtin ifThenElse)) b_0 ] t_1 ] (constr 1) ]
(con integer 1)
(con integer 2)
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Worth a comment on why we don't transform

case (if b then t else False) of
     True  -> 1
     False -> 2

into

if b
   then case t of
        True  -> 1
        False -> 2
   else 2

I assume this is because we don't want to duplicate branches? (and what we do right now does duplicate them, but since all constants are now, the duplication get removed by a subsequent pass of case-of-known-constructor or whatever).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exactly. Case-of-case may increase program size unless the program can be subsequently simplified by case-of-known-constructor.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a comment:

-- Unless both branches are known constructors, the case-of-case transformation
-- may increase the program size.

@effectfully
Copy link
Contributor

Now do the same for Data 😅

@michaelpj
Copy link
Contributor

I'm sceptical of doing this in UPLC just because it's easier. We need COKC to make this work well... which is in PIR. And generally this is an optimization that is likely to unlock more optimization opportunities, so I think it's important to do it in PIR. So this seems like false economy to me - if we're just going to do it in PIR later, what's the point of this version?

@@ -119,7 +119,7 @@ data DefaultFun
| MkPairData
| MkNilData
| MkNilPairData
deriving stock (Show, Eq, Ord, Enum, Bounded, Generic, Ix)
deriving stock (Show, Eq, Ord, Enum, Bounded, Generic, Ix, Typeable)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is deriving Typeable redundant nowadays with newer GHC versions?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apparently yes!

@zliu41
Copy link
Member Author

zliu41 commented May 22, 2023

I'm sceptical of doing this in UPLC just because it's easier. We need COKC to make this work well... which is in PIR.

CaseReduce is COKC in UPLC, no?

@michaelpj
Copy link
Contributor

CaseReduce is COKC in UPLC, no?

It is if constructors are inlined, which they aren't currently.

@zliu41
Copy link
Member Author

zliu41 commented May 22, 2023

It is if constructors are inlined, which they aren't currently.

This is not done in PIR either, is it?

Case-of-case may also interact with CSE, which we also decided to do in UPLC. So I'm in favor of doing all these in UPLC.

@zliu41
Copy link
Member Author

zliu41 commented May 22, 2023

So this seems like false economy to me - if we're just going to do it in PIR later, what's the point of this version?

Well, I've already done it 🙂

Given your schedule, my schedule and everyone else's, I don't see the PIR version of this being implemented till the end of July. Even if the PIR version works better (we can always revert this if that's the case), I'd rather not delay delivering the improvements to the users till then!

@michaelpj
Copy link
Contributor

Constructors don't need to be inlined in PIR because we recognize them directly in COKC.

@effectfully
Copy link
Contributor

So I'm in favor of doing all these in UPLC.

Me too, and we could add it to PIR as well if we wanted to. It's a simple optimization, so there isn't much cost (we'll probably end up discussing the design, corner cases etc way longer than actually implementing anything, and design applies to UPLC and PIR). More importantly, UPLC is for everybody, PIR is only for the users of Plutus Tx, and as such there's value in providing that optimization for UPLC, even if we do it for PIR too.

@zliu41
Copy link
Member Author

zliu41 commented May 22, 2023

Constructors don't need to be inlined in PIR because we recognize them directly in COKC.

I'm confused. Do you mean constructors or constructor applications? Constructors can be recognized in PIR, yes, but they can also be recognized in UPLC - they are just Constr nodes. Constructor applications like Just 2 aren't inlined in either PIR or UPLC.

@michaelpj
Copy link
Contributor

I'm confused. Do you mean constructors or constructor applications? Constructors can be recognized in PIR, yes, but they can also be recognized in UPLC - they are just Constr nodes.

Even in UPLC there will be a binding for the constructor, whose body will contain Constr nodes. Unless you inline that binding, you won't see the Constr.

I haven't looked at your tests yet, but you should be able to see in some of the tests that come from PIR whether this is happening. Maybe we are sometimes inlining the constructor bindings... but if we don't, this optimization can't fire.

@zliu41
Copy link
Member Author

zliu41 commented May 22, 2023

Unless you inline that binding, you won't see the Constr.

The same applies to PIR: if you don't inline x = Just 2, COC or COKC won't fire.

Addressing this in UPLC is easier because it will be taken care of by inlining fully-saturated applications. E.g., Just in UPLC is \x -> Constr 0 [x], so Just 2 will be inlined. In PIR, it is actually harder to inline Just 2 - we'd need to add a special rule for inlining constructor applications.

@michaelpj
Copy link
Contributor

The same applies to PIR: if you don't inline x = Just 2, COC or COKC won't fire.

No, that's a binding of a constructor application. I'm talking about the binding corresponding to Just itself. e.g.

let datatype Maybe a = Just a | Nothing
in match_Maybe (Just 2) (\x -> x) 1

becomes

(\Just . ... match_Maybe (Just 2) (\x -> x) 1)
(\x -> constr 0 x)
...

in UPLC. You need to inline the constructor binding.

I think this would go much faster if we had a test case :)

@zliu41 zliu41 marked this pull request as draft June 5, 2023 20:56
@zliu41 zliu41 marked this pull request as ready for review June 14, 2023 20:39
@zliu41
Copy link
Member Author

zliu41 commented Jun 14, 2023

Going to merge this, since it is a strict improvement in terms of script sizes and costs. We can remove it once we implement a similar optimization in PIR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants