Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Knowing what is (and isn't) defunctionalized can be confusing #429

Open
RyanGlScott opened this issue Dec 26, 2019 · 7 comments
Open

Knowing what is (and isn't) defunctionalized can be confusing #429

RyanGlScott opened this issue Dec 26, 2019 · 7 comments

Comments

@RyanGlScott
Copy link
Collaborator

While pondering the questions in #427 (comment), I realized that trying to document the existing status quo regarding what language constructs are OK to defunctionalize is much, much trickier than I thought. The main source of trickiness is that singletons has an inconsistent ruleset: whether something is defunctionalized or not depends on whether it is quoted (i.e., $(singletons [d| ... |])) or reified (i.e., $(genDefunSymbols [...])):

Quoted

Functions

Legal. Just define the function in quotes, e.g.,

$(singletons [d|
  foo :: Bool -> Bool
  foo = ...
  |])

And singletons will produce FooSym0 and FooSym1.

Classes

Legal, but only in the sense that quoted classes will produce defunctionalization symbols for the promoted class methods and associated type families, e.g.,

λ> ; $(singletons [d| class C a where type T a :: Bool; m :: a -> a |])
<interactive>:6:5-67: Splicing declarations
    singletons
      [d| class C_aejd a_aejg where
            type T_aeje a_aejg :: Bool
            m_aejf :: a_aejg -> a_aejg |]
  ======>
    class C_aejX a_aek0 where
      type T_aejY a_aek0 :: Bool
      m_aejZ :: a_aek0 -> a_aek0
    type TSym1 ...
    data TSym0 ...
    type MSym1 ...
    data MSym0 ...

Defunctionalization symbols for the class itself are not produced.

Class methods

Legal. See the "Classes" section.

Type families and synonyms

Legal. Quoted type families and synonyms are defunctionalized as you would expect. See also the "Classes" section for how associated type families are handled.

Data types

Legal, but only in the sense that quoted data types will produce defunctionalization symbols for the promoted record selectors and data constructors, e.g.,

λ> ; $(singletons [d| data D a = MkD { unD :: a } |])
<interactive>:8:5-49: Splicing declarations
    singletons [d| data D_aelz a_aelC = MkD_aelA {unD :: a_aelC} |]
  ======>
    data D_aema a_aemd = MkD_aemb {unD_aemc :: a_aemd}
    type UnDSym1 ...
    data UnDSym0 ...
    type MkDSym1 ...
    data MkDSym0 ...

Defunctionalization symbols for the data type itself are not produced.

Data constructors

Legal. See the "Data types" section.

Record selectors

Legal. See the "Data types" section.

Reified

Functions

Illegal. Attempting to use genDefunSymbols on a function name will throw an error:

λ> ; $(genDefunSymbols ['id])

<interactive>:9:5: error:
    Building defunctionalization symbols of values not supported

<interactive>:9:5: error: Q monad failure

Classes

Legal. Reified class names will get defunctionalization symbols, but its promoted class methods and associated type families will not be defunctionalized:

λ> class C a where type T a :: Bool; m :: a -> a
λ> ; $(genDefunSymbols [''C])
<interactive>:20:5-25: Splicing declarations
    genDefunSymbols [''C]
  ======>
    type CSym1 ...
    data CSym0 ...

Class methods

Illegal. Attempting to use genDefunSymbols on a class method name will throw an error:

λ> class C a where type T a :: Bool; m :: a -> a
λ> ; $(genDefunSymbols ['m])

<interactive>:22:5: error:
    Building defunctionalization symbols of values not supported

<interactive>:22:5: error: Q monad failure

Reifying the parent class name will not produce defunctionalization symbols for its promoted class methods either.

Type families and synonyms

Legal. Reified type families and synonyms are defunctionalized as you would expect. Associated type families are not automatically defunctionalized when the parent class name is reified, but one can just as well reify the associated type family name directly.

Data types

Legal, but only in the sense that reified data types will produce defunctionalization symbols for the data constructors, e.g.,

λ> data D a = MkD { unD :: a }
λ> ; $(genDefunSymbols [''D])
<interactive>:24:5-25: Splicing declarations
    genDefunSymbols [''D]
  ======>
    type MkDSym1 ...
    data MkDSym0 ...

Data constructors

Illegal. Attempting to use genDefunSymbols on a data constructor name will throw an error:

λ> data D a = MkD { unD :: a }
λ> ; $(genDefunSymbols ['MkD])

<interactive>:26:5: error:
    Building defunctionalization symbols of values not supported

<interactive>:26:5: error: Q monad failure

The only way to defunctionalize data constructor names through reification is to reify the parent data type.

Record selectors

Illegal. Attempting to use genDefunSymbols on a record selector name will throw an error:

λ> data D a = MkD { unD :: a }
λ> ; $(genDefunSymbols ['unD])

<interactive>:28:5: error:
    Building defunctionalization symbols of values not supported

<interactive>:28:5: error: Q monad failure

Reifying the parent data type name will not produce defunctionalization symbols for its promoted record selectors either.


There are multiple inconsistencies between quote-based defunctionalization and reification-based defunctionalization:

  1. Defunctionalizing the names of functions, class methods, data constructors, and record selectors is legal with quoting, but trying to do the same with straightforward reification is illegal.
  2. Quoted data types will produce defunctionalization symbols for its promoted record selectors, but reified data types will not.
  3. Defunctionalization for classes works in completely different ways depending on whether quoting or reification is used:
    • Quoted classes will not produce defunctionalization symbols for the class name itself, but will produce defunctionalization symbols for promoted class method names and associated type family names.
    • Reified classes will produce defunctionalization symbols for the class name itself, but will not produce defunctionalization symbols for promoted class method names and associated type family names.
@RyanGlScott
Copy link
Collaborator Author

RyanGlScott commented Dec 27, 2019

Question: what should we do to resolve inconsistencies (1)-(3) above? Here are my thoughts:

  1. My initial gut feeling is that using genDefunSymbols on the name of a function, class method, data constructor, or record selector should produce defunctionalization symbols for the promoted equivalent of that name. For instance, genDefunSymbols ['id] should produce IdSym0 and IdSym1.

    There is one potential sticking point with this idea. Recall that this is the code that gets generated for IdSym1:

    type IdSym1 (x :: a) = Id x

    This code assumes that the promoted Id type family already exists. However, if Id doesn't exist, then this code will error out. Therefore, allowing genDefunSymbols to be invoked on id offers a new way to shoot yourself in the foot.

    It is worth noting, however, that there is already a footgun of a similar variety on the quoting side. This code will also error out if Id does not exist:

    $(singletonsOnly [d| type Id x = x |])
    
  2. If the answer to (1) is "allow it", then I would advocate for having genDefunSymbols [''D] (where D is the name of a data type) produce defunctionalization symbols for D's promoted record selectors. Again, this is a footgun if any of the promoted record selector type families do not yet exist, but it is exactly the same footgun as in (1).

  3. This is a tricky one. Personally, I find it quite strange that genDefunSymbols generates defunctionalization symbols for class names. If C is the name of a class, then one doesn't need to generate CSym0 at all, since you can just as well use TyCon C to achieve the same thing. But alas, someone specifically asked for this feature, so it might be a bit reckless to change this convention.

    Having just written this, I wonder if we should explore an alternative design where genDefunSymbols doesn't exhibit the "bundling" behavior that it currently has. For example, given data D a = MkD { unD :: a }, then genDefunSymbols [''D] won't just generate DSym0 et al.. It will also bundle along MkDSym0 and UnDSym0 et al. Perhaps genDefunSymbols [''D] should simply generate DSym0 et al. and nothing else. If you want MkDSym0 et al., you will have to write genDefunSymbols ['MkD] explicitly, and similarly for UnDSym0 et al.

    As far as classes go, this would mean that genDefunSymbols [''C] would continue to produce defunctionalization symbols. If C had any associated type families or methods, you would have to call genDefunSymbols on them separately.

    The downside to this approach is that this code:

    $(genDefunSymbols [''D, 'MkD, 'unD])

    And this code:

    $(singletons [d| data D = MkD { unD :: a } |])

    Would no longer do the same thing, as the former would produce defunctionalization symbols for D, but the latter would not. Perhaps this inconsistency is acceptable, however, since this hypothetical version of genDefunSymbols is explicitly designed to work at a more fine-grained level than quoting is.

@goldfirere
Copy link
Owner

Thanks for this thorough analysis. Here is what I propose as a way forward -- but my opinion here is fairly weak, and I am happy with many possible directions (including simply documenting the current strange setup, which I don't believe is actively hurting anyone).

  • I think genDefunSymbols should blindly generate defunctionalization symbols for all the names it is given. It should not bundle. That has a very nice, simple specification. If that means that the generated code refers to some name that is not in scope, then so be it.

  • Perhaps it is worth adding genAllDefunSymbols, which generates defunctionalization symbols for all the names passed to it, and any names which would be exported if you put (..) after any of these names in an export list (ignoring the possibility that some of these names may be out of scope). That's also a specification users are likely to find intuitive. It means that classes include their methods and associated types, and datatypes include their constructors.

  • I think the quoting mechanisms should produce only the defunctionalization symbols they need -- just as they work now. Which ones are generated should be documented, of course.

These ideas broadly follow yours, but with the possibility that some find the bundling behavior useful.

@RyanGlScott
Copy link
Collaborator Author

RyanGlScott commented Jan 2, 2020

I like your suggested design of having both genDefunSymbols and genAllDefunSymbols. I wonder if it's worth adding another variant that generates defunctionalization symbol names using the same approach as quoting does? After all, my plan for implementing this idea was to add an extra argument to D.S.Promote.Defun.defunctionalize of type GenSymbolsDepth, where:

data GenSymbolsDepth
  = GenNameOnlyDefunSymbols
  | GenNonDataOrClassDefunSymbols
  | GenAllDefunSymbols

With this approach, genDefunSymbols becomes (roughly) an alias for defunctionalize GenNameOnlyDefunSymbols and genAllDefunSymbols becomes an alias for defunctionalize GenAllDefunSymbols. I use GenNonDataOrClassDefunSymbols when invoking defunctionalize from functions that deal with quoted declarations (such as promote and singletons), but there is no genNonDataOrClassDefunSymbols alias for defunctionalize GenNonDataOrClassDefunSymbols. Should there be?

@goldfirere
Copy link
Owner

I don't feel the need for exposing this much flexibility, per se. Is anyone asking for it? But I'm also not actively against exposing it.

@RyanGlScott
Copy link
Collaborator Author

I don't feel the need for exposing this much flexibility, per se. Is anyone asking for it?

It's hard to answer that question since this new design is still hypothetical. But it is worth noting that anyone who relies on the current behavior of $(genDefunSymbols [''D]) (given data D = D_1 | ... D_n) won't have a clear migration path in the new design, since neither the new genDefunSymbols nor genAllDefunSymbols will do quite the same thing. A user could conceivably write out each individual constructor to be defunctionalized (i.e., write $(genDefunSymbols [''D_1, ..., ''D_n]), but I could foresee this being annoying if n is large.

Of course, it could be the case that my worrying is completely unfounded and that no one will miss the current behavior of genDefunSymbols. But I don't know how to gauge how users would react to this idea.

@goldfirere
Copy link
Owner

Given that there are about 20 usages of genDefunSymbols in all of hackage (not counting singletons), I'm not very worried about the migration story. Futhermore, generating all symbols is rarely troublesome. In other words, do whatever is easiest.

RyanGlScott added a commit that referenced this issue Feb 23, 2020
This patch fleshes out some more details about what `singletons`
can and can't do in its `README`. The key changes are:

1. There is a new "Promotion and partial application" section that
   explains what defunctionalization is in some amount of detail.
   There is also a new subsection that explains the limitations of
   the `genDefunSymbols` function that were observed in #429.
2. The "Supported Haskell constructs" section has received some more
   love. Some Haskell features were inaccurately characterized
   (e.g., pattern signatures are really only partially supported),
   so I also reorganized some of the bullet points. I have also added
   a new bullet point for `ScopedTypeVariables` under the
   "Little to no support" section, as #433 reveals that promoting
   functions that rely on the behavior of `ScopedTypeVariables` is
   terribly fragile (and not easy to fix).
3. Lots of little formatting and grammar fixes to make the prose in
   the `README` flow better.

Note that this patch does _not_ fix either of #429 or #433—it just
documents the rather unsatisfying current state of affairs.
@RyanGlScott
Copy link
Collaborator Author

I briefly looked into implementing this, but it turned out to be much more work than I originally had expected. For now, I'll just document the various corner cases of genDefunSymbols in the README and leave this issue open as a reminder to implement the better design in #429 (comment) (or to just wait until UnsaturatedTypeFamilies is a thing and makes defunctionalization obsolete, I suppose). I've taken care of the documentation side of things in #441.

RyanGlScott added a commit that referenced this issue Feb 24, 2020
This patch fleshes out some more details about what `singletons`
can and can't do in its `README`. The key changes are:

1. There is a new "Promotion and partial application" section that
   explains what defunctionalization is in some amount of detail.
   There is also a new subsection that explains the limitations of
   the `genDefunSymbols` function that were observed in #429.
2. The "Supported Haskell constructs" section has received some more
   love. Some Haskell features were inaccurately characterized
   (e.g., pattern signatures are really only partially supported),
   so I also reorganized some of the bullet points. I have also added
   a new bullet point for `ScopedTypeVariables` under the
   "Little to no support" section, as #433 reveals that promoting
   functions that rely on the behavior of `ScopedTypeVariables` is
   terribly fragile (and not easy to fix).
3. Lots of little formatting and grammar fixes to make the prose in
   the `README` flow better.

Note that this patch does _not_ fix either of #429 or #433—it just
documents the rather unsatisfying current state of affairs.
RyanGlScott added a commit that referenced this issue Feb 24, 2020
)

This patch fleshes out some more details about what `singletons`
can and can't do in its `README`. The key changes are:

1. There is a new "Promotion and partial application" section that
   explains what defunctionalization is in some amount of detail.
   There is also a new subsection that explains the limitations of
   the `genDefunSymbols` function that were observed in #429.
2. The "Supported Haskell constructs" section has received some more
   love. Some Haskell features were inaccurately characterized
   (e.g., pattern signatures are really only partially supported),
   so I also reorganized some of the bullet points. I have also added
   a new bullet point for `ScopedTypeVariables` under the
   "Little to no support" section, as #433 reveals that promoting
   functions that rely on the behavior of `ScopedTypeVariables` is
   terribly fragile (and not easy to fix).
3. Lots of little formatting and grammar fixes to make the prose in
   the `README` flow better.

Note that this patch does _not_ fix either of #429 or #433—it just
documents the rather unsatisfying current state of affairs.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants