Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add AST annotations #276

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from
Draft

Add AST annotations #276

wants to merge 1 commit into from

Conversation

TravisCardwell
Copy link
Collaborator

@TravisCardwell TravisCardwell commented Nov 13, 2024

Issue: #256

This WIP implementation has one pass named Parsed and unit annotations. We probably do not want to merge without implementing realistic passes/annotations, so this PR is created as a draft.

Note that many tests currently fail because the fixtures have not been updated to include the annotations.

@edsko
Copy link
Collaborator

edsko commented Nov 14, 2024

After discussing this with @TravisCardwell , we decided to go a slightly different route, which would make the "infrastructure" (such as it is) more generally applicable to all phases in hs-bindgen:

-- | All passes in hs-bindgen (from parsing C to final generation)
data Pass = ...

-- closed!
type family Ann (p :: Pass) (a :: Star) where
  Ann Parsed a = AnnParsed a
  Ann p a = () -- All annotations of all types default to ()

type family AnnParsed a -- could be open or closed, doesn't matter
  
deriving stock instance (forall p. Show (Ann "Foo" p)) => Show (Foo p)

@TravisCardwell TravisCardwell force-pushed the ast-annotations branch 2 times, most recently from 2145d6a to 7fde632 Compare November 15, 2024 06:27
@TravisCardwell
Copy link
Collaborator Author

I am experimenting with the above new design.

I am using the following terminology. (Let me know if different terminology would be better.)

  • Each AST represents a different phase, and each phase may be implemented using multiple passes.
Details
-- | Annotation passes
--
-- @hs-bindgen@ generates Haskell bindings from C headers via a number of
-- abstract syntax tree (AST) data types.  Each AST represents a different
-- /phase/, and each phase may be implemented using multiple /passes/.  This sum
-- type defines the passes used across all phases.  Each pass should only be
-- used within a single phase.
--
-- Phases:
--
-- 1. __C AST__ ("HsBindgen.C.AST"), created from the low-level @libclang@ types
--   that are parsed from the C source
--     * Prefix: @C@
-- 2. __C IR__ ("HsBindgen.IR.AST")
--     * Prefix: @Ir@
-- 3. __Haskell AST__ ("HsBindgen.Hs.AST")
--     * Prefix: @Hs@
-- 4. __Backend Common__ ("HsBindgen.Backend.Common")
--     * Prefix: @Be@
-- 5. __Template Haskell Backend__ ("HsBindgen.Backend.TH")
--     * Prefix: @Th@
-- 6. __Preprocessor Backend__ ("HsBindgen.Backend.PP")
--     * Prefix: @Pp@
data Pass =
    -- | Haskell AST placeholder pass, to be removed when real passes are
    -- implemented (in any phase)
    HsPlaceholder

I think that pass annotation type families (like AnnParsed) need to be open because of imports. They need to be defined in the module that defines Ann while instances likely need to reference phase-specific data types. Using open type families allows us to define the type families in the common module and instances in the phase modules.

The Ann type family defines a default (Ann p a = ()), but it appears that it is not used.

Couldn't match type 'Ann.AnnHsPlaceholder "Newtype"' with '()'
Expected: Ann.Ann Ann.HsPlaceholder "Newtype"
  Actual: ()

I have to define the following instances for it to work, and that default definition can be removed.

type instance AnnHsPlaceholder "Newtype" = ()
type instance AnnHsPlaceholder "NewtypeField" = ()

Without the default definition, there is no need for the Ann closed type family. Furthermore, we would be able to define defaults for pass annotation type families if they were closed.

I am force-pushing the ast-annotations branch with the centralized Pass and Ann definition. There are two commits:

  1. The first commit defines the Pass sum type and annotation families. It defines a single HsPlaceholder pass that should be removed when real passes are implemented (in any phase).
  2. The second commit demonstrates/tests usage, implemented the same annotations as the preliminary implementation. This commit should not be merged.

I am also pushing an ast-annotations-closed branch that implements the same demonstration annotations using a closed type family.

I do not mean to be contrarian, but I think that defining annotations for each phase (when needed, separately) is a better design.

  • We do not have to worry about symbol collision across phases. One only has to consider symbols defined within the current phase when adding a new one.
  • Each type family can be closed, allowing the definition of default unit annotations.
  • When using a common Pass type, quantifying over passes across all phases does not feel clean when only a subset of the passes are relevant.
  • When defining the passes within a phase module, we can document them there.

In this case, neither of these branches should be merged, but the ast-annotations-closed branch can be used as reference when we need to implement passes/annotations.

@TravisCardwell
Copy link
Collaborator Author

I rebased the branches to work with the de Bruijn merge.

Note that I have not yet updated the documentation for Pass (in HsBindgen.Annotations). It documents all of the phases, but I am not sure about the distinction between HsBindgen.SHs.AST types and HsBindgen.Backend.Common types. Perhaps the HsBindgen.SHs.AST types will replace the HsBindgen.Backend.Common types, but they currently both exist.

@edsko
Copy link
Collaborator

edsko commented Nov 20, 2024

After discussing this once more, we realized that the one Ann type family to rule them all doesn't work, due to cyclic module dependencies. Therefore it's better to split them up

module HaskellAST where

-- | Passes specific to the "Haskell AST" phase
data Pass = P1 | .. 

type family Ann (p :: Pass) (s :: Symbol) where
  Ann P1 s = AnnP1 s
  Ann P2 s = AnnP2 s

type family AnnP1 (s :: Symbol) where
  AnnP1 s = ()

data T1 (p :: Pass) = MkT1 (Ann p "MkT1") ..

deriving stock instance ValidPass p => Show (T1 p)

class    (forall s. Show (Ann p s)) => ValidPass p 
instance (forall s. Show (Ann p s)) => ValidPass p 


-- could consider (not sure if necessary)

data SPass :: Pass -> STar where
  SP1 :: SPass P1
  ..

class KnownPass (p :: Pass) where
  knownPass :: Proxy p -> SPass p

-- and then add KnownPass as a superclass constraint on ValidPass

@TravisCardwell
Copy link
Collaborator Author

I get errors like the following:

Illegal type synonym family application ‘Ann pass s’ in instance:
  Show (Ann pass s)

I worked on it for a while, but I have not been able to find a way to get around this. orz

I pushed the building code using ForallAnn with the problematic type class (named AllAnnShow) commented out (in HsBindgen.Hs.AST). If we can get AllAnnShow to work, we should be able to remove ForallAnn and replace all ForallAnn constraints with AllAnnShow constraints, all within that module.

@TravisCardwell
Copy link
Collaborator Author

Haskell Unfolder Episode 2: quantified constraints teaches the workaround to the issue that had me stumped. Thank you!

Here is how I documented it in the commit:

-- Class alias to work around GHC limitation that type family synonym
-- applications cannot be used in quantified constraints
class    Show (Ann pass s) => ShowAnn pass s
instance Show (Ann pass s) => ShowAnn pass s

-- All annotations must have a 'Show' instance (quantified constraint)
class    (forall s. ShowAnn pass s) => AllAnnShow pass
instance (forall s. ShowAnn pass s) => AllAnnShow pass

TravisCardwell added a commit that referenced this pull request Dec 4, 2024
We were using a tuple, but that does not scale well when more data is
added.  I ran into this when working on annotations (#256, PR #276).  I
ran into it again when adding source information to support test
generation (#22).

The `Field` types are used by *both* `Struct`/`Record` and `Newtype`
types.

(Cherry-picked from `source-info` for experimentation)
TravisCardwell added a commit that referenced this pull request Dec 5, 2024
We were using a tuple, but that does not scale well when more data is
added.  I ran into this when working on annotations (#256, PR #276).  I
ran into it again when adding source information to support test
generation (#22).

The `Field` types are used by *both* `Struct`/`Record` and `Newtype`
types.
TravisCardwell added a commit that referenced this pull request Dec 9, 2024
We were using a tuple, but that does not scale well when more data is
added.  I ran into this when working on annotations (#256, PR #276).  I
ran into it again when adding source information to support test
generation (#22).

The `Field` types are used by *both* `Struct`/`Record` and `Newtype`
types.
TravisCardwell added a commit that referenced this pull request Dec 17, 2024
We were using a tuple, but that does not scale well when more data is
added.  I ran into this when working on annotations (#256, PR #276).  I
ran into it again when adding source information to support test
generation (#22).

The `Field` types are used by *both* `Struct`/`Record` and `Newtype`
types.
TravisCardwell added a commit that referenced this pull request Dec 18, 2024
We were using a tuple, but that does not scale well when more data is
added.  I ran into this when working on annotations (#256, PR #276).  I
ran into it again when adding source information to support test
generation (#22).

The `Field` types are used by *both* `Struct`/`Record` and `Newtype`
types.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants