diff --git a/rfc/060-replacing-cabal-custom-build.md b/rfc/060-replacing-cabal-custom-build.md new file mode 100644 index 0000000..3c5e853 --- /dev/null +++ b/rfc/060-replacing-cabal-custom-build.md @@ -0,0 +1,2089 @@ +# RFC: Replacing the Cabal Custom build-type + +Adam Gundry, Matthew Pickering, Sam Derbyshire, Rodrigo Mesquita, Duncan Coutts (Well-Typed LLP) + +- [Abstract](#abstract) +- [Background](#background) + * [The current interface](#the-current-interface) + * [Why do packages use the `Custom` build-type?](#why-do-packages-use-the-custom-build-type) ++ [Problem Statement](#problem-statement) + * [How can we move away from the `Custom` build-type?](#how-can-we-move-away-from-the-custom-build-type) + * [Requirements](#requirements) + + [Integration with existing build systems](#integration-with-existing-build-systems) + * [Non-requirements](#non-requirements) +- [Prior art and related efforts](#prior-art-and-related-efforts) + * [Issues with `UserHooks`](#issues-with-userhooks) + * [`code-generators`](#code-generators) +- [High-level design of `build-type: Hooks`](#high-level-design-of-build-type-hooks) + * [`Hooks` from the package author's perspective](#hooks-from-the-package-author-s-perspective) + * [`Hooks` from the build tool's perspective](#hooks-from-the-build-tool-s-perspective) + * [Designing for future compatibility](#designing-for-future-compatibility) + * [Library API and versioning](#library-api-and-versioning) +- [Detailed design of `SetupHooks`](#detailed-design-of-setuphooks) + * [Phases](#phases) + * [Cabal configuration type hierarchy](#cabal-configuration-type-hierarchy) + * [Configuring and building](#configuring-and-building) + * [Configure hooks](#configure-hooks) + + [Phase separation](#phase-separation) + + [`LocalBuildConfig`](#localbuildconfig) + + [`ComponentDiff`](#componentdiff) + * [Build hooks](#build-hooks) + + [Post-build hooks](#post-build-hooks) + * [Install hooks](#install-hooks) +- [Pre-build hooks](#pre-build-hooks) + * [Motivation: fine-grained rules](#motivation-fine-grained-rules) + * [Motivation: a simplistic first design](#motivation-a-simplistic-first-design) + * [Proposed design of rules](#proposed-design-of-rules) + * [Dependency structure](#dependency-structure) + + [File dependencies](#file-dependencies) + + [Dynamic dependencies](#dynamic-dependencies) + * [Rule demand](#rule-demand) + * [Identifiers](#identifiers) + * [Rule monitors](#rule-monitors) + * [API overview](#api-overview) + * [Inputs to pre-build rules](#inputs-to-pre-build-rules) + * [Hooked preprocessors](#hooked-preprocessors) +- [Examples](#examples) + * [Generating modules](#generating-modules) + * [`./configure` style checks](#-configure-style-checks) + * [Doctests](#doctests) + * [Hooked programs](#hooked-programs) + * [Hooked preprocessors](#hooked-preprocessors-1) + * [executable-hash](#executable-hash) + * [Composing `SetupHooks`](#composing-setuphooks) +- [Alternatives](#alternatives) + * [Decoupling `Cabal-hooks`](#decoupling-Cabal-hooks) + * [Effects available to hooks](#effects-available-to-hooks) + * [Inputs and outputs available to hooks](#inputs-and-outputs-available-to-hooks) + * [Identifiers for fine-grained rules](#identifiers-for-fine-grained-rules) + * [Rules only depend on files](#rules-only-depend-on-files) + * [Let the build tool control all searching](#let-the-build-tool-control-all-searching) + * [Making other hooks fine-grained](#making-other-hooks-fine-grained) +- [Stakeholders](#stakeholders) +- [Success](#success) + * [Testing and 
migration](#testing-and-migration) +- [Future work](#future-work) + * [Hooks integration](#hooks-integration) +- [References](#references) + +## Abstract + +Every Cabal package can supply its own build system, in the form of a `Setup.hs` +program which implements a common command line interface. +Unfortunately, this makes it difficult to implement features in Cabal which +require the build tool to have complete control over the build system. +This turned out to be a major architectural design flaw because, in practice, +all packages use Cabal as their build system -- making the per-package build +system abstraction an artificial limitation. +We propose a way forward to lift this restriction, which will establish +foundations for improvements in tooling based on Cabal, and make Cabal +easier to maintain in the long term. + +The key obstacle to changing this design is the existence of the `Custom` +build-type, through which a package may supply an arbitrary `Setup.hs` file +implementing its build system. While the majority of packages do not use this +feature, a significant minority do, and thus we need to provide a migration path +that allows them to adapt to the new architecture. The primary technical content +of this proposal is a design for a new build-type that addresses the fundamental +design issues with the `Custom` build-type, while still allowing packages to +augment the build system according to their needs. + +This work is being carried out by Well-Typed LLP thanks to investment from the +Sovereign Tech Fund. (For more information, read our [blog post announcing the +project](https://www.well-typed.com/blog/2023/10/sovereign-tech-fund-invests-in-cabal/).) +It has previously been discussed at [Cabal issue #9292](https://github.com/haskell/cabal/issues/9292), +and [tech-proposals pull request #60](https://github.com/haskellfoundation/tech-proposals/pull/60). + +## Background + +A fundamental assumption of the existing Cabal architecture is that each package +supplies its own build system (provided by `Setup.hs`), with Cabal specifying the interface to that build +system, for instance, the behaviour of `./Setup.hs build` or `./Setup.hs install`. + +Modern projects consist of many packages. However, an aggregation of +per-package build systems makes it difficult or impossible to robustly implement +cross-cutting features (that is, build system features that apply to multiple +packages at once). This includes features in high demand such as: + +* fine-grained intra-package and inter-package build parallelism; +* rich multi-package IDE support; +* multi-package REPL. + +The fundamental assumption has turned out to be false: all modern Haskell +packages implement the build system interface using a standard implementation +provided by Cabal itself, albeit in some cases with minor customisations. Cabal +already provides a build system based on declarative configuration, and the +majority of packages use this `Simple` build-type. A minority use the `Custom` +build-type, which allows wholesale replacement of the build system, but in +practice is mostly used to make minor customisations of the standard +implementation. It is the existence of this `Custom` build-type which is holding +us back, but its flexibility is mostly unused. + +Thus the solution is to invert the design: instead of each package supplying its +own build system, there should be a single build system that supports many +packages. + +By way of example, consider an IDE backend like HLS. 
An IDE wants to *be* the +build system (and compiler) itself, not for any final artefacts but for the +interactive analysis of all the source code involved. It wants to prepare (i.e. +configure) all of the packages in the project in advance of +building any of them, and wants to find all the source files and compiler +configuration needed to compile the packages. This is incompatible with a set +of opaque per-package build systems, each of which is allowed to assume all +dependencies already exist, and which can only be commanded to build artefacts. + +The overall goal is the deprecation and eventual *removal of support for the +`Custom` build-type*. It is the removal of the `Custom` build-type that will enable +simplifications and easier maintenance in Cabal, and enable easier +implementation of new features. + + +### The current interface + +Today, each package can provide a `Setup.hs` script defining how it should +be built. +The Cabal specification dictates that a `Setup.hs` script must obey a [specific +interface](https://cabal.readthedocs.io/en/stable/setup-commands.html#setup-commands). +One can build an individual package by first compiling `Setup.hs` +and then invoking: + +``` +./Setup configure +./Setup build +``` + +Packages indicate how much customisation of the build process they require by +declaring which +[`build-type`](https://cabal.readthedocs.io/en/3.10/cabal-package.html#pkg-field-build-type) +they use. Currently there are four options: + +| Type | Description | +|-------------|------------------------------------------------------------------------------------------| +| `Simple` | Use the standard Cabal build system. Most packages use this option. | +| `Configure` | Run a `./configure` script to discover system configuration, then build using the standard build system. | +| `Make` | Invoke `make` to build the package using its own build system (obsolete). | +| `Custom` | Compile and run a custom `Setup.hs` script defining the package's build system. | + +The most flexible option used in practice is `build-type: Custom`. In this +case, the package can define an arbitrary `Setup.hs` script which implements the +command-line interface defined by the Cabal specification. That is, the program +must support being executed as `./Setup configure`, `./Setup build`, and so on, +but it can implement whatever logic it wants. + +In practice, almost all custom `Setup.hs` scripts depend on the `Cabal` library, +which allows for customisation of its build system by calling +`defaultMainWithHooks`, providing a value of the +[`UserHooks` datatype](https://hackage.haskell.org/package/Cabal-3.10.1.0/docs/Distribution-Simple-UserHooks.html). +However, nothing about the `Setup.hs` interface guarantees this, so build tools +must pessimistically assume that `Setup.hs` may do anything. + +This means that where a package in the build plan uses `Custom`, build tools +such as `cabal-install` must fall back on legacy code paths to compile it. This +causes various problems and requires hacky workarounds. In particular, the +`Setup.hs` interface works only with whole packages, and cannot easily be +changed, even though modern `cabal-install` versions support building individual +components independently (e.g. compiling a library and one selected test-suite +without compiling other test-suites in the same package). + + +### Why do packages use the `Custom` build-type? 
+ +We have been surveying existing Haskell packages that currently use the `Custom` +build-type, identifying where their requirements can be fulfilled with the +`Simple` build-type using existing declarative features, where new declarative +features would be useful, or where extending the build system is necessarily +required. The [survey report](https://github.com/well-typed/hooks-build-type/blob/main/survey.md) +describes packages we investigated in more detail. + +There are various reasons why packages currently use the `Custom` build-type, such as: + + * detecting system configuration; + + * generating source code for modules during a build (e.g. to make information + about the build environment available to the final executable); + + * adding build system features which are not natively supported by Cabal, such + as doctests (though it is debatable whether this is a good approach to + integrating doctests); + + * working around bugs in build tools; or + + * executing additional steps when a package is installed (e.g. the Agda + compiler is a normal Haskell package that needs to compile some Agda + libraries when it is installed). + +Over time, Cabal has gradually incorporated more features to allow some of these +use cases to be subsumed, typically by adding more declarative information to +the `.cabal` file format. For example, +[`pkgconfig-depends`](https://cabal.readthedocs.io/en/3.10/cabal-package.html#pkg-field-pkgconfig-depends) +reduces the need for packages to have custom configuration logic. Similarly, +`build-type: Configure` can be used to implement configuration logic in a more +targeted way than `build-type: Custom`. + +However, this necessarily captures only the more common use cases. There is a +"long tail" of packages using the `Custom` build-type for their own very specific +needs. + + +## Problem Statement + +This proposal aims to solve the problem that Cabal's architecture currently +prevents build tools from having complete control of the build system, and hence +limits their development and makes maintenance harder. + + +### How can we move away from the `Custom` build-type? + +We are engaging with package maintainers to remove the requirement for the +`Custom` build-type where better alternatives exist. For example, some packages +use the `Custom` build-type only to support doctests, and in these cases it is +often relatively easy to switch to the simple build-type and use a different +mechanism to run doctests. + +However, it is not feasible to simply migrate all existing uses of the `Custom` +build-type to use the simple build-type. Some packages still need to customise +the behaviour of the build system, such as running code during a configuration +step to gather information about the host system, where this dependency cannot +be expressed declaratively. We believe that **it should be possible to +augment the Cabal build process in unanticipated but controlled ways, to cover +parts of the build process that were not imagined or supported by the Cabal +maintainers**. + +Thus, for selected packages, replacing the `Custom` build-type will require a new +mechanism, which will permit controlled extensions +to the build system. This mechanism should be designed on the basis that the +build tool, not each individual package, is in control of the build system. +Moreover, the architecture needs to be flexible to accommodate future changes, +as new build system requirements are discovered. 
+ + +### Requirements + +The key requirements for the new mechanism are as follows: + + * It should provide an alternative to the `Custom` build-type for packages that + need to augment the Cabal build process, based on the principle that the + build system rather than each package is in overall control of the build. + + * It should support essentially all of the existing uses of the + `Custom` build-type. + + * It should minimise the effort required from package maintainers. While some + work to migrate away from the `Custom` build-type is inherently necessary, we + need the migration path to be straightforward. + + * It should integrate with existing build systems; see + [§ Integration with existing build systems](#integration-with-existing-build-systems). + + * It should make limited assumptions about how the build tool will operate, and + in particular should not *require* use of the `Setup.hs` interface. + + * It should learn lessons from previous efforts to migrate away from the `Custom` + build-type. + + * It should be open to future evolution of the design, as new requirements + become clear, rather than getting stuck in a local optimum where there is + again a difficult migration problem. + +A crucial goal of this work is making the `Cabal` library easier to maintain and +develop over the long term, but it should also have benefits for downstream +tools. In particular, Haskell build tools such as `cabal-install` and `stack` +will have more flexibility to implement their build systems, making it easier to +add features such as fine-grained cross-package build graphs (leading to more +parallelism for faster builds), more accurate file change detection, etc.. + +#### Integration with existing build systems + +The `Hooks` `build-type` should integrate with the following three different +ways of building `Cabal` packages: + + * Building Haskell packages with `cabal-install`. In this case, we want to + expose enough information to enable features such as per-module build graphs, + coordination of parallelism through `-jsem`, and multi-repl. + + * Building Haskell packages inside other build systems using the `Setup.hs` + interface. This is important as we don't want to break the workflow of + RPM/DEB distribution packagers, Nix packages, etc. + + * The Shake-like build system of the Haskell Language Server. Here, we want + HLS to be able to re-run actions on demand as part of an interactive + developer environment. + +### Non-requirements + +Although it is not possible or practical to address too many problems at once, +it is a goal to make the new API more evolvable than the old `UserHooks` +API. Thus we believe that it will be possible to adapt the design more easily to +future features and requirements. + +For example, `Cabal` does not currently have proper support for +cross-compilation, because it does not make a clear distinction between the build +and host. This means the `Custom` build-type currently leads to issues with +cross-compilation, and in the first instance, the new design may inherit the +same limitations. This is a bigger cross-cutting issue that needs its own +analysis and design. + + +## Prior art and related efforts + +The `Cabal` developers have long been aware of the limitations arising from +`build-type: Custom` (see for example [#2395 Proposal for a Cabal plugin API +(inversion of control)](https://github.com/haskell/cabal/issues/2395) and [#3065 +Lessons learned from Custom](https://github.com/haskell/cabal/issues/3065)). 
+Over time, there have been attempts to gradually move packages away from +`Custom`, in some cases by adding declarative features to `build-type: Simple` +instead, which is preferable where possible. + +Since the remaining "long tail" of packages have varied needs, however, we believe it is +better to design a more general mechanism for augmenting the build process rather than +many specific knobs, so that integrating the new mechanism into Cabal and other +build systems is more straightforward and general. We presume that any existing +usage of `Custom` that merely augments the build process is justified and valid, +and seek to provide an alternative build-type to which existing packages can be +directly migrated. +The introduction of the alternative build-type that captures existing `Custom` +extensions does not preclude the addition of declarative features that subsume +use cases for it. + +### Issues with `UserHooks` + +As noted, the `Cabal` library already provides a customisable build system in +the form of the `defaultMainWithHooks` function and the +[`UserHooks` datatype](https://hackage.haskell.org/package/Cabal-3.10.1.0/docs/Distribution-Simple-UserHooks.html). +Thus, it would be possible to define a build-type based on the package author +providing a value of the existing `UserHooks` datatype directly +(rather than providing a `Setup.hs` file which just happens to call `defaultMainWithHooks` on such a value). + +However, redesigning the hooks interface gives us the opportunity to learn +lessons from the original `UserHooks` design, and take into account the +intervening years of Cabal development and the resulting changes in +architectural details (such as support for multiple components). + +The existing `UserHooks` mechanism used for the `Custom` build-type presents a +few problems (see [#3600 Hook redesign](https://github.com/haskell/cabal/issues/3600) and other +[tickets labeled `Hooks`](https://github.com/haskell/cabal/issues?q=is%3Aissue+is%3Aopen+Hooks+label%3A%22Cabal%3A+hooks%22)): + +* It is too expressive, as it allows users to override entire build phases. This flexibility + turns the building of a package into a black box, which pessimises certain parts + of `cabal-install`. +* It is opaque, as all the customisation is encapsulated inside the `Setup` executable. + This means that build tools such as `cabal-install` or `HLS` have no way to + inspect which customisations have been made (this means for example that `HLS` + is not aware of any `hookedPreProcessors` declared by the user). +* It often isn't possible to perform the customisation you need using only pre/post hooks, as they aren't expressive enough. + This leads to users instead overriding the main phase, manually taking some pre/post steps + and propagating the information to/from the main build phase. +* Configuration customisation is not propagated: any modifications to the project configuration have + to be reapplied in each hook. This is quite unintuitive, as you would expect to only have + to apply an update to the results of configuration once, instead of needing to re-apply it + in every build phase. +* The interface is not component-aware. The hooks were designed before Cabal had support for multiple components, + so it's quite awkward to provide a hook which affects just one component. At best, it's possible + to handle the main library and executables, but there is no concept of named sublibraries or + of other component types such as testsuites or benchmarks in the `UserHooks` design.
+* The API is hard to change in a backwards-compatible way, and so it has become ossified. + +As will be made clear in later sections, our proposed `SetupHooks` type takes +these issues into account and provides a carefully thought out, cohesive +interface for augmenting the Cabal build process. + + +### `code-generators` + +The +[`code-generators`](https://cabal.readthedocs.io/en/3.10/cabal-package.html#pkg-field-test-suite-code-generators) +feature is intended for defining test suites with source code created via test +autodiscovery. It allows an executable to be invoked that is passed some options +describing the build configuration, and is expected to generate some modules in +response. + +This provides a `Cabal`-native alternative for some uses of the custom +build-type for test autodiscovery. Where such an alternative exists and is +suitable for the needs of a particular package, it is fine to make use of +it. Indeed, where it is possible to add declarative features to describe the +build configuration, those are likely to be preferable to executing arbitrary +custom code during the build. + +However, the declarative features supported by `Cabal` are necessarily limited, +and will not always meet the needs of package authors. For example, +`code-generators` cannot be used to generate modules in components other than +test suites, and it has access to only one set of GHC options, but in general +each component may have different options (see [Cabal issue +#9238](https://github.com/haskell/cabal/issues/9238)). + +Thus we do not think it is feasible to completely migrate away from the custom build-type +merely by adding features like `code-generators`, without also providing a more general, +uniform mechanism. + + +## High-level design of `build-type: Hooks` + +We propose to augment `Cabal` with a new build-type, `Hooks`. +To implement a package with the `Hooks` build-type, the user needs to provide +a `SetupHooks.hs` file which specifies the hooks using a Haskell API. + +The `Hooks` build-type represents a middle ground of customisation which permits +only *augmenting* the build process at specific points, while disallowing +the complete replacement of individual build phases. + + +### `Hooks` from the package author's perspective + +When `build-type: Hooks` is specified in the `.cabal` file, +the package author must supply a Haskell file named `SetupHooks.hs` that defines +a value `setupHooks :: SetupHooks`, which is a record of user-specified +hooks. This means that this interface is fundamentally a Haskell library interface, +not a command-line interface (unlike `build-type: Custom`, which simply specifies +a replacement for the `Setup.hs` CLI with no other means of interaction). + +A hook is a Haskell function, with a type such as `HookInputs -> IO HookOutputs` +where `HookInputs` and `HookOutputs` are types specific to the particular hook. +See [§ Effects available to hooks](#effects-available-to-hooks) for discussion +of the choice of the `IO` monad. + +Each hook is optional, e.g. each field of the `ConfigureHooks` datatype has a type +of the form `Maybe Hook`. This means that, when using the library interface +to hooks, the build tool can statically determine that there is no hook at that +particular stage, which might enable certain optimisations. + +See [§ Library API and versioning](#library-api-and-versioning) regarding which +Haskell library defines the `SetupHooks` datatype and any types describing the +inputs and outputs of particular hooks.
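For concreteness, a minimal `SetupHooks.hs` might look like the following sketch. The type and field names are those described in this proposal; the empty-hooks values (`noSetupHooks`, `noConfigureHooks`) and the exact module providing them are assumptions based on the prototype implementation.

```haskell
-- SetupHooks.hs: a minimal sketch. Names such as 'noSetupHooks' and
-- 'noConfigureHooks', and the module providing them, are assumed from the
-- prototype hooks API; the record fields are those described in this proposal.
module SetupHooks ( setupHooks ) where

import Distribution.Simple.SetupHooks

setupHooks :: SetupHooks
setupHooks =
  noSetupHooks
    { configureHooks =
        noConfigureHooks
          { preConfComponentHook = Just preConfComponent }
    }

-- A do-nothing per-component configure hook: it returns an empty diff,
-- requesting no changes to the component. A real hook would compute a
-- non-trivial 'ComponentDiff' here.
preConfComponent :: PreConfComponentInputs -> IO PreConfComponentOutputs
preConfComponent (PreConfComponentInputs { component = comp }) =
  pure (PreConfComponentOutputs { componentDiff = emptyComponentDiff (componentName comp) })
```

Note the use of record updates on an empty hooks value rather than positional construction; this is the usage style that lets fields be added to `SetupHooks` later without breaking existing packages.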
+ +A `.cabal` file using `build-type: Hooks` must include a `custom-setup` stanza +with a `setup-depends` field describing the build dependencies of +`SetupHooks.hs` (just as with the `Custom` build-type). + + +### `Hooks` from the build tool's perspective + +Since the API is expressed using a Haskell library, rather than a CLI, build +tools such as `cabal-install` have a choice of implementation techniques for how +they execute the hooks. + +In order to maintain backwards compatibility with build systems which solely use +the `./Setup.hs` interface (such as `nixpkgs` and Linux distribution packagers), +`Cabal` will provide a function `defaultMainWithSetupHooks :: SetupHooks -> IO ()` +which will ensure that the hooks are invoked in the correct place during the +normal build pipeline. Then the source distribution of a package using the +`Hooks` build-type will contain an automatically-generated shim `Setup.hs` file +of the following form: + +```haskell +import Distribution.Simple ( defaultMainWithSetupHooks ) +import SetupHooks ( setupHooks ) + +main = defaultMainWithSetupHooks setupHooks +``` + +The build tool can compile the shim `Setup.hs` and run the traditional CLI +commands such as `./Setup configure` and `./Setup build`. This does not realise +the full benefits of the new build-type, but it means that the change is completely +transparent to existing tools, as they can continue to use the `Setup` interface +without any modifications. + +With `Hooks`, however, build tools will be able to use alternative +implementation techniques for executing the hooks, rather than being forced to +go through the `Setup.hs` interface: + +* The build tool could define a different IPC interface that can invoke + individual hooks selectively. It could then compile a "hooks executable" that + exposes `SetupHooks.hs` via this interface. This will allow finer-grained + build plans, as is described in [§ Future work](#future-work). + +* The details of such IPC interfaces are under the control + of the build tool, not dictated by the specification (e.g. the hooks + executable could provide a command-line interface for each hook where hook + inputs/outputs are serialised, or it could be started as a child process and + communicate over a pipe). + +* We could even imagine the build tool dynamically loading `SetupHooks.hs` into an + existing process so that it can invoke hooks directly with minimal overhead. + +Crucially, hooks are independent, in the sense that each can be invoked +separately however the build tool arranges to do so. + +See [§ Hooks integration](#hooks-integration) for further details concerning +proposed future work integrating the proposed `Hooks` build-type with build +tools such as `cabal-install` or `HLS`. + +### Designing for future compatibility + +The design strategy for the hooks API should encourage clients to use it in ways +that are unlikely to break when it is evolved in the future. This includes: + + * offering high-level APIs and reusable utilities for common operations where + possible, rather than requiring clients to depend on the details of + particular types; + + * using field names rather than positional constructors; + + * using a distinct record type for each hook's inputs and outputs, so that + fields can be added to the record in the future; + + * hiding the underlying data constructors and providing smart constructors + instead. 
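Putting the `.cabal`-side requirements together, a package opting into the new build-type might contain a fragment like the following (the version bounds are purely illustrative; the `Cabal-hooks` package named in `setup-depends` is introduced in the next section):

```
build-type: Hooks

custom-setup
  setup-depends:
    base        >= 4.18 && < 5,
    Cabal-hooks >= 0.1  && < 0.2
```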
+ +### Library API and versioning + +It is important to be clear what future compatibility guarantees are offered by +`Cabal` and/or build tools to packages using `Hooks`. There is a tension here, +because we do not want package maintainers to be over-burdened with continual +changes to support newer versions of the hooks API, but neither do we want +`Cabal` maintainers to be over-burdened by the costs of providing backwards +compatibility. + +We propose to introduce a new library, `Cabal-hooks`. A package using the +`Hooks` build-type must declare a dependency on the +`Cabal-hooks` library in the `setup-depends` field of their package. +The range of `Cabal-hooks` library versions declared in `setup-depends` +indicates the versions of the hooks API that the package supports. + +The requirement for such a dependency codifies existing practice. Indeed, +while, in theory, a package using `build-type: Custom` can implement its `Setup` +script without depending on `Cabal`, we saw that this flexibility was unused in +practice, as `Setup` scripts end up being defined in terms of `UserHooks`. +This usage pattern incurs a corresponding dependency on the `Cabal` library in +`setup-depends`, in much the same way as we propose here for `Cabal-hooks`. +An additional benefit of the separate `Cabal-hooks` library is that it makes it +possible to evolve the Hooks API without requiring a version bump of the +`Cabal` library. + +In practice, we expect the initial versions of `Cabal-hooks` to mostly +re-export `Cabal` datatypes, as it is these types (such as `LocalBuildInfo`) +that get passed back-and-forth between the build system and the hooks in our +current design (see e.g. [§ Configure hooks](#configure-hooks)). +This design choice does introduce some coupling between the versions of +`Cabal-hooks` and `Cabal` (but see [§ Decoupling `Cabal-hooks`](#decoupling-Cabal-hooks)). +At any rate, this design makes the situation no worse than with `Custom` +(because a shim `Setup.hs` can still always be used to compile `SetupHooks.hs` +using an older version of `Cabal-hooks`), but it gives more options to the build tool, +e.g. where serialisation of hook inputs/outputs is used, the serialisation +format can be controlled by the build tool, and is not necessarily fixed by `Cabal-hooks`. +Indeed, we could imagine a build tool being compiled against multiple `Cabal-hooks` +versions. +The version compatibility problem exists in +`cabal-install` already: even where communication happens via the `Setup.hs` +command line interface, there is already a need for `cabal-install` to adapt to +the command-line flags that are supported by the version of `Cabal` in use (see +e.g. [`filterConfigureFlags`](https://hackage.haskell.org/package/cabal-install-3.10.2.1/docs/src/Distribution.Client.Setup.html#filterConfigureFlags)). +Once this proposal removes the need for `cabal-install` to go through the +`Setup.hs` interface, there is a potential for a significant reduction in +complexity here. + +## Detailed design of `SetupHooks` + +Having described `build-type: Hooks` in the previous section, the remaining part +of the design process is to work out the specific interfaces for the individual +hooks as expressed by the `SetupHooks` type. + +We want to arrive at a design by the following means: + +* The consideration of the existing usage of `Setup.hs` scripts to guide + what hooks should be able to do. +* The needs of the rest of the Haskell ecosystem, in particular the Haskell + Language Server. 
+* A high-level understanding of what the build process of a package should + look like, taking into account concerns such as parallelisability. + +These viewpoints can inform each other about the precise details for the design. + +As part of the design process we have been developing a [prototype +implementation of this design](https://github.com/mpickering/cabal/tree/wip/setup-hooks) +in the `Cabal` library. + +This section unavoidably relies on a deeper understanding of the `Cabal` build +system than the previous sections. + +### Phases + +The `Cabal` build process defines various phases that package authors should +be allowed to customise in some way: + + * The *configure phase* is when decisions are made about how to perform the + subsequent phases (e.g. which tools and options to use). This + may involve running arbitrary code to detect information about the host + system. + + * The *build phase* is when the project is compiled (including for the REPL) + and build artifacts are generated (including libraries, executables and other + artifacts such as Haddock documentation). + + * The *install phase* is when build artifacts are moved from the build directory + to the final installed location or installation image (e.g. for subsequent packaging). + +We propose to extend these three phases, with the following high-level structure +for `SetupHooks`: + +```haskell +data SetupHooks = SetupHooks + { configureHooks :: ConfigureHooks + , buildHooks :: BuildHooks + , installHooks :: InstallHooks + } +``` + +See [§ Configure hooks](#configure-hooks), [§ Build hooks](#build-hooks) +and [§ Install hooks](#install-hooks). + +Unlike the old `UserHooks` datatype, there is deliberately no way for the +package to remove or replace existing phases wholesale (such as replacing the +`buildHook`), and it is not possible to change the behaviour of operations +such as tests, benchmarks and cleanup. Nor is it possible to add entirely +new phases, because which phases are available is +determined by the overall design of the build system, not the individual package. + +### Cabal configuration type hierarchy + +Cabal has a number of datatypes which are used to store the result of +configuration. We will briefly describe them here before getting into +the precise design of the hooks. + +* `LocalBuildInfo`: the whole result of the configure phase. +* `GenericPackageDescription`: the parsed version of a `.cabal` file. +* `PackageDescription`: a resolved `GenericPackageDescription`, flattened relative to a flag assignment + (see [`Distribution.Types.PackageDescription`](https://hackage.haskell.org/package/Cabal-syntax-3.10.2.0/docs/Distribution-Types-PackageDescription.html)), may contain several `Component`s. +* `Component`: a sum type which captures the possible different component types such as `Library`, `Executable`, etc.. +* `BuildInfo`: the shared part of a component which describes the options which will + be used to build it. +* `ComponentLocalBuildInfo`: additional information which `Cabal` knows about a component which is not present in `Component`. + This is usually pieced together from the parent `LocalBuildInfo` and the individual `BuildInfo` of the `Component`. +* `ConfigFlags`/`BuildFlags`/`HaddockFlags`: flags to `./Setup configure`, `./Setup build`, `./Setup haddock`, etc.. +* `TargetInfo`: all the information necessary to build a specific target (combination of a `Component` and `ComponentLocalBuildInfo`). 
+ +### Configuring and building + +All decisions about *how to build* a project should be made in the configuration +phase. Hooks during the build phase should not (re)calculate options, but +should only be used for actually *building* the package. +Therefore, it is the hooks to the configuration phases which have the ability to +augment the build environment with additional settings. The hooks for other +phases will receive this configuration in their inputs, and must honour it. +See [§ Phase separation](#phase-separation). + +Configuration happens at two levels: + + * global configuration covers the entire package, + * local configuration covers a single component. + +Once the global package configuration is done, all hooks should work on a +per-component level. This avoids introducing additional synchronisation points +in a build that would limit the amount of available parallelism. + +### Configure hooks + +We propose to add the following configure hooks: + +```haskell +type PreConfPackageHook = PreConfPackageInputs -> IO PreConfPackageOutputs +type PostConfPackageHook = PostConfPackageInputs -> IO () +type PreConfComponentHook = PreConfComponentInputs -> IO PreConfComponentOutputs + +data PreConfPackageInputs + = PreConfPackageInputs + { configFlags :: ConfigFlags + , localBuildConfig :: LocalBuildConfig + , compiler :: Compiler + , platform :: Platform + } + +data PreConfPackageOutputs + = PreConfPackageOutputs + { buildOptions :: BuildOptions + , extraConfiguredProgs :: ConfiguredProgs + } + +data PostConfPackageInputs + = PostConfPackageInputs + { localBuildConfig :: LocalBuildConfig + , packageBuildDescr :: PackageBuildDescr + } + +data PreConfComponentInputs + = PreConfComponentInputs + { localBuildConfig :: LocalBuildConfig + , packageBuildDescr :: PackageBuildDescr + , component :: Component + } + +data PreConfComponentOutputs + = PreConfComponentOutputs + { componentDiff :: ComponentDiff } + +data ConfigureHooks + = ConfigureHooks + { preConfPackageHook :: Maybe PreConfPackageHook + , postConfPackageHook :: Maybe PostConfPackageHook + , preConfComponentHook :: Maybe PreConfComponentHook + } +``` + +From the build tool's perspective, the global configuration phase goes as follows: + +- Firstly, decide on the initial global configuration for a package, + producing a `LocalBuildConfig`. + +- Run the `preConfPackageHook`, which has the opportunity to modify the + initially decided global configuration (with a `BuildOptions` that overrides + those stored in the passed in `LocalBuildConfig`, and `ConfiguredProgs` that + get added to the `ProgramDb`). + After this point, the `LocalBuildConfig` can no longer be modified. + +- Use the `LocalBuildConfig` in order to perform the global package + configuration. This produces a `PackageBuildDescr` containing the information + Cabal determines after performing package-wide configuration of a package, + before doing any per-component configuration. + +- Run the `postConfPackageHook`, which can inspect but not modify the result of + the global configuration. This can be used to propagate custom package-wide + logic to the subsequent per-component configure hook (and is used for + example to re-implement the `Configure` `build-type`). + +After the global configuration has completed, individual components can be +configured independently, as follows: + +- Run the `preConfComponentHook`. This is the only means to apply specific + options to a `Component`. 
+ +- Use the modified `Component` to perform per-component configuration and create + the `ComponentLocalBuildInfo`. + +#### Phase separation + +Only configure hooks can make changes to the `PackageDescription`. Once +configuration is finished, the package description should be set in stone, and +subsequent hooks such as build hooks are not able to modify it. +This differs from `UserHooks`, where modifications to the project configuration +in the configure phase are not propagated to the other phases, and instead +subsequent hooks must re-apply changes, in the form of `HookedBuildInfo`. +This old design led to several subtle bugs and maintenance headaches which +this new design will allow us to get rid of, once we remove support for +the `Custom` build-type. + +The configuration hooks thus follow a simple philosophy: + +* All modifications to global package options must use `preConfPackageHook`. +* All modifications to component configuration options must use `preConfComponentHook`. + +If a hook modifies the options in these phases, then the configuration is propagated +into all subsequent phases, and the design of the interface ensures that +this is the only point where hooks can modify the options. + +It is only the pre-configuration hooks which allow modification of the options. +This is because the configuration process computes some more complicated data +structures from these initial inputs. If hooks were allowed to modify the results +of configuration then it would be error-prone to ensure that they suitably updated +both the options in question and the generated configuration. +For example, both `PackageDescription` and `ComponentLocalBuildInfo` contain +a list of exposed modules for the library. +This is why the "post" configuration hook (and any hooks subsequent to the +configure phase) can only run an `IO` action; they can't return any modifications +that would affect the `PackageDescription`. + +#### `LocalBuildConfig` + +There are parts of the `LocalBuildInfo` which must be decided at a global +(per-package) level; for instance, whether to build dynamic libraries. +On the other hand, there are also things we want to decide on a local +(per-component) level, such as specific GHC options with which to compile the +component. + +Moreover, there are parts of the `LocalBuildInfo` which hooks cannot modify. +For example, things such as package dependencies can't be modified because they are +determined externally by the overall build plan (e.g. from the dependency solver). +Thus, the hooks interface should prevent the modification of these +parts of `LocalBuildInfo`. + +We propose to achieve this by defining a new type `LocalBuildConfig` which +contains only the parts of the existing `LocalBuildInfo` datatype that can be +modified by `preConfPackageHook`. + +#### `ComponentDiff` + +The `ComponentDiff` records the modifications that should be applied to each component. + +For each component, `preConfComponentHook` is run, returning a `ComponentDiff`. +This `ComponentDiff` is applied to its corresponding `Component` +by monoidally combining together the fields. + +```haskell +newtype ComponentDiff = ComponentDiff { componentDiff :: Component } + +emptyComponentDiff :: ComponentName -> ComponentDiff +``` + +The diff is represented by a `Component`; not all fields of a `Component` are +allowed to be modified, and when the diff is applied it is dynamically checked +that the hook has not modified any fields which it shouldn't. + +Some alternative designs: + + 1.
Specify a Haskell function `Component -> Component` which can modify + the component at will. + 2. Define a custom `ComponentDiff` datatype which contains only the fields + of a `Component` which we allow hooks to modify. + +The benefit of (2) is that it trims down the amount of internal detail exposed +from `Cabal`, making it less likely that an internal change in Cabal would end up +breaking the `Hooks` defined by package authors. However, one would need to +ensure this interface is general enough in order to avoid locking out Hooks +authors, e.g. if `Cabal` adds a new field to `Component` without updating the +corresponding `ComponentDiff` type in order to make it modifiable by hook authors. +If we end up with a design in which `Cabal`'s version of the `Component` type +is necessarily separate from the type in the hooks API, we may want to reconsider +this alternative. + +### Build hooks + +The design of the pre-build hooks has generated significant discussion during review. +There are several trade-offs. The initial proposal was essentially a port of the +build hooks in the old `UserHooks`, but updated to the per-component world. +This included monolithic pre and post hooks for each component, plus the existing +"hooked pre-processors" abstraction. This had the advantage that it would be easy +for package authors to port their `Setup.hs` scripts to the new design, and it was +a relatively minimal change in the Cabal codebase. +Many Cabal contributors share a long-term goal to move the Cabal design towards +one based on a build graph with fine-grained dependencies. From this perspective, +the critique was that the initial proposal was too conservative a change, and that +we should use this opportunity of making a significant API change to establish a +new API that would not hold back the move towards finer-grained dependencies. +Another critique was that the original `UserHooks` design was somewhat ad-hoc, +since it used both monolithic hooks and hooked pre-processors to provide +finer-grained dependencies for a modest subset of use cases. +On the other hand, there is a very large design space for finer-grained +dependencies, and so picking a point in the design space is not simple. +Another disadvantage is that it will of course be more work for package authors +to port their existing `Setup.hs` scripts, which currently use monolithic hooks. + +The proposed design for pre-build hooks tries to balance these trade-offs. +Instead of an ad-hoc combination of monolithic hooks and hooked pre-processors, +we use a single general system of rules, but we take a relatively conservative +approach to the expressive power of the rules. In particular, the style of the +rules is relatively low-level. For example, it does not include "rule patterns" +such as generating a `*.hs` from a `*.y`. Instead, each rule specifies the +individual files involved as inputs and outputs. It should nevertheless be +possible to build higher-level patterns on top, using Haskell's usual powers of +abstraction to generate the lower-level rules. Crucially, the design allows the +rules to be used across an IPC interface, which is necessary for build tools +like `cabal-install` or HLS to be able to interrogate and invoke them +(see e.g. the future work discussed in [§ Hooks integration](#hooks-integration)). + +The full details of the design of pre-build hooks are provided in +[§ Pre-build hooks](#pre-build-hooks).
+ +On top of pre-build hooks, we also propose a limited notion of post-build hooks, +which accommodates package authors who need to perform an `IO` action in order +to modify executables after they are built, as described in +[§ Post-build hooks](#post-build-hooks). + +To summarise, we propose two different kinds of build hooks: + +```haskell +-- | Build-time hooks. +data BuildHooks + = BuildHooks + { preBuildComponentRules :: Maybe PreBuildComponentRules + , postBuildComponentHook :: Maybe PostBuildComponentHook } +``` + +Build hooks cannot change the configuration of the package. +There are deliberately no package-level build hooks, only component-level hooks. +This avoids introducing unnecessary synchronisation points when multiple +packages/components are being built in parallel. + +#### Post-build hooks + +Post-build hooks cover a simple use case: performing an `IO` action after an +executable has been built. + +This functionality gives package authors a way to modify an executable after +it has been built, which is useful if one wants to: + + - inject data into an executable after it has been built + (see for example [§ executable-hash](#executable-hash)), + - strip an executable with an external tool, + - perform code signing on an executable, e.g. using `xattr`. + +Post-build hooks are run after the normal build phase completes. This means that +a tool such as HLS would never run them, as in a sense HLS never finishes +building. Note however that, were HLS to support running test-suites, it would +run the post-build hooks for a test-suite right after building it, before +running it. + +We propose the following simple API for post-build hooks: + +```haskell +data PostBuildComponentInputs + = PostBuildComponentInputs + { buildFlags :: BuildFlags + , localBuildInfo :: LocalBuildInfo + , targetInfo :: TargetInfo + } + +type PostBuildComponentHook = PostBuildComponentInputs -> IO () +``` + +Note that this is a single monolithic step that would simply be re-run +any time the `build` action is re-run. + +### Install hooks + +The `install` hooks allow package authors to run an extra `IO` action +when copying or installing a package: + +```haskell +data InstallComponentInputs + = InstallComponentInputs + { localBuildInfo :: LocalBuildInfo + , copyFlags :: CopyFlags + , targetInfo :: TargetInfo + } + +type InstallComponentHook = InstallComponentInputs -> IO () + +data InstallHooks + = InstallHooks + { installComponentHook :: Maybe InstallComponentHook + } +``` + +The install hooks can be used to install files per-component. +The main use case for install hooks is when the set of things you want to install is +not fixed and predetermined. One example is Agda, which wants to run the built +`Agda` executable on the associated standard library `.agda` modules in order +to generate `.agdai` interface files for them. These should then be installed +alongside the `Agda` executable. +This allows users to obtain a functional Agda compiler by using the single +invocation `cabal install Agda`. This also means that we can use +`build-tool-depends: Agda` in other projects. + +We could imagine a more declarative way of specifying this being introduced in +the future, in which case packages will be free to migrate to it gradually. +There is not necessarily a problem with having some overlap between hooks and +declarative features.
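To illustrate the expected shape of such a hook, here is a sketch of an install hook that copies one extra (hypothetical) file, assumed to have been produced earlier in the build, into the component's data directory. The record fields are those proposed above; the helpers used (`absoluteInstallDirs`, `installOrdinaryFile`, and so on) are existing `Cabal` utilities, which we assume remain available to hook authors, and the hooks API module name is an assumption based on the prototype.

```haskell
import Distribution.Simple.SetupHooks            -- assumed hooks API module (see lead-in)
import Distribution.Simple.InstallDirs (CopyDest (..), datadir)
import Distribution.Simple.LocalBuildInfo (absoluteInstallDirs, localPkgDescr)
import Distribution.Simple.Setup (copyDest, copyVerbosity, fromFlagOrDefault)
import Distribution.Simple.Utils (createDirectoryIfMissingVerbose, installOrdinaryFile)
import Distribution.Verbosity (normal)
import System.FilePath ((</>))

-- A sketch only: "extra-data.txt" is a hypothetical file produced earlier in
-- the build; a real hook (e.g. Agda's) would install whatever artefacts it
-- actually generated.
installExtraData :: InstallComponentInputs -> IO ()
installExtraData (InstallComponentInputs { localBuildInfo = lbi, copyFlags = flags }) = do
  let verbosity = fromFlagOrDefault normal (copyVerbosity flags)
      dest      = fromFlagOrDefault NoCopyDest (copyDest flags)
      dataDir   = datadir (absoluteInstallDirs (localPkgDescr lbi) lbi dest)
  createDirectoryIfMissingVerbose verbosity True dataDir
  installOrdinaryFile verbosity "extra-data.txt" (dataDir </> "extra-data.txt")
```

Such a hook would then be registered in the `installComponentHook` field of `InstallHooks`.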
+ +An alternative approach would be to regard as illegitimate any use cases which +treat `Cabal` as a packaging and distribution mechanism for executables, and on +that basis, cease to provide install hooks. We do not follow this approach because +it would block maintainers of packages that rely on this behaviour +(e.g. Agda and Darcs) from migrating, for a relatively small reduction in +complexity in `Cabal`. + +It is important that these install hooks are consistently run both when copying +and when installing, as this fixes the inconsistency noted in +[Cabal issue #709](https://github.com/haskell/cabal/issues/709). +There is no separate notion of a "copy hook", because "copy" and "install" +are not distinct build phases. + +## Pre-build hooks + +The pre-build hooks consist of a collection of fine-grained build rules. +These are run before building a particular component of a package. + +### Motivation: fine-grained rules + +Suppose that Cabal did not have built-in support for `happy`; then a +package making use of it might like to write a rule like this: + +``` +lib:my-component:module:Foo.Bar : src:blah/foo/bar.y + ${happy:exe:happy} ${input[0]} -o ${output[0]} +``` + +The key components of such a rule description are: + + - The input of the rule (in this case, the source file `blah/foo/bar.y`). + - The output of the rule (the Haskell module `Foo.Bar`, inside a particular + component of the package). + - The action to run (in this case running the executable `happy`). + Note that it is the build system that decides where inputs/outputs are + located (in this case, the rule refers to them using `${input}` and + `${output}`). + +Unfortunately, the textual description of rules presented above suffers from some +limitations that would make migrating existing packages with the `Custom` build-type +difficult. In particular, one often wants to allow rules to depend on each other +in a more dynamic manner, for example if one needs to query an external +executable in order to determine the dependency structure; say by running +[`ghc -M`](https://downloads.haskell.org/ghc/latest/docs/users_guide/separate_compilation.html#makefile-dependencies) +on a root Haskell module in order to compute a build graph (or `gcc -M`, etc). + +### Motivation: a simplistic first design + +To explain the design we have arrived at for fine-grained pre-build rules, let +us first consider what a first draft design, which accommodates both the design +of HLS and the existing `Custom` setup scripts, might look like: + +```haskell +type TentativeRules env = env -> IO [TentativeRule] +data TentativeRule = TentativeRule + { dependencies :: [FilePath] + , results :: NonEmpty FilePath + , action :: IO () + } +``` + +That is, rules are specified by a function that takes in an environment +(which in practice consists of information known to `Cabal` after configuring, +e.g. `LocalBuildInfo`, `ComponentLocalBuildInfo`) and returns an `IO` action +that computes a list of rules. + +### Proposed design of rules + +There are several shortcomings with the above simplistic design: + + 1. It does not support an IPC interface that would allow integration + with other build tools (see [§ Hooks integration](#hooks-integration)). + Broadly, we expect the build tool to be able to query the separate hooks + executable in order to obtain all the hooks that a package with `Hooks` + `build-type` provides, as we have no way of serialising and deserialising + arbitrary `IO` actions. + + 2.
It lacks information that would allow us to determine when the rules need + to be recomputed: + + a. if the rules were computed by invoking `ghc -M` (or similar), we would + need to recompute them if the user adds a new file that would + have been found by that call to `ghc -M`. + + b. if the `env` environment changed, we might or might not need to re-run + individual rules. We need a mechanism to match up old rules (from a + previous computation) with new rules, and determine whether the rules + have changed (and thus need to be re-run) or not. + + 3. The dependency structure is overly reliant on filepaths; see + [§ Dependency structure](#dependency-structure). + +We propose to fix (1) by using static pointers, taking inspiration from +[Cloud Haskell](#cloud-haskell). + +We fix (2) by (a) adding monitoring of files and directories (see +[§ Rule monitors](#rule-monitors)), and (b) by attaching a unique `RuleId` +identifier to each rule, together with an `Eq Rule` instance. + +We fix (3) by requiring that rules that consume the output of another rule +directly refer to that rule, rather than indirectly depending on the same +filepath that that rule outputs. + +We thus propose: + +```haskell +data Rule + = Rule + { ruleAction :: !RuleCommands + -- ^ To run this rule, which t'Command's should we execute? + , staticDependencies :: ![Dependency] + -- ^ Static dependencies of this rule. + , results :: !(NE.NonEmpty Location) + -- ^ Results of this rule. + } + deriving (Eq, Binary) + +data RuleId -- opaque + +data RuleCommands + = -- | A rule with statically-known dependencies. + forall arg. + Typeable arg => + StaticRuleCommand + { staticRuleCommand :: !(Command arg (IO ())) + -- ^ The command to execute the rule. + } + | DynamicRuleCommands { .. } -- (explained later) + +instance Eq RuleCommands +instance Binary RuleCommands + +-- NB: essentially the Cloud Haskell "Closure" type. +data Command arg res = Command + { actionPtr :: !(StaticPtr (arg -> res)) + -- ^ The (statically-known) action to execute. + , actionArg :: !arg + -- ^ The (possibly dynamic) argument to pass to the action. + , cmdInstances :: !(StaticPtr (Dict (Binary arg, Show arg))) + -- ^ Static evidence that the argument can be serialised and deserialised. + } + +instance Eq (Command arg res) +instance Binary (Command arg res) + +newtype Rules env = + Rules { runRules :: env -> RulesM () } +``` + +In this design, a rule stores a closure (in the sense of Cloud Haskell) that +executes it, using the `RuleCommands` datatype. For the simple case of a static +rule (with no dynamic dependencies), this consists of: + + - a static pointer to a function expecting an argument and returning `IO ()`, + - the argument to pass to the action, + - static evidence that the argument can be serialised/deserialised, so that + it can be passed through an IPC interface. + +For example, the function might be an invocation of `happy` whose arguments +depend on the passed-in value (e.g. which module we are compiling). + +The specific monadic return type, `RulesM ()`, is used internally to handle +generation of `RuleId`s as explained in [§ Identifiers](#identifiers).
+Ignoring these implementation details, we can think of the rules as being +specified by a Haskell function with the following type: + +```haskell +env -> IO (Map RuleId Rule, [MonitorFileOrDir]) +``` + +This design is an intermediate point in between the applicative and monadic +dependency structures defined in [Build systems à la carte](#carte): + + - an applicative interface enforces a static dependency structure, which + is not flexible enough when we need to query an executable for dependencies + (e.g. `gcc -M`), + + - a monadic interface allows full dynamic dependencies. While desirable, + this presents challenges when considering how to communicate this build + graph to other build systems such as HLS (it would require the ability to + call back into the build system from within the hooks executable, as + detailed in [Free Delivery](#delivery)). + +Instead, we opt to restrict ourselves to a limited amount of dynamicism in +the dependency structure of rules: we can dynamically generate a collection +of rules, and each rule can then introduce additional dynamic dependencies +on other rules (but not add any new nodes to the dependency graph). +This follows the design in `ninja`, which requires a tool to generate a ninja +file that lists rules and their dependencies, with rules being allowed to +declare +[additional dynamic dependencies](https://ninja-build.org/manual.html#ref_dyndep). + +This design seems sufficient for common use cases, such as GHC's build system. +Indeed, as explained in [Hadrian](#hadrian), its Make build system only required +second-layer expansion (i.e. `$$$$`) for rules that invoke `ghc -M` in some way, +not any further layers. +More generally, it is preferable to output a set of rules that are at a rather +low-level, so that these can be readily consumed by build tools, rather than +requiring the build tool to do additional work to resolve dependencies. + +### Dependency structure + +We propose the following API for rule dependencies: + +```haskell +data Dependency = RuleDependency RuleOutput | FileDependency Location +data RuleOutput = RuleOutput { outputOfRule :: RuleId, outputIndex :: Int } +``` + +In particular, a rule that depends on the output of another rule must depend +directly on the rule, rather than the file that that rule outputs. +This ensures that dependencies are resolved upfront rather than when running +the rules. This ensures that any complexity in the structure of the rules exists +within the program generating the rules rather than in the build tool consuming +them. Moreover, this design avoids several issues surrounding stale files that +plague `Shake` and `Hadrian` in practice; see +[§ Rules only depend on files](#rules-only-depend-on-files). + +Note that we still require the ability for rules to depend directly on files, +to account for situations in which the file is not generated by another rule, +such as for a Happy pre-processor rule `A.y -> A.hs`, where `A.y` is a source +file present on disk (e.g. open in a code editor window). + +#### File dependencies + +Locations on the file system are specified as fully resolved paths, using +the `Location` type: + +```haskell +-- | A (fully resolved) location of a dependency or result of a rule, +-- consisting of a base directory and of a file path relative to that base +-- directory path. 
#### File dependencies

Locations on the file system are specified as fully resolved paths, using the
`Location` type:

```haskell
-- | A (fully resolved) location of a dependency or result of a rule,
-- consisting of a base directory and a file path relative to that base
-- directory.
--
-- In practice, this will be something like @( dir, toFilePath modName )@,
-- where:
--
--  - for a file dependency, @dir@ is one of the Cabal search directories,
--  - for an output, @dir@ is a directory such as @autogenComponentModulesDir@
--    or @componentBuildDir@.
type Location = (FilePath, FilePath)
```

That is, each rule can be thought of as a pure function that takes in the
contents of the files at the input locations (the `dependencies` of the rule)
and outputs the contents of the files at the output locations (the `results`
of the rule).
The logic that computes all pre-build rules is responsible for computing such
resolved locations, for example by searching the Cabal search directories.
However, there are certain restrictions on the filepaths used for the results
of rules. Namely:

  - these filepaths are not allowed to refer to files outside the project,
  - the location of any result file must lie within either the autogenerated
    modules directory for the component, the build directory for the
    component, or a temporary directory for the component.

To implement the `happy` preprocessor using these fine-grained build rules,
one would thus:

  - search for all `.y` files corresponding to Haskell modules declared by
    the component in the `.cabal` file,
  - for each such `.y` file, register an associated rule that stores the
    `happy` action together with the relevant arguments to pass to it
    (e.g. the input/output locations and additional flags).

This results in one rule per `.y` file, which will be re-run whenever the
associated `.y` file is modified.
The rules need to be re-computed whenever a `.y` file is added or removed, or
when a `.hs` file with the same module name as a `.y` file is added or
removed; we can declare this using the `MonitorFileOrDir` functionality, as
sketched below.
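The following sketch puts these pieces together, reusing the `runHappy` action
from the earlier example. It is not part of the proposed API: `registerRule`
and `addRuleMonitors` stand in for the registration functions of the `RulesM`
monad (the actual interface is given in [§ API overview](#api-overview)),
`monitorHappyInputs` and `findHappySources` are hypothetical helpers that set
up the required `MonitorFileOrDir` values and search the Cabal search
directories for the component's `.y` files, and we assume `RulesM` allows
lifting `IO` actions.

```haskell
-- A sketch of the pre-build rules for a component whose .y files are
-- preprocessed with happy.
happyRules :: Rules env
happyRules = Rules $ \ env -> do
  -- Ask for the rules to be recomputed whenever a .y file (or a .hs file
  -- shadowing one) appears or disappears in the search directories.
  addRuleMonitors (monitorHappyInputs env)
  -- One rule per .y file found on disk: its only dependency is the .y file
  -- itself, and its only result is the generated .hs module.
  yFiles <- liftIO (findHappySources env)
  for_ yFiles $ \ (yFile, hsFile) ->
    registerRule $
      Rule
        { ruleAction =
            StaticRuleCommand
              { staticRuleCommand =
                  Command
                    { actionPtr    = static runHappy
                    , actionArg    = (yFile, hsFile)
                    , cmdInstances = static Dict
                    }
              }
        , staticDependencies = [ FileDependency yFile ]
        , results            = hsFile NE.:| []
        }
```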
#### Dynamic dependencies

Inspired by [ninja's dynamic dependencies](https://ninja-build.org/manual.html#ref_dyndep),
we support rules with dynamic dependencies, using the following API:

```haskell
data RuleCommands
  = -- | A rule with statically-known dependencies.
    forall arg.
      Typeable arg =>
    StaticRuleCommand
      { staticRuleCommand :: !(Command arg (IO ()))
        -- ^ The command to execute the rule.
      }
  | -- | A rule with dynamic dependencies, which consists of two parts:
    --
    --  - a dynamic dependency computation that returns additional edges to
    --    be added to the build graph, together with an additional piece of
    --    data,
    --  - the command to execute the rule itself, which receives the
    --    additional piece of data returned by the dependency computation.
    forall depsArg depsRes arg.
      (Typeable depsArg, Typeable depsRes, Typeable arg) =>
    DynamicRuleCommands
      { dynamicRuleInstances :: !(StaticPtr (Dict (Binary depsRes, Show depsRes)))
        -- ^ Static evidence used for serialisation, in order to pass the
        -- result of the dependency computation to the main rule action.
      , dynamicDepsCommand :: !(Command depsArg (IO ([Dependency], depsRes)))
        -- ^ A dynamic dependency computation. The resulting dependencies
        -- will be injected into the build graph, and the result of the
        -- computation will be passed on to the command that executes the rule.
      , dynamicRuleCommand :: !(Command arg (depsRes -> IO ()))
        -- ^ The command to execute the rule. It will receive the result
        -- of the dynamic dependency computation.
      }
```

Here, a rule with dynamic dependencies is specified by two actions:

  - an action that computes dynamic dependencies and returns additional data,
  - an action that takes in this additional data and executes the rule.

The information flow is that the build system should execute all these dynamic
dependency computations first, adding all the resulting edges to the build
graph. It can then start running rules in dependency order, passing each rule
with dynamic dependencies the additional data returned by its dependency
computation.

This functionality is important for handling the common case of imports
without wasting work. Consider for example pre-processing `.chs` files:

  - An individual file `foo.chs` can be compiled using `c2hs`, producing both
    `foo.hs` and `foo.chi`.
  - `foo.chs` may import `bar.chs`, introducing a dependency of `foo.chs`
    on `bar.chi`.

To preprocess a collection of `.chs` files, we would register a rule with
dynamic dependencies for each `.chs` file, consisting of two actions:

  - a dynamic dependency action that computes which `.chi` files the current
    `.chs` file depends on for compilation,
  - the `c2hs` preprocessor action, which outputs both a `.hs` and a `.chi`
    file.

See [§ API overview](#api-overview) for an illustration of such an
implementation, and the sketch below for the shape of the corresponding
`DynamicRuleCommands`.

This design means that we **do not** re-run the entire computation of rules
each time a `.chs` file is modified. Instead, we re-run the dependency
computation of the modified `.chs` file (as its import list may have changed),
which allows us to update the build graph. This keeps the dependencies of the
rule up to date, ensuring correct recompilation checking. Without this
mechanism for declaring additional dynamic dependencies, we would be forced to
re-run the entire computation of rules each time any input to any rule
changes, which potentially leads to a lot of wasted work.
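By way of illustration, the two actions of such a `.chs` rule might be
packaged up as follows. This is a sketch only, not part of the proposed API:
`chsImports` and `runC2Hs` are hypothetical helpers that scan a `.chs` file
for imports and invoke `c2hs`, the mapping from module names to the rules
producing their `.chi` files is assumed to be computed up front when the rules
are generated, and we assume `RuleOutput` and `ModuleName` have the `Binary`
and `Show` instances needed to appear in `Command` arguments (imports of
`Data.Map` and friends are elided).

```haskell
-- Dynamic dependency computation: scan the .chs file for imports and turn
-- each imported module into a dependency on the rule producing its .chi file.
chsDepsAction :: (Location, Map ModuleName RuleOutput) -> IO ([Dependency], [ModuleName])
chsDepsAction (chsLoc, chiRules) = do
  imports <- chsImports chsLoc  -- hypothetical: parse the import list
  let deps = [ RuleDependency out
             | m <- imports, Just out <- [Map.lookup m chiRules] ]
  pure (deps, imports)

-- The rule action proper: run c2hs, producing both outputs. It also receives
-- the result of the dependency computation (unused here).
chsRunAction :: (Location, Location, Location) -> [ModuleName] -> IO ()
chsRunAction (chsLoc, hsOut, chiOut) _imports = runC2Hs chsLoc hsOut chiOut

c2hsRuleCommands
  :: Map ModuleName RuleOutput  -- ^ which rule produces each module's .chi file
  -> Location -> Location -> Location
  -> RuleCommands
c2hsRuleCommands chiRules chsLoc hsOut chiOut =
  DynamicRuleCommands
    { dynamicRuleInstances = static Dict
    , dynamicDepsCommand   = Command
        { actionPtr    = static chsDepsAction
        , actionArg    = (chsLoc, chiRules)
        , cmdInstances = static Dict }
    , dynamicRuleCommand   = Command
        { actionPtr    = static chsRunAction
        , actionArg    = (chsLoc, hsOut, chiOut)
        , cmdInstances = static Dict }
    }
```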
### Rule demand

When do we re-run the actions associated with individual rules, and when do
we re-compute the rules themselves (which, we recall, are returned as the
result of an `IO` action)?

The general flow is as follows:

 1. When the rules are **out-of-date**, re-query the pre-build rules to
    obtain up-to-date rules.
 2. Re-run the rules that are both **demanded** and **stale**.

A rule is considered **demanded** if:

  - it generates a Haskell module that is declared to be an autogenerated
    module of the component we are building, or
  - another rule that is itself demanded depends on the output of the rule.

The rules as a whole are considered **out-of-date** precisely when any of the
following conditions apply:

  - a file or directory monitored by the rules (declared using
    `MonitorFileOrDir`, see [§ Rule monitors](#rule-monitors)) has changed, or
  - the environment `env` passed to the computation of rules has changed.

When the rules are out-of-date, they are re-computed, and the new rules are
matched up with the rules from the previous computation using their
`RuleId`s. A rule is then considered **stale** if it is new, if one of its
file dependencies or one of the rules it (transitively) depends on has
changed, or if the rule itself differs from its previous version, e.g.
because the argument stored in its command has changed. (The last of these is
determined using the `Eq Rule` instance.)