From d5e4c5433fad9f688c3a4c469a3ec16050077a9f Mon Sep 17 00:00:00 2001
From: Jeremy Fitzhardinge <jeremy@goop.org>
Date: Sat, 2 Sep 2023 18:03:10 -0700
Subject: [PATCH] Revise and clarify

---
 text/0000-extern-impl.md | 470 +++++++++++++++++++++++++--------------
 1 file changed, 297 insertions(+), 173 deletions(-)

diff --git a/text/0000-extern-impl.md b/text/0000-extern-impl.md
index ae74c4d6266..8ed685302f7 100644
--- a/text/0000-extern-impl.md
+++ b/text/0000-extern-impl.md
@@ -8,42 +8,51 @@
 
 This RFC proposes a mechanism to allow `impl`s for a type or trait to be in a
 separate crate, while still meeting the soundness guarantees currently enforced
-by the orphan rule. This allows the code author to use the build system to more
-flexibly partition functionality among crates.
+by the orphan rule. With this, the crate structure (and thus build system
+dependency graph) can be decoupled from the actual implementations.
 
 # Motivation
 [motivation]: #motivation
 
-In order to guarantee coherence, Rust currently imposes the fairly strict orphan
-rule. That is:
-- only a crate defining a type can have inherent implementations for that type,
-  or for any traits, and
-- only a crate defining a trait can implement that trait for any type
-
-In other words, a crate is not allowed to have "third-party implementations" -
-it's not allowed to add new inherent implementations to a non-local type, nor is
-it allowed to implement a non-local trait for a non-local type.
-
-This raises a number of problems, but this RFC's principle concern is that of
-dependency graph scaling and build performance. Specifically, the current orphan
-rule sets a hard limit how finely implementations can be distributed amongst
-crates.
-
-For example:
-
-- When introducing a new trait, it's useful to provide implementations for other
-  external types. But this means the trait-definition crate takes on
-  depdendencies on all those downstream dependencies whether they're ultimately
-  needed or not.
-- It common for a to only use a crate for its type definitions, but not need any
-  of it's implementations (for example, when mentioning a type in a function
-  signature or another type definition). However, with the current orphan rule,
-  this would require waiting for the full crate to compile, including all of its
-  dependencies. 
-
-What this RFC proposes is a new kind of dependency relationship between two
-crates, in which a crate can have an "impl" dependency on a crate defining a
-type or trait, and coherence is a computed over more than one crate.
+Rust requires implementation coherence - that is, there's at most one
+implementation of a trait for a type. This is required to:
+- avoid any ambiguity about which implementation to use when compiling, and
+- make sure that adding a new implementation does not change the meaning of
+existing code.
+
+To do this, Rust implements the orphan rule, which requires that at least one of
+the trait or the type are defined in the same crate as the implementation (ie is
+a "local" definition). The "other" definition must be in a crate which is a
+dependency, and as the dependency graph is acyclic, that "other" crate can't
+possibly have a conflicting implementation because it can't have both
+definitions in scope.
+
+The effect of this rule imposes some very hard constraints on how a developer
+can factor their code into crates. For example it prevents:
+- introducing a new trait and implementing it for existing types, without adding
+  dependencies for those types to all trait users
+  - more generally, refactoring to break dependency chains
+- refactoring code to increase build parallelism
+- adding trait implementations for types in crates whose source is generated (eg protobuf)
+- splitting libstd/libcore
+- multiple conflicting implementations to make different implementation
+  tradeoffs (eg size vs performance), or bindings to incompatible native
+  libraries (python 2 vs python 3)
+
+This RFC proposes an extension to the orphan rule which retains the coherence
+invariant, while loosening the "locality" requirement for implementation. This
+considerably increases implementation flexibility, at the cost of no longer
+being able to enforce the invariant purely locally.
+
+At the rustc and language level, this mechanism is very general, and allows
+almost arbitrary implementation relations between crates. However in practice
+unconstrained use of this feature (ie implementing traits for types from
+different crates from unrelated origins) would have large impacts on the general
+Rust / crates.io ecosystem if not carefully considered.
+
+As a result, this RFC also proposes a fairly small Cargo extension which allows
+the feature to be used within a workspace for experimentation, but leaves the
+implications of publishing to crates.io to a future RFC.
 
 ## More concrete example
 
@@ -77,16 +86,45 @@ DBMysql --> Mysql
 DBMongo --> MongoDB
 DBRocks --> RocksDB
 ```
-# Guide-level explanation
-[guide-level-explanation]: #guide-level-explanation
+# Guide-level explanation (cargo)
+[guide-level-explanation]: #guide-level-explanation 
+
+Within a workspace, you can designate an "impl" relationship between two crates.
+For example, in your `my_impls/Cargo.toml` you can put:
 
-This RFC defines the notion of an *defining* crate and an *implementing* crate.
+```
+[dependency]
+my_definitions = { version = "0.1", path = "../my_definitions", impl = True }
+```
 
-We extend the `rustc` `--extern` flag with an additional `impl` flag:
-`--extern impl:dbtrait=dbtrait.rlib`. This means that *this* crate (the
-*implementing* crate), that rustc is currently compiling, is allowed to
-implement traits and methods for types defined in the `dbtrait` crate (the
-*defining* crate).
+This means that your `my_impls` crate can add trait implementations using types
+or traits defined in `my_definitions`, almost as if they were defined in
+`my_definitions`. If you want to use those implementations in other crates,
+however, you must also add a dependency to both `my_impls` and `my_definitions`
+(or you could re-export the definitions in `my_impls`).
+
+You may only implement for types/traits actually defined in `my_definitions`
+itself. Re-exports don't count.
+
+The dependency *must* be a path dependency to another crate defined in the same
+workspace. Cargo will not allow you specify `impl = True` for other crates.
+
+Publishing packages with `impl = True` to crates.io is not allowed.
+
+# Guide-level explanation (rustc)
+
+In current Rust, when you specify in your build system that a crate depends on
+another, this is a "use" dependency - the dependee can use definitions in the
+dependant.
+
+This RFC defines a new dependency relationship: the "impl" dependency. The
+dependee can provide implementations for types and traits defined in the
+dependant.
+
+We extend the `rustc` `--extern` flag with an additional `impl` flag: `--extern
+impl:dbtrait=dbtrait.rlib`. This means that the crate currently being compiled
+with this option, is allowed to implement traits and methods for types defined
+in the `dbtrait` crate (the *defining* crate).
 
 For example:
 ```rust
@@ -106,11 +144,6 @@ crate as far as the orphan rule goes. Note that you may only implement traits
 and types for crates directly defined in the defining crate &mdash; you cannot
 implement re-exported types.
 
-The implementing crate is also considered the same as the defining crate from a
-visibility perspective - it will be able to access `pub(crate)` modules, types,
-fields, methods, etc in the defining crate (though in this case including
-re-exports).
-
 The defining and implementing crates are not the same in other respects. They
 still have distinct crate names, defining their own crate-level namespaces. For
 example, if they both defined a type `Foo`, it would be two definitions
@@ -133,136 +166,161 @@ scope in the dependency graph, they must also be coherent.
 currently defined in terms of "local" definitions - that is, items defined with
 in the current crate.
 
-This RFC does not change the definition of Coherence, just the definition of
-"local". Instead of only considering items defined in one specific crate, it
-defines a set of crates whose definitions are considered "local".
-
-This set is defined in terms of immediate dominators in the crate dependency
-graph.
-
-Some definitions & axioms:
-- a dependency graph always has a top-level crate (typically a binary) which:
-  - has no dependencies
-  - only crates which it transitively depends on are relevant
-- a "local set" of crates TODO
-- an implementing crate must have a dependency relationship to a defining crate
-  in order to actually have implementations
-- a crate always dominates itself
-- a crate is the dominator of other crates if it's on the only path to
-  those other crates
- - it's the immediate dominator if it's the closest on that path
-
-So given this example compiling binary crate `Binary`:
-```mermaid
-graph TB
- Bin(Binary) --> A & B
- A & B -->|impl| C
- D -->|impl| C
- ```
-- When compiling `C`, because it dominates itself, `rustc` must check that
-  it is internally coherent (this is identical to today)
-- When compiling `A`, which dominates itself and `C`, `rustc` must check `A` is
-  internally coherent and coherence between `A` and `C`. Likewise `B` with
-  respect to `C`.`
-- `Binary` dominates both `A` and `B`, so when compiling it `rustc` must check
-  for coherence between them. It doesn't need to worry about coherence between
-  `A` and `C`, nor `B` and `C`, since that's already been checked.
-- `D` is not relevant to any of this, as it is not in `Binary`'s transitive
-  dependencies, and its coherence is not checked.
-
-In a more complex example:
-```mermaid
-graph TB
- Bin(Binary) --> A & C
- A --> D & E
- C & D & E -->|impl| F
- ```
- - `F` must be internally coherent, likewise `D` & `E` with respect to `F`.
- - Compiling `A` must check for coherence between `D` and `E` with respect to `F`
- - Compiling `Binary` must check for coherence between `C`, `D` and `E` (which
-   may be able to take advantage of the knowledge that `D` and `E` are already
-   coherent because `A` already checked it.)
+The details of the coherence checking logic are unchanged; the only change is
+the constraints on which crates a definition and an implementation can be in.
+
+The discussion below talks about coherence in terms of the crate dependency
+relations, but the actual coherence check is at the level of individual type and
+trait implementations. The crate dependency relations set an upper bound on
+this, as any crate which does not appear within the relevent dependency graph
+need not be considered for coherence checking.
 
 ## Changes to coherence checking
 
-In the simple case, when checking the coherence between an implementing crate
-and the defining crate, definitions in both the defining and implementing crate
-are considered "local". Aside from that, the algorithm then proceeds as before.
-Here, implementing crates `A`, `B` and `C` are each considered local to defining crate `D`, but not to each other.
+The key problem we need to solve is how to reconcile the Rust language-level
+details (types and traits) with the build system view (dependency relationship
+between crates). Coherence is defined in terms of type/trait definitions and
+implementations, which forms its own graph. This graph is embedded in the crate
+dependency graph &mdash; that is there are no edges in the type/trait
+implementation graph which are not also present in the crate dependency graph.
+
 ```mermaid
-graph TD
-  A & B & C --> D
+graph TB
+  CrateC -->|dep| CrateA & CrateB
+  subgraph CrateA
+    A["Trait A"]
+    D["Type D"]
+    E["Impl A for D"] -..->|impl| A & D
+  end
+  subgraph CrateB
+    B["Type B"]
+  end
+  subgraph CrateC
+    C["Impl A for B"] -..->|impl| A & B
+  end
 ```
 
-When checking the coherence of multiple implementing crates, we must be careful
-with the definition of "local". Since the implementing crates don't depend on
-each other, and can't have implementations for each other's types, we can safely
-consider them local to each other with respect to the defining crate `D` for the
-purposes of coherence checking.
+For the discussion below, crate B is visible to crate A if it falls within its
+transitive dependencies; a crate is visible to itself.
 
 ```mermaid
 graph TB
-  Bin(Binary) --> A & B & C --> D
-  subgraph Local
-  A & B & C
+  subgraph A sees only A
+    c1["A"] --> c2["..."]
+    c3["B"]
+  end
+  subgraph A sees A&B
+    b1["A"] --> b2["..."] --> b3["B"]
+  end
+  subgraph A sees A&B
+    a1["A"] --> a2["B"]
   end
 ```
 
-## Extra cases
+The general rule is when compiling a crate, `rustc` must check the coherence of
+a set of implementations for a type if:
+- all those implementations are visible
+- no other visible crate has checked them
 
-Impl crates depend on each other
+This means the compilation of a crate must check for coherence when:
+- all the definitions and implementations are within one crate
+- if a crate A has an impl dependency on crate B, it must check coherency
+  between A and B
+- if a crate has two or more crates with implementations for a given definition
+  within its view via distinct direct dependencies
+
+This last point is the most complex. For example:
 ```mermaid
-graph LR
- I1 & I2 -->|impl| D
- I2 --> I1
+graph TB
+  T -->|view| A & B -->|impl| D
+  A & B --> E
 ```
-Impl crates use other crates (common)
+Here, `T` must check `A` and `B`'s implementations are coherent with respect to
+each other and definitions in `D` and `E`. It can rely that `A` and `B` are each
+individually coherent with respect to `D` and `E` because that would have been
+checked when `A` and `B` were compiled.
+
+Note that while `A` and `B` both depend
+on `D` and `E`, they need only have a `impl` dependency on one of them; the `impl`
+dependency simply means that `D`'s definitions are considered local to `A` and
+`B` when seen in terms of the current orphan rule.
+
+This is OK even if `A` and `B` have different `impl` dependencies:
 ```mermaid
-graph LR
- I1 -->|impl| D
- I1 --> T
+graph TB
+  T -->|view| A & B
+  A -->|impl| D
+  A --> E
+  B --> D
+  B -->|impl| E
 ```
-Impl crates implement multiple definition crates
+Since the groups (`A`, `D` `E`) and (`B`, `D` `E`) must be internally coherent,
+and `T` still needs to check the coherence of (`A` and `B`).
+
+In this case:
 ```mermaid
-graph LR
- I1 -->|impl| D0 & D1
+graph TB
+  T --> A -->|view| B & C -->|impl| D
 ```
-Impl crates define types and have their own impl crates
+`A` must check the coherence of `B` & `C`'s implementations with respect to `D`.
+`T`, however, can rely on `A` having already checked `B` and `C`'s coherence
+because both are visible through it's `A` dependency.
+
+A more complex example:
 ```mermaid
-graph LR
- I1 -->|impl| D0
- I1 -->|impl| I2
- I2 -->|impl| D0
+graph TB
+  T --> A -->|view| B & C -->|impl| D
+  B & C --> E
+  T --> C
 ```
-Duplicated checking -- `A` and `B` would both check the coherence of `I1` and
-`I2` with respect to each other, since neither knows about the other. 
+`T` can rely on `A` to check the coherence of `B` and `C`, even though `T` has
+its own direct dependency on `C`, because `C` is also visible through `A`.
+
+But in this case:
 ```mermaid
 graph TB
-I1 & I2 -->|impl| D
-Bin([Binary]) --> A & B --> I1 & I2
+  T --> A -->|view| B & C -->|impl| E
+  B & C --> F
+  T -->|view| D -->|impl| E
+```
+`T` can rely on `A` for the coherency of `B` and `C`, but it must check the
+coherence of `B`, `C` and `D` with respect to `E`.
 
+Note that this can mean that the same crates' coherency can be checked
+redundantly. For example:
+```mermaid
+graph TB
+  Binary --> X & Y & Z -->|view| A & B -->|impl| C
 ```
+`X`, `Y` and `Z` are all the first to see the implementations of `A` and `B`
+with respect to `C`, so must each do their own coherency checks, even though
+`Binary` only needs for this to be performed once. See below for some discussion
+about optimizing this.
 
 # Drawbacks
 [drawbacks]: #drawbacks
 
 This adds a fair amount of complexity to a delecate part of the language
-semantics and to the compiler's implementation of it. I've tried to keep the
-changes as light-touch as possible, but I haven't tried to prototype it yet.
+semantics and to the compiler's implementation of it. This is opt-in, but one
+would need to understand these semantics when using other people's packages
+which use this feature.
 
 If implemented as described here, and enabled in Cargo in an unconstrainted way,
-it would enabled full third-party implementations. That is, anyone could publish
-a crate implementing any trait for any type. This would be fine in the small,
-but at large scales it would mean that if there were two crates with conflicting
-implementations, then could never appear in the same dependency graph. It means
-that non-local changes to dependencies could cause downstream build breakage.
-This would lead to very undesireable ecosystem-wide dynamics. As such how this
-feature is used must be very carefully considered.
+it would enable full third-party implementations. That is, anyone could publish
+a package implementing any type or trait in any other package. This would be
+fine at small scalle, but at large scales it would mean that if there were two
+crates with conflicting implementations, then could never appear in the same
+dependency graph. That is a change to private dependencies, which wouldn't
+usually be considered compatibility breaking, would cause downstream build
+breakage. This would lead to very undesireable ecosystem-wide dynamics. As such
+how this feature is used must be very carefully considered.
 
 # Rationale and alternatives
 [rationale-and-alternatives]: #rationale-and-alternatives
 
-In principle you could use Cargo features to achieve similar outcomes to the
+## Features
+
+In principle you could use Cargo "features" to achieve similar outcomes to the
 motivating `DBTrait` example above - you'd define a feature for each DB
 implementation, and the consumer would only enable the specific implementations
 they want.
@@ -280,18 +338,62 @@ which end up single-threading the build.
 (This is particularly acute in cases where Cargo is primarily used as a
 dependency management tool, and builds are done with another tool like Buck.)
 
+Sometimes features are used in a non-additive way, which is technically not
+supported, but there's no mechanism to prevent it (or even warn about it). In
+contrast, the mechanism described here does allow for separate implementation
+crates with conflicting implementations, perhaps to allow for different
+implementation tradeoffs, or bindings to different non-Rust implementations.
+
+Extensive use of features make code navigation difficult. For example, IDE
+integration (VS Code + Rust Analyzer) often won't know which features are
+enabled, and so will tend to dim feature-controlled code which should be
+enabled, or vice versa.
+
+It's also hard to know which features should be enabled when generating
+documentation with `rustdoc`. Not enabling features which are actually used
+means that crate functionality goes undocumented, where as enabling too many
+unused features can obscure the API the user actually wants to see.
+
 # Prior art
 [prior-art]: #prior-art
 
-...?
+## Generally weak coherence
 
 Other languages have a very weak notion of coherence. C++ doesn't enforce it at
 all, and simply defines conflicting implementations as UB. Haskell allows for
 ecosystem wide conflicts, which may or may not be a problem in practice. 
 
-Rust is unique(?) in separately languages in taking such a strong stand on this,
-and it's been mostly beneficial. But it would be nice to have a mechanism to
-loosen the constraints on this without admitting unsoundness.
+Rust is unique(?) in separately-compiled languages in taking such a strong stand
+on this, and it's been mostly beneficial. But it would be nice to have a
+mechanism to loosen the constraints on this without admitting unsoundness.
+
+## Interface/implementation split
+
+In C and C++, there's a clear separation between "header files" containing a
+specification of an interface, and "source files" which contain actual
+implementations. Typically a compilation unit which has a dependency on another
+compilation unit will include the header files containing the interface
+definitions, and can be compiled with that information alone; it does not requre
+the implementation of the dependant to be compiled.
+
+Ideally (and often in practice) this means that the entire program can be
+compiled in parallel, with only the final linking step dependent on all the
+compilation units.
+
+The primary disadvantage of this scheme is that the header file definitions must
+be manually kept in sync with the implementations, and failing to do so can
+cause numerous poor outcomes from linker errors to undefined behaviour.
+
+Today, Rust approximates this interface/implementation split with pipelined
+builds, where `rustc` will generate a metadata object containing a crates
+definitions, followed by actual code generation. The dependent compilation can
+start as soon as the metadata is available without having to wait for the
+generated code. But this still requires metadata for the entire dependency chain
+- and for many crates the metadata generation *is* the expensive part of
+compilation.
+
+With this RFC the split can be made more explicit by having actual separate
+crates for definitions and implementations.
 
 # Unresolved questions
 [unresolved-questions]: #unresolved-questions
@@ -299,37 +401,64 @@ loosen the constraints on this without admitting unsoundness.
 Intended for close coupled crates from same origin, but *could* be used for
 generic third-party impls.
 
-The big question is what this looks like from a Cargo perspective. I can think
-of a few options:
-- Extend the namespace proposal so that crates in the same namespace can have
-  `impl` relations
-- Multi-crate Cargo packages, which allow internal `impl` dependencies
+The big question is what this looks like from a crates.io perspective. If we
+could guarantee that `cargo publish` published an entire workspace atomically,
+then 
 
 
 # Future possibilities
 [future-possibilities]: #future-possibilities
 
-Allow transitive impl relations. For example:
-```mermaid
-graph LR
-  A -->|impl| B -->|impl| C
-```
-Allowing `A` to implement items in `C` because of the transitive `impl`
-relationship through B. I think this potentially complicates the correctness
-analysis a lot though, and doesn't really have a lot of benefits over just
-making `A` depend on `C` directly.
-
-There's a question on whether to globally enforce coherence even between crates
-which are not in the same dependency graph. This could be particularly useful at
-crates.io publication time to make sure the package doesn't have lurking
-problems. On the other hand there might be cases where it's desireable to allow
-incoherent implementations to exist; for example, two implementations of the
-same API with different algorithms tuned for different use-cases.
+## Support intrinsic implementations
+
+This RFC focuses on implementing traits for types, since this is the common pain
+point. In principle everything described above will also work for intrinsic
+implementations, but there's some awkward secondary questions. For example,
+intrinsic implementations tend to access the private fields of structs; does
+that mean we need add or redefine visibility rules?
+
+## "Global" third-party implementations
+
+If this mechanism were broadly enabled for the crates.io ecosystem, it could
+cause undesireably ecosystem splits. For example if you have type-defining
+package `some_type` and trait defining package `some_trait`, you could have two
+separate crates `A` & `B` implementing `some_trait` for `some_type`. If your
+package `my_package` contains just one of `A` or `B` in its dependencies, then
+all is OK. But at any point, with only a minor patch update, any package could
+add the other to its dependencies, preventing your package from compiling. This
+kind of brittleness is highly undesireable.
+
+So is there some way to enable third-party these kinds of third-party
+implementation crates in a more general way without introducing this kind of
+ecosystem brittleness?
+
+The [namespacing RFC](https://github.com/Manishearth/namespacing-rfc)
+([PR)(https://github.com/rust-lang/rfcs/pull/3243)]) is also looking at
+questions relating the the overall crates.io ecosystem. One could imagine a
+proposal, for example, that packages within the same namespace could have this
+`impl` relationship. But whether that makes sense really depends on the intended
+semantics of namespaces.
+
+## Cached coherence checks
 
 There are cases where the same coherence properties may be checked multiple
-times. Perhaps these checks could be memoized with the incremental machinery to
+times, if the same implementing crates are dependencies of multiple unrelated
+crates. Perhaps these checks could be memoized with the incremental machinery to
 avoid the redundancy.
 
+
+## Auto-split crates
+
+Modules
+within crates may have cyclic dependencies, but the crate dependency graph must
+be acyclic. But only in extremely pathological cases will a crate's internal
+dependency graph consist of a single strongly connected component.
+
+This opens the possibility of using the mechanism described in this RFC to
+automatically partition a crate into sub-crates in order to expose more
+built-time parallelism. This could either be performed inline during the build
+or some tooling to perform pre-build processing on the crates.
+
 # Appendix - Alloy spec (WIP)
 
 ---
@@ -343,7 +472,9 @@ https://alloy.readthedocs.io for documentation.
 
 # Alloy spec for Rust implementation coherency
 
-This is a simplified spec of Rust coherency checking. Rust requires that there,
+(TODO: make this fully consistent with the description above.)
+
+This is a simplified spec of Rust coherency checking. Rust requires that there
 are no overlapping or conflicting implementations with a complete Rust program,
 as that would allow for ambiguity about which one to use.
 
@@ -351,13 +482,6 @@ as that would allow for ambiguity about which one to use.
 subset of simple trait implementations for types, with no generic type
 parameters.)
 
-The current solution to this is to requires that when a crate implements a trait
-for a type, at least one of the type or the trait have to be "local" - ie,
-defined in the same crate.
-
-This is simple to describe and implement, but in practice it has several downsides:
-- (dependencies, no third-party impls)
-
 ## Alloy Spec
 
 First we define signatures for types and traits. Both are defined in a crate: