From f01144694f4adb447c45980eaeda39f33c737179 Mon Sep 17 00:00:00 2001
From: Jeremy Fitzhardinge <jeremy@goop.org>
Date: Sat, 29 Jul 2023 13:24:14 -0700
Subject: [PATCH 1/6] WIP

---
 text/0000-extern-impl.md | 218 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 218 insertions(+)
 create mode 100644 text/0000-extern-impl.md

diff --git a/text/0000-extern-impl.md b/text/0000-extern-impl.md
new file mode 100644
index 00000000000..9c671c5108f
--- /dev/null
+++ b/text/0000-extern-impl.md
@@ -0,0 +1,218 @@
+- Feature Name: (fill me in with a unique ident, `my_awesome_feature`)
+- Start Date: (fill me in with today's date, YYYY-MM-DD)
+- RFC PR: [rust-lang/rfcs#0000](https://github.com/rust-lang/rfcs/pull/0000)
+- Rust Issue: [rust-lang/rust#0000](https://github.com/rust-lang/rust/issues/0000)
+
+# Summary
+[summary]: #summary
+
+This RFC proposes a mechanism to allow `impl`s for a type or trait to be in a
+separate crate, while still meeting the soundness guarantees currently enforced
+by the orphan rule.
+
+# Motivation
+[motivation]: #motivation
+
+In order to guarantee coherence, Rust currently imposes the fairly strict orphan
+rule. That is:
+- only a crate defining a type can have inherent implementations for that type,
+  or for any traits, and
+- only a crate defining a trait can implement that trait for any type
+
+In other words, a crate is not allowed to have "third-party implementations" -
+it's not allowed to add new inherent implementations to a type, nor is it
+allowed to implement a foreign trait for a foreign type.
+
+This raises a number of problems, but this RFC's principle concern is that of
+build performance and dependency graph scaling. 
+
+For example:
+
+- It common for a to only use a crate for its type definitions, but not need any
+  of it's implementations (for example, when mentioning a type in a function
+  signature or another type definition). However, with the current orphan rule,
+  this would require waiting for the full crate to compile, including all of its
+  dependencies. 
+- When introducing a new trait, it's useful to provide implementations for other
+  external types. But this means the trait-definition crate takes on
+  depdendencies on all those downstream dependencies whether they're needed or not.
+
+~~What this RFC proposes is a mechanism which allows implementations to be
+delegated to "trusted" or "friendly" crates (typically defined in the same
+package), allowing downstream consumers to take on only the specific
+dependencies they need. Furthermore a "witness" mechanism makes sure that the
+coherence invariants currently enforced by the orphan rule are maintained.~~
+
+Dependency graph dominator based coherence checking.
+
+Status Quo:
+```mermaid
+graph LR
+MysqlUser --> DBTrait
+MongoUser --> DBTrait
+DBTrait --> Mysql
+DBTrait --> MongoDB
+DBTrait --> RocksDB
+```
+
+Proposed:
+```mermaid
+graph LR
+MysqlUser --> DBMysql
+MongoUser --> DBMongo
+DBMysql --> Mysql
+DBMysql -->|impl| DBTrait
+DBMongo --> MongoDB
+DBMongo -->|impl| DBTrait
+DBRocks --> RocksDB
+DBRocks -->|impl| DBTrait
+```
+# Guide-level explanation
+[guide-level-explanation]: #guide-level-explanation
+
+- defining crate and impl crate
+
+`--extern impl:name=name.rlib`
+
+- impl crate must directly depend on defining crate - re-exports don't count
+- defining crate & impl crate are considered the same (eg `pub(crate)` is visible to both)
+- impls in defining & impl must be coherent
+
+# Reference-level explanation
+[reference-level-explanation]: #reference-level-explanation
+
+- "Local" for coherence checking extended from "definitions in one crate" to "definitions from set of crates"
+- set of crates are impls for a definition which are downstream from dominator
+
+Dominating crates:
+- crate dominates itself (so must be internally coherent)
+- impl crate dominates self and defining crate (so must be coherent)
+- least dominator for a set of impl crates must guarantee coherence
+
+Defining crate is its own implementing crates, and must be internally coherent
+(ie, status quo)
+```mermaid
+graph TD
+D["`D
+also I0`"]
+```
+
+Each impl crate dominates its defining crate(s) and must be coherent with them:
+```mermaid
+graph TD
+I1 -->|impl| D
+I2 -->|impl| D
+In["I..n"] -->|impl| D
+```
+
+Least dominator of all impl crates for a given defining crate must check
+coherence of impls:
+```mermaid
+graph TD
+D["`D
+Defines types/traits`"]
+I1 -->|impl| D
+I2 -->|impl| D
+I3 -->|impl| D
+In["`I..n
+impls for types/traits in D`"] -->|impl| D
+C0 --> I1
+C1 --> I2
+C2["`C2
+must check for I1, I2 coherence`"]
+C2 --> C0
+C2 --> C1
+C3["C3 need not check"]
+C3 --> C2
+C4["C4 must check I1, I2, I3 coherence"]
+C4 --> C2
+C4 --> I3
+```
+
+## Extra cases
+
+Impl crates depend on each other
+```mermaid
+graph TD
+ I1 -->|impl| D
+ I2 -->|impl| D
+ I2 --> I1
+```
+Impl crates use other crates (common)
+```mermaid
+graph TD
+ I1 -->|impl| D
+ I1 --> T
+```
+Impl crates implement multiple definition crates
+```mermaid
+graph TD
+ I1 -->|impl| D0
+ I1 -->|impl| D1
+```
+Impl crates define types and have their own impl crates
+```mermaid
+graph TD
+ I1 -->|impl| D0
+ I1 -->|impl| I2
+ I2 -->|impl| D0
+```
+
+
+This is the technical portion of the RFC. Explain the design in sufficient detail that:
+
+- Its interaction with other features is clear.
+- It is reasonably clear how the feature would be implemented.
+- Corner cases are dissected by example.
+
+The section should return to the examples given in the previous section, and explain more fully how the detailed proposal makes those examples work.
+
+# Drawbacks
+[drawbacks]: #drawbacks
+
+- Complexity
+- Ecosystem split
+
+# Rationale and alternatives
+[rationale-and-alternatives]: #rationale-and-alternatives
+
+- Features?
+  - only work for small-scale projects
+
+# Prior art
+[prior-art]: #prior-art
+
+Discuss prior art, both the good and the bad, in relation to this proposal.
+A few examples of what this can include are:
+
+- For language, library, cargo, tools, and compiler proposals: Does this feature exist in other programming languages and what experience have their community had?
+- For community proposals: Is this done by some other community and what were their experiences with it?
+- For other teams: What lessons can we learn from what other communities have done here?
+- Papers: Are there any published papers or great posts that discuss this? If you have some relevant papers to refer to, this can serve as a more detailed theoretical background.
+
+This section is intended to encourage you as an author to think about the lessons from other languages, provide readers of your RFC with a fuller picture.
+If there is no prior art, that is fine - your ideas are interesting to us whether they are brand new or if it is an adaptation from other languages.
+
+Note that while precedent set by other languages is some motivation, it does not on its own motivate an RFC.
+Please also take into consideration that rust sometimes intentionally diverges from common language features.
+
+# Unresolved questions
+[unresolved-questions]: #unresolved-questions
+
+Intended for close coupled crates from same origin, but *could* be used for
+generic third-party impls.
+
+Corresponding Cargo part
+  - Multi-crate packages?
+  - Namespaces
+  - Some other abstraction?
+
+# Future possibilities
+[future-possibilities]: #future-possibilities
+
+- Transitiive impls
+- Total coherence check
+  - All impl crates are checked against each other for publication, even if not
+    used together
+  - Or deliberately allow for conflicting impls in different impl crates, eg
+    different algorithms for different use-cases for the same thing
\ No newline at end of file

From fbd0b59eb3fea3d1e85ea4293a7e8ad52752ff6c Mon Sep 17 00:00:00 2001
From: Jeremy Fitzhardinge <jeremy@goop.org>
Date: Sat, 29 Jul 2023 19:59:19 -0700
Subject: [PATCH 2/6] Lots more details

---
 text/0000-extern-impl.md | 339 ++++++++++++++++++++++++++-------------
 1 file changed, 224 insertions(+), 115 deletions(-)

diff --git a/text/0000-extern-impl.md b/text/0000-extern-impl.md
index 9c671c5108f..fe19dfdd43e 100644
--- a/text/0000-extern-impl.md
+++ b/text/0000-extern-impl.md
@@ -1,5 +1,5 @@
-- Feature Name: (fill me in with a unique ident, `my_awesome_feature`)
-- Start Date: (fill me in with today's date, YYYY-MM-DD)
+- Feature Name: extern_impl
+- Start Date: 2023-07-17
 - RFC PR: [rust-lang/rfcs#0000](https://github.com/rust-lang/rfcs/pull/0000)
 - Rust Issue: [rust-lang/rust#0000](https://github.com/rust-lang/rust/issues/0000)
 
@@ -8,7 +8,8 @@
 
 This RFC proposes a mechanism to allow `impl`s for a type or trait to be in a
 separate crate, while still meeting the soundness guarantees currently enforced
-by the orphan rule.
+by the orphan rule. This allows the code author to use the build system to more
+flexibly partition functionality among crates.
 
 # Motivation
 [motivation]: #motivation
@@ -20,181 +21,277 @@ rule. That is:
 - only a crate defining a trait can implement that trait for any type
 
 In other words, a crate is not allowed to have "third-party implementations" -
-it's not allowed to add new inherent implementations to a type, nor is it
-allowed to implement a foreign trait for a foreign type.
+it's not allowed to add new inherent implementations to a non-local type, nor is
+it allowed to implement a non-local trait for a non-local type.
 
 This raises a number of problems, but this RFC's principle concern is that of
-build performance and dependency graph scaling. 
+dependency graph scaling and build performance. Specifically, the current orphan
+rule sets a hard limit how finely implementations can be distributed amongst
+crates.
 
 For example:
 
+- When introducing a new trait, it's useful to provide implementations for other
+  external types. But this means the trait-definition crate takes on
+  depdendencies on all those downstream dependencies whether they're ultimately
+  needed or not.
 - It common for a to only use a crate for its type definitions, but not need any
   of it's implementations (for example, when mentioning a type in a function
   signature or another type definition). However, with the current orphan rule,
   this would require waiting for the full crate to compile, including all of its
   dependencies. 
-- When introducing a new trait, it's useful to provide implementations for other
-  external types. But this means the trait-definition crate takes on
-  depdendencies on all those downstream dependencies whether they're needed or not.
 
-~~What this RFC proposes is a mechanism which allows implementations to be
-delegated to "trusted" or "friendly" crates (typically defined in the same
-package), allowing downstream consumers to take on only the specific
-dependencies they need. Furthermore a "witness" mechanism makes sure that the
-coherence invariants currently enforced by the orphan rule are maintained.~~
+What this RFC proposes is a new kind of dependency relationship between two
+crates, in which a crate can have an "impl" dependency on a crate defining a
+type or trait, and coherence is a computed over more than one crate.
+
+## More concrete example
+
+Say you've just written a new crate which defines a univeral interface to all
+types of databases, called `DBTrait`. You want to make this trait useful from
+the outset by implementing it for a number of existing DB crates.
 
-Dependency graph dominator based coherence checking.
+Today this means your `DBTrait` crate would need to take dependencies on every
+one of those DB crates in order to do `impl DBTrait for X`. It also means that
+every downstream user of `DBTrait` would also transitively gain those
+dependencies.
 
-Status Quo:
+Here, `MysqlUser` ends up depending on all of `Mysql`, `MongoDB` and `RocksDB`
+even though it only cares about `Mysql`.
 ```mermaid
-graph LR
-MysqlUser --> DBTrait
-MongoUser --> DBTrait
-DBTrait --> Mysql
-DBTrait --> MongoDB
-DBTrait --> RocksDB
+graph TB
+MysqlUser(MysqlUser) --> DBTrait --> Mysql & MongoDB & RocksDB
 ```
 
-Proposed:
+With this RFC, the implementation of `DBTrait` for each of the database crates
+would be distributed among multiple implementation crates. Downstream users
+would only need to take a dependency on the specific implemenetations they need.
+
+In this example, `MysqlUser` depends only on `DBTrait` and `DBMysql`, and is
+unaffected by the `DBMongo` and `DBRocks` implementations.
 ```mermaid
-graph LR
-MysqlUser --> DBMysql
-MongoUser --> DBMongo
+graph TB
+Bin(MysqlUser) --> DBMysql
+DBMysql & DBMongo & DBRocks -->|impl| DBTrait
 DBMysql --> Mysql
-DBMysql -->|impl| DBTrait
 DBMongo --> MongoDB
-DBMongo -->|impl| DBTrait
 DBRocks --> RocksDB
-DBRocks -->|impl| DBTrait
 ```
 # Guide-level explanation
 [guide-level-explanation]: #guide-level-explanation
 
-- defining crate and impl crate
+This RFC defines the notion of an *defining* crate and an *implementing* crate.
 
-`--extern impl:name=name.rlib`
+We extend the `rustc` `--extern` flag with an additional `impl` flag:
+`--extern impl:dbtrait=dbtrait.rlib`. This means that *this* crate (the
+*implementing* crate), that rustc is currently compiling, is allowed to
+implement traits and methods for types defined in the `dbtrait` crate (the
+*defining* crate).
 
-- impl crate must directly depend on defining crate - re-exports don't count
-- defining crate & impl crate are considered the same (eg `pub(crate)` is visible to both)
-- impls in defining & impl must be coherent
+For example:
+```rust
+//! Defining crate
+trait DBTrait {}
+```
+```rust
+//! Implementation crate, also depends on the `mysql` crate.
+use mysql::MySql;
 
-# Reference-level explanation
-[reference-level-explanation]: #reference-level-explanation
+// Allowed because we're depending on `dbtrait` with `impl`
+impl dbtrait::DBTrait for MySql {}
+```
 
-- "Local" for coherence checking extended from "definitions in one crate" to "definitions from set of crates"
-- set of crates are impls for a definition which are downstream from dominator
+In other words, the implementing crate is considered the same as the defining
+crate as far as the orphan rule goes. Note that you may only implement traits
+and types for crates directly defined in the defining crate &mdash; you cannot
+implement re-exported types.
 
-Dominating crates:
-- crate dominates itself (so must be internally coherent)
-- impl crate dominates self and defining crate (so must be coherent)
-- least dominator for a set of impl crates must guarantee coherence
+The implementing crate is also considered the same as the defining crate from a
+visibility perspective - it will be able to access `pub(crate)` modules, types,
+fields, methods, etc in the defining crate (though in this case including
+re-exports).
 
-Defining crate is its own implementing crates, and must be internally coherent
-(ie, status quo)
-```mermaid
-graph TD
-D["`D
-also I0`"]
-```
+The defining and implementing crates are not the same in other respects. They
+still have distinct crate names, defining their own crate-level namespaces. For
+example, if they both defined a type `Foo`, it would be two definitions
+`defining::Foo` and `implementing::Foo`.
+
+There are no other constraints on the defining and implementing crates - they
+can freely define new traits, types and other items, and implementations for
+those types and traits. An implementing crate can be a defining crate for
+another crate.
 
-Each impl crate dominates its defining crate(s) and must be coherent with them:
+Regardless, the definitions within the defining crate, the implementing crate,
+and between the implementing crate and defining crate must meet the current
+coherence requirements. Also if there are multiple implementation crates in
+scope in the dependency graph, they must also be coherent.
+
+# Reference-level explanation
+[reference-level-explanation]: #reference-level-explanation
+
+[Coherence](https://rust-lang.github.io/chalk/book/clauses/coherence.html) is
+currently defined in terms of "local" definitions - that is, items defined with
+in the current crate.
+
+This RFC does not change the definition of Coherence, just the definition of
+"local". Instead of only considering items defined in one specific crate, it
+defines a set of crates whose definitions are considered "local".
+
+This set is defined in terms of immediate dominators in the crate dependency
+graph.
+
+Some definitions & axioms:
+- a dependency graph always has a top-level crate (typically a binary) which:
+  - has no dependencies
+  - only crates which it transitively depends on are relevant
+- a "local set" of crates TODO
+- an implementing crate must have a dependency relationship to a defining crate
+  in order to actually have implementations
+- a crate always dominates itself
+- a crate is the dominator of other crates if it's on the only path to
+  those other crates
+ - it's the immediate dominator if it's the closest on that path
+
+So given this example compiling binary crate `Binary`:
+```mermaid
+graph TB
+ Bin(Binary) --> A & B
+ A & B -->|impl| C
+ D -->|impl| C
+ ```
+- When compiling `C`, because it dominates itself, `rustc` must check that
+  it is internally coherent (this is identical to today)
+- When compiling `A`, which dominates itself and `C`, `rustc` must check `A` is
+  internally coherent and coherence between `A` and `C`. Likewise `B` with
+  respect to `C`.`
+- `Binary` dominates both `A` and `B`, so when compiling it `rustc` must check
+  for coherence between them. It doesn't need to worry about coherence between
+  `A` and `C`, nor `B` and `C`, since that's already been checked.
+- `D` is not relevant to any of this, as it is not in `Binary`'s transitive
+  dependencies, and its coherence is not checked.
+
+In a more complex example:
+```mermaid
+graph TB
+ Bin(Binary) --> A & C
+ A --> D & E
+ C & D & E -->|impl| F
+ ```
+ - `F` must be internally coherent, likewise `D` & `E` with respect to `F`.
+ - Compiling `A` must check for coherence between `D` and `E` with respect to `F`
+ - Compiling `Binary` must check for coherence between `C`, `D` and `E` (which
+   may be able to take advantage of the knowledge that `D` and `E` are already
+   coherent because `A` already checked it.)
+
+## Changes to coherence checking
+
+In the simple case, when checking the coherence between an implementing crate
+and the defining crate, definitions in both the defining and implementing crate
+are considered "local". Aside from that, the algorithm then proceeds as before.
+Here, implementing crates `A`, `B` and `C` are each considered local to defining crate `D`, but not to each other.
 ```mermaid
 graph TD
-I1 -->|impl| D
-I2 -->|impl| D
-In["I..n"] -->|impl| D
+  A & B & C --> D
 ```
 
-Least dominator of all impl crates for a given defining crate must check
-coherence of impls:
+When checking the coherence of multiple implementing crates, we must be careful
+with the definition of "local". Since the implementing crates don't depend on
+each other, and can't have implementations for each other's types, we can safely
+consider them local to each other with respect to the defining crate `D` for the
+purposes of coherence checking.
+
 ```mermaid
-graph TD
-D["`D
-Defines types/traits`"]
-I1 -->|impl| D
-I2 -->|impl| D
-I3 -->|impl| D
-In["`I..n
-impls for types/traits in D`"] -->|impl| D
-C0 --> I1
-C1 --> I2
-C2["`C2
-must check for I1, I2 coherence`"]
-C2 --> C0
-C2 --> C1
-C3["C3 need not check"]
-C3 --> C2
-C4["C4 must check I1, I2, I3 coherence"]
-C4 --> C2
-C4 --> I3
+graph TB
+  Bin(Binary) --> A & B & C --> D
+  subgraph Local
+  A & B & C
+  end
 ```
 
 ## Extra cases
 
 Impl crates depend on each other
 ```mermaid
-graph TD
- I1 -->|impl| D
- I2 -->|impl| D
+graph TB
+ I1 & I2 -->|impl| D
  I2 --> I1
 ```
 Impl crates use other crates (common)
 ```mermaid
-graph TD
+graph TB
  I1 -->|impl| D
  I1 --> T
 ```
 Impl crates implement multiple definition crates
 ```mermaid
-graph TD
- I1 -->|impl| D0
- I1 -->|impl| D1
+graph TB
+ I1 -->|impl| D0 & D1
 ```
 Impl crates define types and have their own impl crates
 ```mermaid
-graph TD
+graph TB
  I1 -->|impl| D0
  I1 -->|impl| I2
  I2 -->|impl| D0
 ```
+Duplicated checking -- `A` and `B` would both check the coherence of `I1` and
+`I2` with respect to each other, since neither knows about the other. 
+```mermaid
+graph TB
+I1 & I2 -->|impl| D
+Bin([Binary]) --> A & B --> I1 & I2
 
-
-This is the technical portion of the RFC. Explain the design in sufficient detail that:
-
-- Its interaction with other features is clear.
-- It is reasonably clear how the feature would be implemented.
-- Corner cases are dissected by example.
-
-The section should return to the examples given in the previous section, and explain more fully how the detailed proposal makes those examples work.
+```
 
 # Drawbacks
 [drawbacks]: #drawbacks
 
-- Complexity
-- Ecosystem split
+This adds a fair amount of complexity to a delecate part of the language
+semantics and to the compiler's implementation of it. I've tried to keep the
+changes as light-touch as possible, but I haven't tried to prototype it yet.
+
+If implemented as described here, and enabled in Cargo in an unconstrainted way,
+it would enabled full third-party implementations. That is, anyone could publish
+a crate implementing any trait for any type. This would be fine in the small,
+but at large scales it would mean that if there were two crates with conflicting
+implementations, then could never appear in the same dependency graph. It means
+that non-local changes to dependencies could cause downstream build breakage.
+This would lead to very undesireable ecosystem-wide dynamics. As such how this
+feature is used must be very carefully considered.
 
 # Rationale and alternatives
 [rationale-and-alternatives]: #rationale-and-alternatives
 
-- Features?
-  - only work for small-scale projects
+In principle you could use features to achieve similar outcomes to the
+motivating `DBTrait` example above - you'd define a feature for each DB
+implementation, and the consumer would only enable the specific implementations
+they want.
+
+For relatively small Cargo-defined projects, this would probably work fine. You
+would lose some of the benefits (like type-only crates for use in declarations),
+but it would solve the fundamental "don't want to depend on every DB" problem.
+
+However, because features are additive, as the dependency graphs get larger it
+will tend towards having every feature enabled, so that every subgraph of the
+overall dependency graph will end up depending on what everyone else depends on.
+As a result, trait-defining crates like `DBTrait` become dependency bottlenecks
+which end up single-threading the build.
+
+(This is particularly acute in cases where Cargo is primarily used as a
+dependency management tool, and builds are done with another tool like Buck.)
 
 # Prior art
 [prior-art]: #prior-art
 
-Discuss prior art, both the good and the bad, in relation to this proposal.
-A few examples of what this can include are:
+...?
 
-- For language, library, cargo, tools, and compiler proposals: Does this feature exist in other programming languages and what experience have their community had?
-- For community proposals: Is this done by some other community and what were their experiences with it?
-- For other teams: What lessons can we learn from what other communities have done here?
-- Papers: Are there any published papers or great posts that discuss this? If you have some relevant papers to refer to, this can serve as a more detailed theoretical background.
+Other languages have a very weak notion of coherence. C++ doesn't enforce it at
+all, and simply defines conflicting implementations as UB. Haskell allows for
+ecosystem wide conflicts, which may or may not be a problem in practice. 
 
-This section is intended to encourage you as an author to think about the lessons from other languages, provide readers of your RFC with a fuller picture.
-If there is no prior art, that is fine - your ideas are interesting to us whether they are brand new or if it is an adaptation from other languages.
-
-Note that while precedent set by other languages is some motivation, it does not on its own motivate an RFC.
-Please also take into consideration that rust sometimes intentionally diverges from common language features.
+Rust is unique(?) in separately languages in taking such a strong stand on this,
+and it's been mostly beneficial. But it would be nice to have a mechanism to
+loosen the constraints on this without admitting unsoundness.
 
 # Unresolved questions
 [unresolved-questions]: #unresolved-questions
@@ -202,17 +299,29 @@ Please also take into consideration that rust sometimes intentionally diverges f
 Intended for close coupled crates from same origin, but *could* be used for
 generic third-party impls.
 
-Corresponding Cargo part
-  - Multi-crate packages?
-  - Namespaces
-  - Some other abstraction?
+The big question is what this looks like from a Cargo perspective. I can think
+of a few options:
+- Extend the namespace proposal so that crates in the same namespace can have
+  `impl` relations
+- Multi-crate Cargo packages, which allow internal `impl` dependencies
+
 
 # Future possibilities
 [future-possibilities]: #future-possibilities
 
-- Transitiive impls
-- Total coherence check
-  - All impl crates are checked against each other for publication, even if not
-    used together
-  - Or deliberately allow for conflicting impls in different impl crates, eg
-    different algorithms for different use-cases for the same thing
\ No newline at end of file
+Allow transitive impl relations. For example:
+```mermaid
+graph LR
+  A -->|impl| B -->|impl| C
+```
+Allowing `A` to implement items in `C` because of the transitive `impl`
+relationship through B. I think this potentially complicates the correctness
+analysis a lot though, and doesn't really have a lot of benefits over just
+making `A` depend on `C` directly.
+
+There's a question on whether to globally enforce coherence even between crates
+which are not in the same dependency graph. This could be particularly useful at
+crates.io publication time to make sure the package doesn't have lurking
+problems. On the other hand there might be cases where it's desireable to allow
+incoherent implementations to exist; for example, two implementations of the
+same API with different algorithms tuned for different use-cases.

From 4d75622d9d7982bd0104a48f79f77d7d7c3da144 Mon Sep 17 00:00:00 2001
From: Jeremy Fitzhardinge <jeremy@goop.org>
Date: Sat, 29 Jul 2023 20:09:14 -0700
Subject: [PATCH 3/6] More tweaks

---
 text/0000-extern-impl.md | 14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/text/0000-extern-impl.md b/text/0000-extern-impl.md
index fe19dfdd43e..11de02921f0 100644
--- a/text/0000-extern-impl.md
+++ b/text/0000-extern-impl.md
@@ -212,24 +212,24 @@ graph TB
 
 Impl crates depend on each other
 ```mermaid
-graph TB
+graph LR
  I1 & I2 -->|impl| D
  I2 --> I1
 ```
 Impl crates use other crates (common)
 ```mermaid
-graph TB
+graph LR
  I1 -->|impl| D
  I1 --> T
 ```
 Impl crates implement multiple definition crates
 ```mermaid
-graph TB
+graph LR
  I1 -->|impl| D0 & D1
 ```
 Impl crates define types and have their own impl crates
 ```mermaid
-graph TB
+graph LR
  I1 -->|impl| D0
  I1 -->|impl| I2
  I2 -->|impl| D0
@@ -262,7 +262,7 @@ feature is used must be very carefully considered.
 # Rationale and alternatives
 [rationale-and-alternatives]: #rationale-and-alternatives
 
-In principle you could use features to achieve similar outcomes to the
+In principle you could use Cargo features to achieve similar outcomes to the
 motivating `DBTrait` example above - you'd define a feature for each DB
 implementation, and the consumer would only enable the specific implementations
 they want.
@@ -325,3 +325,7 @@ crates.io publication time to make sure the package doesn't have lurking
 problems. On the other hand there might be cases where it's desireable to allow
 incoherent implementations to exist; for example, two implementations of the
 same API with different algorithms tuned for different use-cases.
+
+There are cases where the same coherence properties may be checked multiple
+times. Perhaps these checks could be memoized with the incremental machinery to
+avoid the redundancy.

From 3ebc4e84f9aeaa20189517abb03225511cfbabea Mon Sep 17 00:00:00 2001
From: Jeremy Fitzhardinge <jeremy@goop.org>
Date: Mon, 7 Aug 2023 02:56:42 -0700
Subject: [PATCH 4/6] Add WIP alloy spec

---
 text/0000-extern-impl.md | 168 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 168 insertions(+)

diff --git a/text/0000-extern-impl.md b/text/0000-extern-impl.md
index 11de02921f0..ae74c4d6266 100644
--- a/text/0000-extern-impl.md
+++ b/text/0000-extern-impl.md
@@ -329,3 +329,171 @@ same API with different algorithms tuned for different use-cases.
 There are cases where the same coherence properties may be checked multiple
 times. Perhaps these checks could be memoized with the incremental machinery to
 avoid the redundancy.
+
+# Appendix - Alloy spec (WIP)
+
+---
+title: Alloy spec for Rust implementation coherency
+---
+
+Note: this is a literate [Alloy](https://alloytools.org/) spec. Download the
+most recent version of Alloy from
+https://github.com/AlloyTools/org.alloytools.alloy/releases, and see
+https://alloy.readthedocs.io for documentation.
+
+# Alloy spec for Rust implementation coherency
+
+This is a simplified spec of Rust coherency checking. Rust requires that there,
+are no overlapping or conflicting implementations with a complete Rust program,
+as that would allow for ambiguity about which one to use.
+
+(This applies to all implementations, but here we're only going to consider the
+subset of simple trait implementations for types, with no generic type
+parameters.)
+
+The current solution to this is to requires that when a crate implements a trait
+for a type, at least one of the type or the trait have to be "local" - ie,
+defined in the same crate.
+
+This is simple to describe and implement, but in practice it has several downsides:
+- (dependencies, no third-party impls)
+
+## Alloy Spec
+
+First we define signatures for types and traits. Both are defined in a crate:
+
+```alloy
+sig Trait {
+    trait_def: one Crate
+}
+
+sig Type {
+    type_def: one Crate
+}
+```
+
+And crates themselves. Crates can depend on a set of other crates, but the
+overall crate dependency graph must be acyclic. Each crate also has a relation
+of which trait implementations for which types it contains.
+
+```alloy
+sig Crate {
+    deps: set Crate,
+    impls: Trait -> Type,
+} {
+    no this & this.^@deps -- acyclic
+}
+```
+
+A Binary is the unique "top-level" crate which depends on all the other crates
+transitively. Or to put it another way, no other crate depends on Binary.
+
+```alloy
+one sig Binary extends Crate {} {
+    no @deps.this -- nothing depends on Binary
+    all c: Crate - Binary | c in this.^@deps -- Binary depends on everything else
+}
+```
+
+Let's define the safety invariant we need to enforce, that every implementation
+is unique. Or more precisely, for every trait/type pair, there's at most one
+crate implementing it. (It's fine if nothing implements it.)
+
+```alloy
+pred coherent_impls[crates: Crate] {
+    all tr: Trait, ty: Type | lone crates & impls.ty.tr
+}
+``````
+
+This is the basic orphan rule, with a tight definition of "local": either the
+type or the trait must be defined in the crate:
+
+```alloy
+pred local_orphan_rule[crates: Crate] {
+    all crate: crates |
+        crate.impls in
+            (crate[trait_def] -> (crate + crate.deps)[type_def]) +
+            ((crate + crate.deps)[trait_def] -> crate[type_def])
+}
+```
+
+We can check that if `local_orphan_rule` is true for all crates, then we have
+coherence for all crates. This has no counter-examples.
+
+```alloy
+check local_coherent {
+    -- ie, checking local_orphan_rule on each crate implies that all crates are globally coherent
+    local_orphan_rule[Crate] => coherent_impls[Crate]
+}
+```
+
+## impl dependencies
+
+Let's extend the orphan constraint so that the definition of "local" is extended
+to immediate dependencies as well. In a way this is simpler than
+`local_orphan_rule` because we no longer have to constrain either the type or
+trait to be in `crate`.
+
+```alloy
+pred dep_orphan_rule[crates: Crate] {
+    all crate: crates |
+        crate.impls in
+            (crate + crate.deps)[trait_def] -> (crate + crate.deps)[type_def]
+}
+```
+
+However, this is not sufficient to maintain the invariant. This will quickly
+find a counter-example with two crates with a duplicate implementation.
+
+```alloy
+check dep_coherent_bad {
+    dep_orphan_rule[Crate] => coherent_impls[Crate]
+}
+```
+
+We need to add additional constraints to maintain the invariant. First, let's
+define a function which, for a given crate which defines types and/or traits,
+all the crates with implementations for those definitions:
+```alloy
+fun impl_crates[c: Crate]: set Crate {
+    c[trait_def][impls.univ] + c[type_def][impls].univ
+}
+```
+
+We can then apply the constraint in the most general way: for all dependencies
+of Binary, all the crates implementing anything must be coherent. We'll ignore
+that this is a tautology for now, as we'll tighten this up later.
+
+```alloy
+check dep_coherent_impl_crates {
+    {
+        dep_orphan_rule[Crate] -- redundant
+        all dep: Binary.*deps |
+            coherent_impls[impl_crates[dep]] -- tautology
+    } => coherent_impls[Crate]
+} for 10 -- make sure there's enough objects to be interesting
+```
+
+Unfortunately, this doesn't correspond to how the build is actually performed in
+practice. When we're compiling a crate which has definitions, we don't know
+which crates will have implementations. And when we're compiling the crates with
+implementations, we don't know which other crate to cross-check with for
+coherence.
+
+The key insight is that there must be *some* crate which has both the
+implementing crates in its transitive dependencies (even if it's the top-level
+Binary), which means it can check for coherence when compiling that crate.
+
+For example, here `Use` is responsible for checking the coherence of `ImplA` and
+`ImplB` in its dependencies, but `ImplC` someone else's problem, and `Other` is
+not relevant.
+```mermaid
+graph TD
+  Use --> ImplA & ImplB
+  ImplA & ImplB & ImplC --> Defn
+  Use --> Other
+```
+
+```alloy
+-- TODO check constraint for common deps
+```

From 8dd90d73e5290fefb89a8d9fafb8a329e47be554 Mon Sep 17 00:00:00 2001
From: Jeremy Fitzhardinge <jeremy@goop.org>
Date: Sat, 2 Sep 2023 18:03:10 -0700
Subject: [PATCH 5/6] Revise and clarify

---
 text/0000-extern-impl.md | 471 +++++++++++++++++++++++++--------------
 1 file changed, 298 insertions(+), 173 deletions(-)

diff --git a/text/0000-extern-impl.md b/text/0000-extern-impl.md
index ae74c4d6266..385580c8f1f 100644
--- a/text/0000-extern-impl.md
+++ b/text/0000-extern-impl.md
@@ -8,42 +8,51 @@
 
 This RFC proposes a mechanism to allow `impl`s for a type or trait to be in a
 separate crate, while still meeting the soundness guarantees currently enforced
-by the orphan rule. This allows the code author to use the build system to more
-flexibly partition functionality among crates.
+by the orphan rule. With this, the crate structure (and thus build system
+dependency graph) can be decoupled from the actual implementations.
 
 # Motivation
 [motivation]: #motivation
 
-In order to guarantee coherence, Rust currently imposes the fairly strict orphan
-rule. That is:
-- only a crate defining a type can have inherent implementations for that type,
-  or for any traits, and
-- only a crate defining a trait can implement that trait for any type
-
-In other words, a crate is not allowed to have "third-party implementations" -
-it's not allowed to add new inherent implementations to a non-local type, nor is
-it allowed to implement a non-local trait for a non-local type.
-
-This raises a number of problems, but this RFC's principle concern is that of
-dependency graph scaling and build performance. Specifically, the current orphan
-rule sets a hard limit how finely implementations can be distributed amongst
-crates.
-
-For example:
-
-- When introducing a new trait, it's useful to provide implementations for other
-  external types. But this means the trait-definition crate takes on
-  depdendencies on all those downstream dependencies whether they're ultimately
-  needed or not.
-- It common for a to only use a crate for its type definitions, but not need any
-  of it's implementations (for example, when mentioning a type in a function
-  signature or another type definition). However, with the current orphan rule,
-  this would require waiting for the full crate to compile, including all of its
-  dependencies. 
-
-What this RFC proposes is a new kind of dependency relationship between two
-crates, in which a crate can have an "impl" dependency on a crate defining a
-type or trait, and coherence is a computed over more than one crate.
+Rust requires implementation coherence - that is, there's at most one
+implementation of a trait for a type. This is required to:
+- avoid any ambiguity about which implementation to use when compiling, and
+- make sure that adding a new implementation does not change the meaning of
+existing code.
+
+To do this, Rust implements the orphan rule, which requires that at least one of
+the trait or the type are defined in the same crate as the implementation (ie is
+a "local" definition). The "other" definition must be in a crate which is a
+dependency, and as the dependency graph is acyclic, that "other" crate can't
+possibly have a conflicting implementation because it can't have both
+definitions in scope.
+
+The effect of this rule imposes some very hard constraints on how a developer
+can factor their code into crates. For example it prevents:
+- introducing a new trait and implementing it for existing types, without adding
+  dependencies for those types to all trait users
+  - more generally, refactoring to break dependency chains
+- refactoring code to increase build parallelism
+- adding trait implementations for types in crates whose source is generated (eg protobuf)
+- splitting libstd/libcore
+- multiple conflicting implementations to make different implementation
+  tradeoffs (eg size vs performance), or bindings to incompatible native
+  libraries (python 2 vs python 3)
+
+This RFC proposes an extension to the orphan rule which retains the coherence
+invariant, while loosening the "locality" requirement for implementation. This
+considerably increases implementation flexibility, at the cost of no longer
+being able to enforce the invariant purely locally.
+
+At the rustc and language level, this mechanism is very general, and allows
+almost arbitrary implementation relations between crates. However in practice
+unconstrained use of this feature (ie implementing traits for types from
+different crates from unrelated origins) would have large impacts on the general
+Rust / crates.io ecosystem if not carefully considered.
+
+As a result, this RFC also proposes a fairly small Cargo extension which allows
+the feature to be used within a workspace for experimentation, but leaves the
+implications of publishing to crates.io to a future RFC.
 
 ## More concrete example
 
@@ -77,16 +86,45 @@ DBMysql --> Mysql
 DBMongo --> MongoDB
 DBRocks --> RocksDB
 ```
-# Guide-level explanation
-[guide-level-explanation]: #guide-level-explanation
+# Guide-level explanation (cargo)
+[guide-level-explanation]: #guide-level-explanation 
+
+Within a workspace, you can designate an "impl" relationship between two crates.
+For example, in your `my_impls/Cargo.toml` you can put:
 
-This RFC defines the notion of an *defining* crate and an *implementing* crate.
+```
+[dependency]
+my_definitions = { version = "0.1", path = "../my_definitions", impl = True }
+```
 
-We extend the `rustc` `--extern` flag with an additional `impl` flag:
-`--extern impl:dbtrait=dbtrait.rlib`. This means that *this* crate (the
-*implementing* crate), that rustc is currently compiling, is allowed to
-implement traits and methods for types defined in the `dbtrait` crate (the
-*defining* crate).
+This means that your `my_impls` crate can add trait implementations using types
+or traits defined in `my_definitions`, almost as if they were defined in
+`my_definitions`. If you want to use those implementations in other crates,
+however, you must also add a dependency to both `my_impls` and `my_definitions`
+(or you could re-export the definitions in `my_impls`).
+
+You may only implement for types/traits actually defined in `my_definitions`
+itself. Re-exports don't count.
+
+The dependency *must* be a path dependency to another crate defined in the same
+workspace. Cargo will not allow you specify `impl = True` for other crates.
+
+Publishing packages with `impl = True` to crates.io is not allowed.
+
+# Guide-level explanation (rustc)
+
+In current Rust, when you specify in your build system that a crate depends on
+another, this is a "use" dependency - the dependee can use definitions in the
+dependant.
+
+This RFC defines a new dependency relationship: the "impl" dependency. The
+dependee can provide implementations for types and traits defined in the
+dependant.
+
+We extend the `rustc` `--extern` flag with an additional `impl` flag: `--extern
+impl:dbtrait=dbtrait.rlib`. This means that the crate currently being compiled
+with this option, is allowed to implement traits and methods for types defined
+in the `dbtrait` crate (the *defining* crate).
 
 For example:
 ```rust
@@ -106,11 +144,6 @@ crate as far as the orphan rule goes. Note that you may only implement traits
 and types for crates directly defined in the defining crate &mdash; you cannot
 implement re-exported types.
 
-The implementing crate is also considered the same as the defining crate from a
-visibility perspective - it will be able to access `pub(crate)` modules, types,
-fields, methods, etc in the defining crate (though in this case including
-re-exports).
-
 The defining and implementing crates are not the same in other respects. They
 still have distinct crate names, defining their own crate-level namespaces. For
 example, if they both defined a type `Foo`, it would be two definitions
@@ -133,136 +166,162 @@ scope in the dependency graph, they must also be coherent.
 currently defined in terms of "local" definitions - that is, items defined with
 in the current crate.
 
-This RFC does not change the definition of Coherence, just the definition of
-"local". Instead of only considering items defined in one specific crate, it
-defines a set of crates whose definitions are considered "local".
-
-This set is defined in terms of immediate dominators in the crate dependency
-graph.
-
-Some definitions & axioms:
-- a dependency graph always has a top-level crate (typically a binary) which:
-  - has no dependencies
-  - only crates which it transitively depends on are relevant
-- a "local set" of crates TODO
-- an implementing crate must have a dependency relationship to a defining crate
-  in order to actually have implementations
-- a crate always dominates itself
-- a crate is the dominator of other crates if it's on the only path to
-  those other crates
- - it's the immediate dominator if it's the closest on that path
-
-So given this example compiling binary crate `Binary`:
-```mermaid
-graph TB
- Bin(Binary) --> A & B
- A & B -->|impl| C
- D -->|impl| C
- ```
-- When compiling `C`, because it dominates itself, `rustc` must check that
-  it is internally coherent (this is identical to today)
-- When compiling `A`, which dominates itself and `C`, `rustc` must check `A` is
-  internally coherent and coherence between `A` and `C`. Likewise `B` with
-  respect to `C`.`
-- `Binary` dominates both `A` and `B`, so when compiling it `rustc` must check
-  for coherence between them. It doesn't need to worry about coherence between
-  `A` and `C`, nor `B` and `C`, since that's already been checked.
-- `D` is not relevant to any of this, as it is not in `Binary`'s transitive
-  dependencies, and its coherence is not checked.
-
-In a more complex example:
-```mermaid
-graph TB
- Bin(Binary) --> A & C
- A --> D & E
- C & D & E -->|impl| F
- ```
- - `F` must be internally coherent, likewise `D` & `E` with respect to `F`.
- - Compiling `A` must check for coherence between `D` and `E` with respect to `F`
- - Compiling `Binary` must check for coherence between `C`, `D` and `E` (which
-   may be able to take advantage of the knowledge that `D` and `E` are already
-   coherent because `A` already checked it.)
+The details of the coherence checking logic are unchanged; the only change is
+the constraints on which crates a definition and an implementation can be in.
+
+The discussion below talks about coherence in terms of the crate dependency
+relations, but the actual coherence check is at the level of individual type and
+trait implementations. The crate dependency relations set an upper bound on
+this, as any crate which does not appear within the relevent dependency graph
+need not be considered for coherence checking.
 
 ## Changes to coherence checking
 
-In the simple case, when checking the coherence between an implementing crate
-and the defining crate, definitions in both the defining and implementing crate
-are considered "local". Aside from that, the algorithm then proceeds as before.
-Here, implementing crates `A`, `B` and `C` are each considered local to defining crate `D`, but not to each other.
+The key problem we need to solve is how to reconcile the Rust language-level
+details (types and traits) with the build system view (dependency relationship
+between crates). Coherence is defined in terms of type/trait definitions and
+implementations, which forms its own graph. This graph is embedded in the crate
+dependency graph &mdash; that is there are no edges in the type/trait
+implementation graph which are not also present in the crate dependency graph.
+
 ```mermaid
-graph TD
-  A & B & C --> D
+graph TB
+  CrateC -->|dep| CrateA & CrateB
+  subgraph CrateA
+    A["Trait A"]
+    D["Type D"]
+    E["Impl A for D"] -..->|impl| A & D
+  end
+  subgraph CrateB
+    B["Type B"]
+  end
+  subgraph CrateC
+    C["Impl A for B"] -..->|impl| A & B
+  end
 ```
 
-When checking the coherence of multiple implementing crates, we must be careful
-with the definition of "local". Since the implementing crates don't depend on
-each other, and can't have implementations for each other's types, we can safely
-consider them local to each other with respect to the defining crate `D` for the
-purposes of coherence checking.
+For the discussion below, crate B is visible to crate A if it falls within its
+transitive dependencies; a crate is visible to itself.
 
 ```mermaid
 graph TB
-  Bin(Binary) --> A & B & C --> D
-  subgraph Local
-  A & B & C
+  subgraph A sees only A
+    c1["A"] --> c2["..."]
+    c3["B"]
+  end
+  subgraph A sees A&B
+    b1["A"] --> b2["..."] --> b3["B"]
+  end
+  subgraph A sees A&B
+    a1["A"] --> a2["B"]
   end
 ```
 
-## Extra cases
+The general rule is when compiling a crate, `rustc` must check the coherence of
+a set of implementations for a type if:
+- all those implementations are visible
+- no other visible crate has checked them
 
-Impl crates depend on each other
+This means the compilation of a crate must check for coherence when:
+- all the definitions and implementations are within one crate
+- if a crate A has an impl dependency on crate B, it must check coherency
+  between A and B
+- if a crate has two or more crates with implementations for a given definition
+  within its view via distinct direct dependencies
+
+This last point is the most complex. For example:
 ```mermaid
-graph LR
- I1 & I2 -->|impl| D
- I2 --> I1
+graph TB
+  T -->|view| A & B -->|impl| D
+  A & B --> E
 ```
-Impl crates use other crates (common)
+Here, `T` must check `A` and `B`'s implementations are coherent with respect to
+each other and definitions in `D` and `E`. It can rely that `A` and `B` are each
+individually coherent with respect to `D` and `E` because that would have been
+checked when `A` and `B` were compiled.
+
+Note that while `A` and `B` both depend
+on `D` and `E`, they need only have a `impl` dependency on one of them; the `impl`
+dependency simply means that `D`'s definitions are considered local to `A` and
+`B` when seen in terms of the current orphan rule.
+
+This is OK even if `A` and `B` have different `impl` dependencies:
 ```mermaid
-graph LR
- I1 -->|impl| D
- I1 --> T
+graph TB
+  T -->|view| A & B
+  A -->|impl| D
+  A --> E
+  B --> D
+  B -->|impl| E
 ```
-Impl crates implement multiple definition crates
+Since the groups (`A`, `D` `E`) and (`B`, `D` `E`) must be internally coherent,
+and `T` still needs to check the coherence of (`A` and `B`).
+
+In this case:
 ```mermaid
-graph LR
- I1 -->|impl| D0 & D1
+graph TB
+  T --> A -->|view| B & C -->|impl| D
 ```
-Impl crates define types and have their own impl crates
+`A` must check the coherence of `B` & `C`'s implementations with respect to `D`.
+`T`, however, can rely on `A` having already checked `B` and `C`'s coherence
+because both are visible through it's `A` dependency.
+
+A more complex example:
 ```mermaid
-graph LR
- I1 -->|impl| D0
- I1 -->|impl| I2
- I2 -->|impl| D0
+graph TB
+  T --> A -->|view| B & C -->|impl| D
+  B & C --> E
+  T --> C
 ```
-Duplicated checking -- `A` and `B` would both check the coherence of `I1` and
-`I2` with respect to each other, since neither knows about the other. 
+`T` can rely on `A` to check the coherence of `B` and `C`, even though `T` has
+its own direct dependency on `C`, because `C` is also visible through `A`.
+
+But in this case:
 ```mermaid
 graph TB
-I1 & I2 -->|impl| D
-Bin([Binary]) --> A & B --> I1 & I2
+  T --> A -->|view| B & C -->|impl| E
+  B & C --> F
+  T -->|view| D -->|impl| E
+```
+`T` can rely on `A` for the coherency of `B` and `C`, but it must check the
+coherence of `B`, `C` and `D` with respect to `E`.
 
+Note that this can mean that the same crates' coherency can be checked
+redundantly. For example:
+```mermaid
+graph TB
+  Binary --> X & Y & Z -->|view| A & B -->|impl| C
 ```
+`X`, `Y` and `Z` are all the first to see the implementations of `A` and `B`
+with respect to `C`, so must each do their own coherency checks, even though
+`Binary` only needs for this to be performed once. Fortunately these redundant
+checks can be done in parallel, but it's still a waste of CPU. See below for
+some discussion about optimizing this.
 
 # Drawbacks
 [drawbacks]: #drawbacks
 
 This adds a fair amount of complexity to a delecate part of the language
-semantics and to the compiler's implementation of it. I've tried to keep the
-changes as light-touch as possible, but I haven't tried to prototype it yet.
+semantics and to the compiler's implementation of it. This is opt-in, but one
+would need to understand these semantics when using other people's packages
+which use this feature.
 
 If implemented as described here, and enabled in Cargo in an unconstrainted way,
-it would enabled full third-party implementations. That is, anyone could publish
-a crate implementing any trait for any type. This would be fine in the small,
-but at large scales it would mean that if there were two crates with conflicting
-implementations, then could never appear in the same dependency graph. It means
-that non-local changes to dependencies could cause downstream build breakage.
-This would lead to very undesireable ecosystem-wide dynamics. As such how this
-feature is used must be very carefully considered.
+it would enable full third-party implementations. That is, anyone could publish
+a package implementing any type or trait in any other package. This would be
+fine at small scalle, but at large scales it would mean that if there were two
+crates with conflicting implementations, then could never appear in the same
+dependency graph. That is a change to private dependencies, which wouldn't
+usually be considered compatibility breaking, would cause downstream build
+breakage. This would lead to very undesireable ecosystem-wide dynamics. As such
+how this feature is used must be very carefully considered.
 
 # Rationale and alternatives
 [rationale-and-alternatives]: #rationale-and-alternatives
 
-In principle you could use Cargo features to achieve similar outcomes to the
+## Features
+
+In principle you could use Cargo "features" to achieve similar outcomes to the
 motivating `DBTrait` example above - you'd define a feature for each DB
 implementation, and the consumer would only enable the specific implementations
 they want.
@@ -280,18 +339,62 @@ which end up single-threading the build.
 (This is particularly acute in cases where Cargo is primarily used as a
 dependency management tool, and builds are done with another tool like Buck.)
 
+Sometimes features are used in a non-additive way, which is technically not
+supported, but there's no mechanism to prevent it (or even warn about it). In
+contrast, the mechanism described here does allow for separate implementation
+crates with conflicting implementations, perhaps to allow for different
+implementation tradeoffs, or bindings to different non-Rust implementations.
+
+Extensive use of features make code navigation difficult. For example, IDE
+integration (VS Code + Rust Analyzer) often won't know which features are
+enabled, and so will tend to dim feature-controlled code which should be
+enabled, or vice versa.
+
+It's also hard to know which features should be enabled when generating
+documentation with `rustdoc`. Not enabling features which are actually used
+means that crate functionality goes undocumented, where as enabling too many
+unused features can obscure the API the user actually wants to see.
+
 # Prior art
 [prior-art]: #prior-art
 
-...?
+## Generally weak coherence
 
 Other languages have a very weak notion of coherence. C++ doesn't enforce it at
 all, and simply defines conflicting implementations as UB. Haskell allows for
 ecosystem wide conflicts, which may or may not be a problem in practice. 
 
-Rust is unique(?) in separately languages in taking such a strong stand on this,
-and it's been mostly beneficial. But it would be nice to have a mechanism to
-loosen the constraints on this without admitting unsoundness.
+Rust is unique(?) in separately-compiled languages in taking such a strong stand
+on this, and it's been mostly beneficial. But it would be nice to have a
+mechanism to loosen the constraints on this without admitting unsoundness.
+
+## Interface/implementation split
+
+In C and C++, there's a clear separation between "header files" containing a
+specification of an interface, and "source files" which contain actual
+implementations. Typically a compilation unit which has a dependency on another
+compilation unit will include the header files containing the interface
+definitions, and can be compiled with that information alone; it does not requre
+the implementation of the dependant to be compiled.
+
+Ideally (and often in practice) this means that the entire program can be
+compiled in parallel, with only the final linking step dependent on all the
+compilation units.
+
+The primary disadvantage of this scheme is that the header file definitions must
+be manually kept in sync with the implementations, and failing to do so can
+cause numerous poor outcomes from linker errors to undefined behaviour.
+
+Today, Rust approximates this interface/implementation split with pipelined
+builds, where `rustc` will generate a metadata object containing a crates
+definitions, followed by actual code generation. The dependent compilation can
+start as soon as the metadata is available without having to wait for the
+generated code. But this still requires metadata for the entire dependency chain
+- and for many crates the metadata generation *is* the expensive part of
+compilation.
+
+With this RFC the split can be made more explicit by having actual separate
+crates for definitions and implementations.
 
 # Unresolved questions
 [unresolved-questions]: #unresolved-questions
@@ -299,37 +402,64 @@ loosen the constraints on this without admitting unsoundness.
 Intended for close coupled crates from same origin, but *could* be used for
 generic third-party impls.
 
-The big question is what this looks like from a Cargo perspective. I can think
-of a few options:
-- Extend the namespace proposal so that crates in the same namespace can have
-  `impl` relations
-- Multi-crate Cargo packages, which allow internal `impl` dependencies
+The big question is what this looks like from a crates.io perspective. If we
+could guarantee that `cargo publish` published an entire workspace atomically,
+then 
 
 
 # Future possibilities
 [future-possibilities]: #future-possibilities
 
-Allow transitive impl relations. For example:
-```mermaid
-graph LR
-  A -->|impl| B -->|impl| C
-```
-Allowing `A` to implement items in `C` because of the transitive `impl`
-relationship through B. I think this potentially complicates the correctness
-analysis a lot though, and doesn't really have a lot of benefits over just
-making `A` depend on `C` directly.
-
-There's a question on whether to globally enforce coherence even between crates
-which are not in the same dependency graph. This could be particularly useful at
-crates.io publication time to make sure the package doesn't have lurking
-problems. On the other hand there might be cases where it's desireable to allow
-incoherent implementations to exist; for example, two implementations of the
-same API with different algorithms tuned for different use-cases.
+## Support intrinsic implementations
+
+This RFC focuses on implementing traits for types, since this is the common pain
+point. In principle everything described above will also work for intrinsic
+implementations, but there's some awkward secondary questions. For example,
+intrinsic implementations tend to access the private fields of structs; does
+that mean we need add or redefine visibility rules?
+
+## "Global" third-party implementations
+
+If this mechanism were broadly enabled for the crates.io ecosystem, it could
+cause undesireably ecosystem splits. For example if you have type-defining
+package `some_type` and trait defining package `some_trait`, you could have two
+separate crates `A` & `B` implementing `some_trait` for `some_type`. If your
+package `my_package` contains just one of `A` or `B` in its dependencies, then
+all is OK. But at any point, with only a minor patch update, any package could
+add the other to its dependencies, preventing your package from compiling. This
+kind of brittleness is highly undesireable.
+
+So is there some way to enable third-party these kinds of third-party
+implementation crates in a more general way without introducing this kind of
+ecosystem brittleness?
+
+The [namespacing RFC](https://github.com/Manishearth/namespacing-rfc)
+([PR)(https://github.com/rust-lang/rfcs/pull/3243)]) is also looking at
+questions relating the the overall crates.io ecosystem. One could imagine a
+proposal, for example, that packages within the same namespace could have this
+`impl` relationship. But whether that makes sense really depends on the intended
+semantics of namespaces.
+
+## Cached coherence checks
 
 There are cases where the same coherence properties may be checked multiple
-times. Perhaps these checks could be memoized with the incremental machinery to
+times, if the same implementing crates are dependencies of multiple unrelated
+crates. Perhaps these checks could be memoized with the incremental machinery to
 avoid the redundancy.
 
+
+## Auto-split crates
+
+Modules
+within crates may have cyclic dependencies, but the crate dependency graph must
+be acyclic. But only in extremely pathological cases will a crate's internal
+dependency graph consist of a single strongly connected component.
+
+This opens the possibility of using the mechanism described in this RFC to
+automatically partition a crate into sub-crates in order to expose more
+built-time parallelism. This could either be performed inline during the build
+or some tooling to perform pre-build processing on the crates.
+
 # Appendix - Alloy spec (WIP)
 
 ---
@@ -343,7 +473,9 @@ https://alloy.readthedocs.io for documentation.
 
 # Alloy spec for Rust implementation coherency
 
-This is a simplified spec of Rust coherency checking. Rust requires that there,
+(TODO: make this fully consistent with the description above.)
+
+This is a simplified spec of Rust coherency checking. Rust requires that there
 are no overlapping or conflicting implementations with a complete Rust program,
 as that would allow for ambiguity about which one to use.
 
@@ -351,13 +483,6 @@ as that would allow for ambiguity about which one to use.
 subset of simple trait implementations for types, with no generic type
 parameters.)
 
-The current solution to this is to requires that when a crate implements a trait
-for a type, at least one of the type or the trait have to be "local" - ie,
-defined in the same crate.
-
-This is simple to describe and implement, but in practice it has several downsides:
-- (dependencies, no third-party impls)
-
 ## Alloy Spec
 
 First we define signatures for types and traits. Both are defined in a crate:

From e0b22b24ee172c1fb648fa679e4b8c124cb53d47 Mon Sep 17 00:00:00 2001
From: Jeremy Fitzhardinge <jeremy@goop.org>
Date: Wed, 6 Sep 2023 08:32:09 -0700
Subject: [PATCH 6/6] WIP address comments

---
 text/0000-extern-impl.md | 69 +++++++++++++++++++++++++---------------
 1 file changed, 44 insertions(+), 25 deletions(-)

diff --git a/text/0000-extern-impl.md b/text/0000-extern-impl.md
index 385580c8f1f..305bd06ae02 100644
--- a/text/0000-extern-impl.md
+++ b/text/0000-extern-impl.md
@@ -6,10 +6,11 @@
 # Summary
 [summary]: #summary
 
-This RFC proposes a mechanism to allow `impl`s for a type or trait to be in a
-separate crate, while still meeting the soundness guarantees currently enforced
-by the orphan rule. With this, the crate structure (and thus build system
-dependency graph) can be decoupled from the actual implementations.
+This RFC proposes an opt-in mechanism to relax some of the restrictions of the
+orphan rule, while maintaining the soundness of implementation coherence. It
+also proposes a minimal Cargo extension to allow this feature to be exercised.
+With this, the crate structure (and thus build system dependency graph) can be
+decoupled from the actual implementations.
 
 # Motivation
 [motivation]: #motivation
@@ -109,7 +110,8 @@ itself. Re-exports don't count.
 The dependency *must* be a path dependency to another crate defined in the same
 workspace. Cargo will not allow you specify `impl = True` for other crates.
 
-Publishing packages with `impl = True` to crates.io is not allowed.
+Publishing packages with any `impl = True` dependencies to crates.io is not
+allowed.
 
 # Guide-level explanation (rustc)
 
@@ -139,20 +141,22 @@ use mysql::MySql;
 impl dbtrait::DBTrait for MySql {}
 ```
 
-In other words, the implementing crate is considered the same as the defining
-crate as far as the orphan rule goes. Note that you may only implement traits
-and types for crates directly defined in the defining crate &mdash; you cannot
-implement re-exported types.
+In other words, when the overlap check is considering what definitions are
+"local",  both the implementing crate and the defining crate both count. Note
+that you may only implement traits and types for crates directly defined in the
+defining crate &mdash; you cannot implement re-exported types.
 
 The defining and implementing crates are not the same in other respects. They
 still have distinct crate names, defining their own crate-level namespaces. For
 example, if they both defined a type `Foo`, it would be two definitions
 `defining::Foo` and `implementing::Foo`.
 
-There are no other constraints on the defining and implementing crates - they
-can freely define new traits, types and other items, and implementations for
-those types and traits. An implementing crate can be a defining crate for
-another crate.
+The terms "implementing" and "defining" describe a property of the dependency
+relationship between two crates, not an intrinsic property of the crates
+themselves. There are no other constraints on the defining and implementing
+crates - they can freely define new traits, types and other items, and
+implementations for those types and traits. An implementing crate can be a
+defining crate for another crate.
 
 Regardless, the definitions within the defining crate, the implementing crate,
 and between the implementing crate and defining crate must meet the current
@@ -162,18 +166,27 @@ scope in the dependency graph, they must also be coherent.
 # Reference-level explanation
 [reference-level-explanation]: #reference-level-explanation
 
-[Coherence](https://rust-lang.github.io/chalk/book/clauses/coherence.html) is
-currently defined in terms of "local" definitions - that is, items defined with
-in the current crate.
+> The idea of trait
+> [coherence](https://rust-lang.github.io/chalk/book/clauses/coherence.html) is
+> that, given a trait and some set of types for its type parameters, there
+> should be exactly one impl that applies.
 
-The details of the coherence checking logic are unchanged; the only change is
-the constraints on which crates a definition and an implementation can be in.
+A key part of this are *overlap checks*, which check that not only are there
+currently no conflicting implementations, but adding new implementations won't
+invalidate or change the meaning of existing code.
 
-The discussion below talks about coherence in terms of the crate dependency
-relations, but the actual coherence check is at the level of individual type and
-trait implementations. The crate dependency relations set an upper bound on
-this, as any crate which does not appear within the relevent dependency graph
-need not be considered for coherence checking.
+These overlap checks can only be performed if all the relevant implementation
+sites are known so they can be checked. This is the role of the *orphan rule*,
+which has the consequence that there's only one possible crate in which contain
+a given implementation. This is the restriction this RFC proposes to loosen.
+
+We introduce what could be called *the adoption rule*. This is an extension of
+the orphan rule, in which implementations can be in more than one crate, but
+some other crate adopts those implementation crates in order to apply the
+overlap check across them in order to guarantee coherence.
+
+The actual details of the overlap check logic are unchanged; the only change is
+a wider definition of what's considered a "local definition".
 
 ## Changes to coherence checking
 
@@ -184,6 +197,12 @@ implementations, which forms its own graph. This graph is embedded in the crate
 dependency graph &mdash; that is there are no edges in the type/trait
 implementation graph which are not also present in the crate dependency graph.
 
+Note that a blanket implementations - `impl<T> Foo for T` - are a universal
+quantifier over all types `T`, whether they're in scope or not. However if we're
+interested in applying it to specific concrete types, we need only consider
+those types which are in scope, which are therefore constrained by the
+dependency graph.
+
 ```mermaid
 graph TB
   CrateC -->|dep| CrateA & CrateB
@@ -224,8 +243,8 @@ a set of implementations for a type if:
 
 This means the compilation of a crate must check for coherence when:
 - all the definitions and implementations are within one crate
-- if a crate A has an impl dependency on crate B, it must check coherency
-  between A and B
+- if a crate A has an impl dependency (ie, a direct dependency with the impl
+  option) on crate B, it must check coherency between A and B
 - if a crate has two or more crates with implementations for a given definition
   within its view via distinct direct dependencies