-
Notifications
You must be signed in to change notification settings - Fork 36
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Variable-declaration and storage representation in Qiskit #50
Conversation
This is a proposal and design comments on how to begin adding the representations of declared classical variables and stores to Qiskit, including input variables, which correspond to the `input` keyword in OpenQASM 3, and can be used to define parametric input of arbitrary types into Qiskit. For now, this RFC does not propose expanding Qiskit's classical type system, so only `Bool` and `Uint` remain valid, and these parameters cannot yet be used in gate-argument positions. This is still strongly desired, just not for this RFC.
To be clear: the new features that Qiskit's adding here are not expected to be supported on hardware immediately, and we'll clearly document that that's the case on release. This is a design process based on how Qiskit can build incrementally to a far greater set of classical operations over a longer time period than just minor-release-to-minor-release. We're targetting having first-class support for classical runtime parameters with |
It's not clear to me whether it should be possible for a circuit to involve variables whose names conflict in scope with the program it's being inlined into, or what if any automatic variable-renaming methods we should make available. | ||
|
||
If there are no conflicts in the variable names, it seems expected to me that the lists of input variables would catenate, and so on. | ||
Similarly, two declared `Var` instances that conflict (either by having incompatible initialisers, or one being an input, one not, etc) cannot be allowed to coexist, but does this mean that the composition should be rejected, or should we implement a renaming? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think how to handle namespace conflicts should be parametrizable to the compose
method. It might cover a valid usecase glue circuits referring to the save variable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only time in this system that you could have two circuits really validly referring to the same variable without a redefinition via the initialiser is in the cases:
- base circuit declares the
Var
as aninput
, inlined declares it as input or closure - base circuit declares the
Var
as a closure, inlined declares it as closure - base circuit initialises the
Var
, inlined declares it as closure
Those are all largely ok, but given the nature of how closures work, you have to have a reference to the Var
already, so I don't think there's much problem in compose
; we can easily enforce that it only works with the same Var
.
The issues are if I've got two circuits like
base = QuantumCircuit()
a1 = base.add_var("a", types.Bool(), expr.value(True))
inlined = QuantumCircuit(
a2 = inlined.add_var("a", types.Bool(), expr.value(False))
now there are two variables named "a" that are not the same, and naive inlining re-using the same named storage location would cause a2
to shadow a1
, and for the ensuing data model to work, we'd already have to rewrite all expressions in inlined
to use a1
instead of a2
.
My preference here would likely be to reject compositions with any variable clashes that would require complete variable rewriting, just for ease in the initial implementation; I think there's nuances to classical-function inlining that I'm not super confident of at the moment to allow something we're not 100% of the utility of.
OpenQASM 3 supports an `output` specifier on variables for rich returns from programs. | ||
Qiskit currently doesn't have a way of representing this through its result types, and it's not clear when hardware vendors will have support for this. | ||
For ease of user expectations, it's left til later. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe I'm bending things too much. I feel the current approach is that every classical wire has an implicit output
right now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They do, and the OQ3 spec even defines that behaviour to make it consistent-ish with OQ2-era Qiskit behaviour. In practice, I shied away from defining this because I think the harder questions to answer are hardware-vendor side.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This actually touches on my primary question about this RFC, which was that I was expecting it to simultaneously address circuit outputs. At the very least, I can imagine wanting to NOT return some values that I used as intermediaries in my circuit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can certainly imagine that too. My intent with this RFC wasn't to imply that all variables will always be output
for sure. I'm fine if we'd like to discuss including output
in this RFC directly (especially now it's pushed to 1.0) - my initial concern was that I didn't want to specify an interface for something that (at least) IBM hardware definitely has no way of representing in the timescales that Qiskit could implement it.
I think it's relatively easy to imagine ways to designate output
variables in this form. Some (non-exclusive) options are
- have an
output=True
keyword onadd_var
(but notadd_capture
oradd_input
) - have an
add_output
function with the same signature asadd_var
- have an
outputs=...
field in theQuantumCircuit
constructor
The bulk of the remaining output
work will then be in the results interfaces of the primitives and backend.run
, and I think the design process might be smoother and more focussed for other affected teams if we separate that out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All of those options seem reasonable.
|
||
#### Allow `str` names in `expr` builder constructions instead of using `QuantumCircuit.get_var` to get there | ||
|
||
It feels tempting to allow things like `expr.bit_and("a", 7)` to automatically promote `a` to a `Var`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It does seem like it'd be reasonable to have some sort of expression to read from a variable of a given name in whatever context/circuit the expression is evaluated in, e.g. expr.Var("a")
. Though this would be a big diversion from the current expr.Var
which stores circuit-specific things, if I'm not mistaken.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you give any examples of what you mean? I'm having a hard time seeing an API call structure that would be sensible here.
It's not clear to me whether it should be possible for a circuit to involve variables whose names conflict in scope with the program it's being inlined into, or what if any automatic variable-renaming methods we should make available. | ||
|
||
If there are no conflicts in the variable names, it seems expected to me that the lists of input variables would catenate, and so on. | ||
Similarly, two declared `Var` instances that conflict (either by having incompatible initialisers, or one being an input, one not, etc) cannot be allowed to coexist, but does this mean that the composition should be rejected, or should we implement a renaming? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO, names defined explicitly by the user with conflicts should be an error, by default. But perhaps it's worth providing a mechanism to override this behavior. I'd think a new boolean parameter on compose
would be sufficient for this, e.g. shared_vars: Bool
. A value of True
meaning that rather than erroring-out on conflict, conflicting vars should be checked for compatibility and shared. Then, it's up to the user to first rename any conflicting vars that aren't intended to be the same prior to composition.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer the Kevin's compose
-parametrisation approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, this sounds totally reasonable to me. If there's no objections, for an initial implementation I suggest we just put in "error on conflict", and add any additional flags to implement a partial sharing once we find a compelling use case.
This adds all the new `QuantumCircuit` methods discussed in the variable-declaration RFC[^1], and threads the support for them through the methods that are called in turn, such as `QuantumCircuit.append`. It does yet not add support to methods such as `copy` or `compose`, which will be done in a follow-up. The APIs discussed in the RFC necessitated making `Var` nodes hashable. This is done in this commit, as it is logically connected. These nodes now have enforced immutability, which is technically a minor breaking change, but in practice required for the properties of such expressions to be tracked correctly through circuits. A helper attribute `Var.standalone` is added to unify the handling of whether a variable is an old-style existing-memory wrapper, or a new "proper" variable with its own memory. [^1]: Qiskit/RFCs#50
This adds all the new `QuantumCircuit` methods discussed in the variable-declaration RFC[^1], and threads the support for them through the methods that are called in turn, such as `QuantumCircuit.append`. It does yet not add support to methods such as `copy` or `compose`, which will be done in a follow-up. The APIs discussed in the RFC necessitated making `Var` nodes hashable. This is done in this commit, as it is logically connected. These nodes now have enforced immutability, which is technically a minor breaking change, but in practice required for the properties of such expressions to be tracked correctly through circuits. A helper attribute `Var.standalone` is added to unify the handling of whether a variable is an old-style existing-memory wrapper, or a new "proper" variable with its own memory. [^1]: Qiskit/RFCs#50
This commit adds support to the `QuantumCircuit` methods `copy` and `copy_empty_like` for manual variables. This involves the non-trivial extension to the original RFC[^1] that variables can now be uninitialised; this is somewhat required for the logic of how the `Store` instruction works and the existence of `QuantumCircuit.copy_empty_like`; a variable could be initialised with the result of a `measure` that no longer exists, therefore it must be possible for variables to be uninitialised. This was not originally intended to be possible in the design document, but is somewhat required for logical consistency. A method `add_uninitialized_var` is added, so that the behaviour of `copy_empty_like` is not an awkward special case only possible through that method, but instead a complete part of the data model that must be reasoned about. The method however is deliberately a bit less ergononmic to type and to use, because really users _should_ use `add_var` in almost all circumstances. [^1]: Qiskit/RFCs#50
This adds all the new `QuantumCircuit` methods discussed in the variable-declaration RFC[^1], and threads the support for them through the methods that are called in turn, such as `QuantumCircuit.append`. It does yet not add support to methods such as `copy` or `compose`, which will be done in a follow-up. The APIs discussed in the RFC necessitated making `Var` nodes hashable. This is done in this commit, as it is logically connected. These nodes now have enforced immutability, which is technically a minor breaking change, but in practice required for the properties of such expressions to be tracked correctly through circuits. A helper attribute `Var.standalone` is added to unify the handling of whether a variable is an old-style existing-memory wrapper, or a new "proper" variable with its own memory. [^1]: Qiskit/RFCs#50
This commit adds support to the `QuantumCircuit` methods `copy` and `copy_empty_like` for manual variables. This involves the non-trivial extension to the original RFC[^1] that variables can now be uninitialised; this is somewhat required for the logic of how the `Store` instruction works and the existence of `QuantumCircuit.copy_empty_like`; a variable could be initialised with the result of a `measure` that no longer exists, therefore it must be possible for variables to be uninitialised. This was not originally intended to be possible in the design document, but is somewhat required for logical consistency. A method `add_uninitialized_var` is added, so that the behaviour of `copy_empty_like` is not an awkward special case only possible through that method, but instead a complete part of the data model that must be reasoned about. The method however is deliberately a bit less ergononmic to type and to use, because really users _should_ use `add_var` in almost all circumstances. [^1]: Qiskit/RFCs#50
This commit adds support to the `QuantumCircuit` methods `copy` and `copy_empty_like` for manual variables. This involves the non-trivial extension to the original RFC[^1] that variables can now be uninitialised; this is somewhat required for the logic of how the `Store` instruction works and the existence of `QuantumCircuit.copy_empty_like`; a variable could be initialised with the result of a `measure` that no longer exists, therefore it must be possible for variables to be uninitialised. This was not originally intended to be possible in the design document, but is somewhat required for logical consistency. A method `add_uninitialized_var` is added, so that the behaviour of `copy_empty_like` is not an awkward special case only possible through that method, but instead a complete part of the data model that must be reasoned about. The method however is deliberately a bit less ergononmic to type and to use, because really users _should_ use `add_var` in almost all circumstances. [^1]: Qiskit/RFCs#50
declarations: Mapping[expr.Var, expr.Expr] | Iterable[Tuple[expr.Var, expr.Expr]] = (), | ||
): ... | ||
|
||
def add_var(self, name_or_var: str | expr.Var, / initial: expr.Expr) -> expr.Var: ... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you are missing a type: types.Type
between name_or_var
and /
.
def add_var(self, name_or_var: str | expr.Var, / initial: expr.Expr) -> expr.Var: ... | |
def add_var(self, name_or_var: str | expr.Var, type: types.Type, / initial: expr.Expr) -> expr.Var: ... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, it's deliberately not there because it's always inferable, so having it would only really offer an opportunity for desynchronisation and errors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In all your examples below you provide a type, though, e.g.
qc.add_var("mask", types.Uint(3), expr.value(5))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, thanks! That's a mistake - I originally did have a separate type
field as you have here and those examples must be left from that. The reason for it is that if name_or_var
is a Var
, then the type
is redundant information - you'd have to frequently be doing
qc.add_var(var, var.type, expr.value(...))
and we'd have encoded a requirement in the interface to provide a type, except it gives the function no information it didn't already have, yet it still has to check to validate it because the only way the argument can change the behaviour of the function is to cause it to raise an error.
It's tricky to make a function that takes either 2 or 3 arguments positionally, and have the last one (initial
) be able to be passed by keyword (which I think is valuable for explicitness). It's doable, but the interface ends up quite fiddly:
def add_var(var_or_name: Var | str, initial_or_type: Expr | Type | None = None, /, initial: Expr | None = None)
and we need a big if/elif
chain to handle all the combinations of input arguments. It's doable, but hard to get the internal logic right (which also means it can be slower) and introduces several ways of providing invalid input combinations, which is something I was trying to avoid where possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed in 1dc9c45.
0000-classical-stores.md
Outdated
qr = QuantumRegister(3) | ||
cr = ClassicalRegister(3) | ||
qc = QuantumCircuit(qr, cr) | ||
mask = qc.add_var("mask", types.Uint(3), expr.value(5)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find this a little hard to scan as an assignment uint[3] mask = 5
, but I am struggling to come up with a good alternative. I guess it does read a little better when providing the initial=
keyword.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, it's not perfect for sure. This RFC is principally meant to be the low-level methods and implementation - I think a higher order interface would be a follow up.
- There is a natural split in hardware between a `Clbit`, which is a target for a `measure` instruction and is likely written to in controllers very close to the QPU hardware, and classical variables that can be in outer control systems. | ||
For example, gathering the result of a `measure` into the particular bit of a `uint[8]` might involve more expensive transfer and synchronisation operations than adding together two `uint[8]` objects. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This distinction makes me a bit uncomfortable, particularly because when you serialize a Var
of type bit
or a Clbit
to OpenQASM they should end up indistinguishable.
Your comments about QuantumCircuit
bloat are certainly valid, but it seems like something is missing if we now have two representations for essentially the same thing. Can these be brought together to just replace Clbit
s with bit-typed Var
s?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a comment on this in 8096054. We definitely do want to remove ClassicalRegister
in the future, but at the moment it's too baked into the fabric of Qiskit and hardware offerings to have a smooth removal path. My hope is to revisit this after Qiskit 1.0 to plot a path to removing ClassicalRegister
entirely in favour of the type-safe data of the Var
nodes. Clbit
will also likely disappear there in favour of a bit-typed Var
.
For examples of some blockers on doing that work sooner:
- Removing
ClassicalRegister
needs us to get the replacement circuit-output formats designed, because at the moment it's a bit baked into the returns (e.g. by hardware inserting spaces between separate registers in returned bitstrings) - Replacing
Clbit
with a bit-typedVar
is a large technical undertaking within Qiskit that will want to build on the work done to makeExpr
nodes available in gate-angle position etc; it's part of a general larger expansion of the type system and use of typed classical data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very strong proposal, @jakelishman. My two primary pieces of feedback are inline. But, to pull them out here for visibility they are:
- With this change,
ClassicalRegister
now seems redundant with bit-typedVar
s. Can these be brought together? - What about declaring variables as
output
s? It is oddly asymmetrical to introduce support forinput
s and not also introduceoutput
s.
I've (finally) updated this RFC to respond to the comments above, thanks. For visibility on Blake's comment above:
|
@blakejohnson and @kevinhartman , if you are happy with the way your questions were addressed and you documented all your opinions on the matter, do you mind approving this PR? I will merge it after. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me.
This seems like the right size of step to take as well, and doesn't lock us into anything that isn't absolutely necessary now.
This means that all consumers of `QuantumCircuit` that care about data-flow orderings and display must know to check for this particular subclass, and include custom code to handle this. | ||
Similarly, the `DAGCircuit` would need to grow an alternative to `DAGOpNode` that could represent this with new first-class support. | ||
|
||
A strong alternative would be to have different versions of the `CircuitInstruction`/`DAGOpNode` circuit context object that better represent the "special" classical operations such as `Measure`, `Store` and the `ControlFlowOp` subclasses. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Worth noting that this would imply that CircuitData
introduced in Qiskit/qiskit#10827 would need to gain awareness of these special classical operations (i.e. via separate methods like append_store
or with subclass instance checking through its append
, in addition to internal tracking in either case).
It's certainly doable, but it's useful to know that CircuitData
(though currently private) will likely need more awareness/introspection of the instructions it stores, esp. as we oxidize more parts of QuantumCircuit
.
This adds all the new `QuantumCircuit` methods discussed in the variable-declaration RFC[^1], and threads the support for them through the methods that are called in turn, such as `QuantumCircuit.append`. It does yet not add support to methods such as `copy` or `compose`, which will be done in a follow-up. The APIs discussed in the RFC necessitated making `Var` nodes hashable. This is done in this commit, as it is logically connected. These nodes now have enforced immutability, which is technically a minor breaking change, but in practice required for the properties of such expressions to be tracked correctly through circuits. A helper attribute `Var.standalone` is added to unify the handling of whether a variable is an old-style existing-memory wrapper, or a new "proper" variable with its own memory. [^1]: Qiskit/RFCs#50
This commit adds support to the `QuantumCircuit` methods `copy` and `copy_empty_like` for manual variables. This involves the non-trivial extension to the original RFC[^1] that variables can now be uninitialised; this is somewhat required for the logic of how the `Store` instruction works and the existence of `QuantumCircuit.copy_empty_like`; a variable could be initialised with the result of a `measure` that no longer exists, therefore it must be possible for variables to be uninitialised. This was not originally intended to be possible in the design document, but is somewhat required for logical consistency. A method `add_uninitialized_var` is added, so that the behaviour of `copy_empty_like` is not an awkward special case only possible through that method, but instead a complete part of the data model that must be reasoned about. The method however is deliberately a bit less ergononmic to type and to use, because really users _should_ use `add_var` in almost all circumstances. [^1]: Qiskit/RFCs#50
This adds all the new `QuantumCircuit` methods discussed in the variable-declaration RFC[^1], and threads the support for them through the methods that are called in turn, such as `QuantumCircuit.append`. It does yet not add support to methods such as `copy` or `compose`, which will be done in a follow-up. The APIs discussed in the RFC necessitated making `Var` nodes hashable. This is done in this commit, as it is logically connected. These nodes now have enforced immutability, which is technically a minor breaking change, but in practice required for the properties of such expressions to be tracked correctly through circuits. A helper attribute `Var.standalone` is added to unify the handling of whether a variable is an old-style existing-memory wrapper, or a new "proper" variable with its own memory. [^1]: Qiskit/RFCs#50
This commit adds support to the `QuantumCircuit` methods `copy` and `copy_empty_like` for manual variables. This involves the non-trivial extension to the original RFC[^1] that variables can now be uninitialised; this is somewhat required for the logic of how the `Store` instruction works and the existence of `QuantumCircuit.copy_empty_like`; a variable could be initialised with the result of a `measure` that no longer exists, therefore it must be possible for variables to be uninitialised. This was not originally intended to be possible in the design document, but is somewhat required for logical consistency. A method `add_uninitialized_var` is added, so that the behaviour of `copy_empty_like` is not an awkward special case only possible through that method, but instead a complete part of the data model that must be reasoned about. The method however is deliberately a bit less ergononmic to type and to use, because really users _should_ use `add_var` in almost all circumstances. [^1]: Qiskit/RFCs#50
…0974) * Add definition of `Store` instruction This does not yet add the implementation of `QuantumCircuit.store`, which will come later as part of expanding the full API of `QuantumCircuit` to be able to support these runtime variables. The `is_lvalue` helper is added generally to the `classical.expr` module because it's generally useful, while `types.cast_kind` is moved from being a private method in `expr` to a public-API function so `Store` can use it. These now come with associated unit tests. * Add variable-handling methods to `QuantumCircuit` This adds all the new `QuantumCircuit` methods discussed in the variable-declaration RFC[^1], and threads the support for them through the methods that are called in turn, such as `QuantumCircuit.append`. It does yet not add support to methods such as `copy` or `compose`, which will be done in a follow-up. The APIs discussed in the RFC necessitated making `Var` nodes hashable. This is done in this commit, as it is logically connected. These nodes now have enforced immutability, which is technically a minor breaking change, but in practice required for the properties of such expressions to be tracked correctly through circuits. A helper attribute `Var.standalone` is added to unify the handling of whether a variable is an old-style existing-memory wrapper, or a new "proper" variable with its own memory. [^1]: Qiskit/RFCs#50 * Support manual variables `QuantumCircuit` copy methods This commit adds support to the `QuantumCircuit` methods `copy` and `copy_empty_like` for manual variables. This involves the non-trivial extension to the original RFC[^1] that variables can now be uninitialised; this is somewhat required for the logic of how the `Store` instruction works and the existence of `QuantumCircuit.copy_empty_like`; a variable could be initialised with the result of a `measure` that no longer exists, therefore it must be possible for variables to be uninitialised. This was not originally intended to be possible in the design document, but is somewhat required for logical consistency. A method `add_uninitialized_var` is added, so that the behaviour of `copy_empty_like` is not an awkward special case only possible through that method, but instead a complete part of the data model that must be reasoned about. The method however is deliberately a bit less ergononmic to type and to use, because really users _should_ use `add_var` in almost all circumstances. [^1]: Qiskit/RFCs#50 * Ensure `QuantumCircuit.append` validates captures in control-flow This adds an inner check to the control-flow operations that their blocks do not contain input variables, and to `QuantumCircuit.append` that any captures within blocks are validate (in the sense of the variables existing in the outer circuit). In order to avoid an `import` on every call to `QuantumCircuit.append` (especially since we're already eating the cost of an extra `isinstance` check), this reorganises the import structure of `qiskit.circuit.controlflow` to sit strictly _before_ `qiskit.circuit.quantumcircuit` in the import tree. Since those are key parts of the circuit data structure, that does make sense, although by their nature the structures are of course recursive at runtime. * Update documentation Co-authored-by: Matthew Treinish <mtreinish@kortar.org> * Catch simple error case in '_prepare_new_var' Co-authored-by: Matthew Treinish <mtreinish@kortar.org> * Add partial release note --------- Co-authored-by: Matthew Treinish <mtreinish@kortar.org>
This is a proposal and design comments on how to begin adding the representations of declared classical variables and stores to Qiskit, including input variables, which correspond to the
input
keyword in OpenQASM 3, and can be used to define parametric input of arbitrary types into Qiskit.For now, this RFC does not propose expanding Qiskit's classical type system, so only
Bool
andUint
remain valid, and these parameters cannot yet be used in gate-argument positions. This is still strongly desired, just not for this RFC.Apologies if there's any thoughts that trail off, or places that need expansion - I needed to push this now so people can start to read it, rather than keeping it all local for longer. It's also a bit wordier and less organised than I'd like because I haven't really gone back through and edited it.