We've been talking recently about how we might want to change the format of our instructions in different ways (variable-length instructions, wider opargs, compression of serialized forms, etc.). I think it's useful to consider all of the different forms that bytecode takes throughout a typical Python process when discussing these ideas.
The lifecycle of a string of bytecode (opcodes, opargs, and caches) currently looks something like this:
graph TB
COMPILER((compiler))
USER((user))
RAW[raw bytes]
MARSHALLED[marshalled bytes]
FROZEN[frozen bytes]
PYC[.pyc file]
DEEPFROZEN[deep-frozen code tail]
HEAP["executable code tail (heap)"]
STATIC["executable code tail (static)"]
style RAW fill:blue
style MARSHALLED fill:blue
style FROZEN fill:blue
style PYC fill:blue
style DEEPFROZEN fill:blue
style STATIC fill:red
style HEAP fill:red
COMPILER --> |compile| RAW
RAW -----> |disassemble| USER
MARSHALLED --> |cache| PYC
PYC --> |import| MARSHALLED
MARSHALLED --> |unmarshal| RAW
RAW --> |marshal| MARSHALLED
MARSHALLED -.-> |freeze| FROZEN
FROZEN -.-> |deepfreeze| DEEPFROZEN
DEEPFROZEN --> |quicken| STATIC
STATIC --> |unquicken| DEEPFROZEN
STATIC --> |copy + unquicken| RAW
FROZEN --> |import| MARSHALLED
RAW -----> |copy + quicken| HEAP
HEAP -----> |copy + unquicken| RAW
The boxes in red are quickened forms, while the boxes in blue are unquickened forms. Quickening (`_PyCode_Quicken`) currently initializes adaptive counters and inserts superinstructions. Unquickening (`deopt_code`) removes superinstructions, converts other instructions back to their adaptive form, and zeroes out all caches (including counters).
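As a rough illustration of the "copy + unquicken" edge from the executable tail back to raw bytes, the sketch below reads `co_code` before and after warming up a function. It assumes a CPython build where `co_code` returns a deoptimized copy (3.11 or later); the function `add` and the iteration count are arbitrary choices made for this example.

```python
import dis
import sys


def add(a, b):
    return a + b


# Snapshot the user-visible bytecode before the function has ever run.
before = add.__code__.co_code

# Run the function often enough that a specializing interpreter has had
# the chance to rewrite instructions in the executable tail.
for _ in range(10_000):
    add(1, 2)

# Take a second snapshot; co_code is a deoptimized copy, so the two
# snapshots are expected to match even if the tail was specialized.
after = add.__code__.co_code
print(before == after)

# On 3.11+, dis can show the in-memory (possibly specialized) form instead.
if sys.version_info >= (3, 11):
    names = [ins.opname for ins in dis.get_instructions(add, adaptive=True)]
    print(names)
```

The opcode names printed by the last step vary between CPython versions, so they should not be relied on.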
Let's remove frozen and cached modules, for simplicity (they're basically just marshalled bytes):
graph TB
COMPILER((compiler))
USER((user))
RAW[raw bytes]
MARSHALLED[marshalled bytes]
DEEPFROZEN[deep-frozen code tail]
HEAP["executable code tail (heap)"]
STATIC["executable code tail (static)"]
style RAW fill:blue
style MARSHALLED fill:blue
style DEEPFROZEN fill:blue
style STATIC fill:red
style HEAP fill:red
COMPILER --> |compile| RAW
RAW ----> |disassemble| USER
MARSHALLED --> |unmarshal| RAW
RAW --> |marshal| MARSHALLED
MARSHALLED -.-> |deepfreeze| DEEPFROZEN
DEEPFROZEN --> |quicken| STATIC
STATIC --> |unquicken| DEEPFROZEN
STATIC --> |copy + unquicken| RAW
RAW ----> |copy + quicken| HEAP
HEAP ----> |copy + unquicken| RAW
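The unquickened edges of this simplified graph can be walked from pure Python. The following is a minimal sketch using only the standard `marshal` and `dis` modules; the source string and the `<example>` filename are made up for the illustration, and the deepfreeze and quickening edges are not reachable this way.

```python
import dis
import marshal

source = "x = 1 + 2\n"

# compiler -> raw bytes (held by a code object)
code = compile(source, "<example>", "exec")

# raw bytes -> marshalled bytes (the payload a .pyc stores after its header)
payload = marshal.dumps(code)

# marshalled bytes -> raw bytes
restored = marshal.loads(payload)

# The round trip is expected to preserve the bytecode exactly.
print(restored.co_code == code.co_code)

# raw bytes -> user
dis.dis(restored)

# The restored code object is still executable.
namespace = {}
exec(restored, namespace)
print(namespace["x"])
```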
Some observations:
- It would simplify things a lot (especially deepfreeze) if we didn't have a concept of "quickening" or "unquickening". Perhaps a more useful model would be the ability to "reset" code to its initial quickened form, for consumers of `co_code` and finalization of deepfrozen code objects. This means that superinstructions and non-zero counters would be present in `co_code`, but no specialized instructions or other populated caches. If we do this, we only have one idempotent transformation that can be applied to the bytecode, and what we currently call "quickening" can be entirely encapsulated in the compiler, where it belongs (not even `marshal` or code objects need to understand it). If so, the new graph would be roughly:
graph TB
COMPILER((compiler))
USER((user))
RAW[raw bytes]
MARSHALLED[marshalled bytes]
HEAP["executable code tail (heap)"]
STATIC["executable code tail (static)"]
style RAW fill:red
style MARSHALLED fill:red
style STATIC fill:red
style HEAP fill:red
COMPILER --> |compile| RAW
RAW ---> |disassemble| USER
MARSHALLED --> |unmarshal| RAW
RAW --> |marshal| MARSHALLED
MARSHALLED -.-> |deepfreeze| STATIC
STATIC --> |reset| STATIC
STATIC --> |copy + reset| RAW
RAW ---> |copy + reset| HEAP
HEAP ---> |copy + reset| RAW
At this point, there's not really any difference between static and heap code (we just need to reset static code at finalization):
graph TB
COMPILER((compiler))
USER((user))
RAW[raw bytes]
MARSHALLED[marshalled bytes]
HEAP[executable code tail]
style RAW fill:red
style MARSHALLED fill:red
style HEAP fill:red
COMPILER --> |compile| RAW
RAW ---> |disassemble| USER
MARSHALLED --> |unmarshal| RAW
RAW --> |marshal| MARSHALLED
MARSHALLED -.-> |deepfreeze| HEAP
HEAP --> |reset| HEAP
RAW --> |copy + reset| HEAP
HEAP --> |copy + reset| RAW
- While it's an open question whether marshal should have an intimate knowledge of the bytecode format for compression purposes, it's certainly desirable to at least marshal the bytecode directly in and out of the code object's tail (and not through an intermediate `bytes` object):
graph TB
COMPILER((compiler))
USER((user))
RAW[raw bytes]
MARSHALLED[marshalled bytes]
HEAP[executable code tail]
style RAW fill:red
style MARSHALLED fill:red
style HEAP fill:red
COMPILER --> |compile| RAW
RAW ---> |disassemble| USER
MARSHALLED --> |unmarshal| HEAP
HEAP --> |marshal + reset| MARSHALLED
MARSHALLED -.-> |deepfreeze| HEAP
HEAP --> |reset| HEAP
RAW --> |copy + reset| HEAP
HEAP --> |copy + reset| RAW
If marshal has a way of building code without an intermediate `bytes` object, then the compiler does too:
graph TB
COMPILER((compiler))
USER((user))
RAW[raw bytes]
MARSHALLED[marshalled bytes]
HEAP[executable code tail]
style RAW fill:red
style MARSHALLED fill:red
style HEAP fill:red
COMPILER --> |compile| HEAP
MARSHALLED --> |unmarshal| HEAP
HEAP --> |marshal + reset| MARSHALLED
MARSHALLED -.-> |deepfreeze| HEAP
HEAP --> |reset| HEAP
HEAP --> |copy + reset| RAW
RAW --> |disassemble| USER
So, by changing these two relatively minor things, it seems that we can simplify our handling of the bytecode quite a bit.
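To make the proposed "reset" operation more concrete, here is a toy model of it in Python. Everything in it is hypothetical: the `Instr` record, the `DESPECIALIZE` table, and the `INITIAL_COUNTER` value are stand-ins invented for this sketch and do not mirror CPython's actual data structures. The point is only to show the intended properties: superinstructions survive, specialized instructions fall back to their adaptive form, counters return to their initial non-zero value, other cache entries are zeroed, and applying the operation twice changes nothing.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple

# Hypothetical mapping from specialized opcodes back to their adaptive base.
DESPECIALIZE = {
    "BINARY_OP_ADD_INT": "BINARY_OP",
    "LOAD_ATTR_INSTANCE_VALUE": "LOAD_ATTR",
}

# Hypothetical starting value for adaptive counters (not CPython's real one).
INITIAL_COUNTER = 8


@dataclass(frozen=True)
class Instr:
    opname: str
    oparg: int = 0
    counter: Optional[int] = None  # None: no adaptive counter
    cache: Tuple[int, ...] = ()    # cache entries other than the counter


def reset(code):
    """Return a copy of `code` in its initial quickened form."""
    out = []
    for ins in code:
        out.append(replace(
            ins,
            # Specialized forms revert; superinstructions are left alone.
            opname=DESPECIALIZE.get(ins.opname, ins.opname),
            # Counters go back to their initial value rather than zero.
            counter=None if ins.counter is None else INITIAL_COUNTER,
            # All other cache entries are cleared.
            cache=(0,) * len(ins.cache),
        ))
    return tuple(out)


warm = (
    Instr("LOAD_FAST__LOAD_FAST", 0),
    Instr("BINARY_OP_ADD_INT", 0, counter=53),
    Instr("LOAD_ATTR_INSTANCE_VALUE", 1, counter=0, cache=(0x1234, 7)),
)

fresh = reset(warm)
print([ins.opname for ins in fresh])

# Idempotence: resetting already-reset code is a no-op.
print(reset(fresh) == fresh)
```

Because the model has only this one transformation, the same `reset` could serve both consumers of `co_code` and finalization of static code, which is the simplification argued for above.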