The MOJO File Format
This page describes version 3 of Austin's MOJO binary file format. At a high level, it consists of a stream of events that correspond to the various parts of Austin's default output format. Each MOJO file has a header with a magic sequence of bytes and a version number.
There are only two fundamental data types used by the MOJO format: strings and varints. A string is a null-terminated sequence of bytes. A varint is a variable-size integer encoded as follows: the most significant bit of each byte indicates whether another byte follows. The second most significant bit of the first byte is the sign bit. Hence, only the 6 least significant bits of the first byte contribute to the value of the integer, whereas each subsequent byte contributes 7 bits. For example, the byte sequence `C3 02` encodes the integer -131: the first byte has the continuation and sign bits set and contributes 3, and the second byte contributes 2 × 64 = 128, so the bytes run from least to most significant.
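The varint rules above can be sketched in Python as follows. This is a minimal illustration rather than Austin's own implementation: the function names are hypothetical, and the byte order (least significant group first) is inferred from the `C3 02` example.

```python
from typing import Tuple


def encode_varint(n: int) -> bytes:
    """Encode a signed integer as a MOJO varint (illustrative helper)."""
    sign = 0x40 if n < 0 else 0
    n = abs(n)
    # First byte: continuation bit (0x80), sign bit (0x40), 6 value bits.
    first = sign | (n & 0x3F)
    n >>= 6
    if n:
        first |= 0x80
    out = bytearray([first])
    # Subsequent bytes: continuation bit plus 7 value bits each.
    while n:
        byte = n & 0x7F
        n >>= 7
        if n:
            byte |= 0x80
        out.append(byte)
    return bytes(out)


def decode_varint(data: bytes, offset: int = 0) -> Tuple[int, int]:
    """Decode a varint starting at offset; return (value, next_offset)."""
    byte = data[offset]
    offset += 1
    negative = bool(byte & 0x40)
    value = byte & 0x3F
    shift = 6
    while byte & 0x80:  # continuation bit set: another byte follows
        byte = data[offset]
        offset += 1
        value |= (byte & 0x7F) << shift
        shift += 7
    return (-value if negative else value), offset
```

With these helpers, `decode_varint(bytes.fromhex("C302"))` yields the value -131 and the offset 2, and `encode_varint(-131)` gives back the same two bytes.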
A MOJO file starts with the byte sequence

Byte 0 .. 2 | Byte 3 .. |
---|---|
`MOJ` | version varint |

that is, the byte sequence `MOJ` followed by the varint encoding of the format version. The initial version is 1. The latest version is 3.
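As a rough sketch of how a reader might validate this header, the snippet below checks the magic bytes and then reads the version varint. The helper names `read_varint` and `read_header` are hypothetical, and the code assumes that `MOJ` means the three ASCII bytes `M`, `O`, `J`.

```python
import io
from typing import BinaryIO


def read_varint(stream: BinaryIO) -> int:
    """Read one MOJO varint from a binary stream (illustrative helper)."""
    def next_byte() -> int:
        chunk = stream.read(1)
        if not chunk:
            raise EOFError("truncated varint")
        return chunk[0]

    byte = next_byte()
    negative = bool(byte & 0x40)   # sign bit, first byte only
    value = byte & 0x3F            # 6 value bits in the first byte
    shift = 6
    while byte & 0x80:             # continuation bit
        byte = next_byte()
        value |= (byte & 0x7F) << shift
        shift += 7
    return -value if negative else value


def read_header(stream: BinaryIO) -> int:
    """Check the magic bytes and return the format version."""
    magic = stream.read(3)
    if magic != b"MOJ":  # assumption: the magic is ASCII "MOJ"
        raise ValueError(f"not a MOJO file: magic bytes are {magic!r}")
    return read_varint(stream)


# A version 3 header is the magic followed by the single byte 0x03.
version = read_header(io.BytesIO(b"MOJ\x03"))
```

A reader would typically compare the returned version against the versions it supports before parsing any events.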
Each event has the following structure
Byte 0 | Byte 1 .. n |
---|---|
Event ID | Event data |
Note that some events might not have any additional event data. The currently supported event IDs are listed in the following table.
Event ID | Name | Event Data | Description |
---|---|---|---|
0 | | | Reserved. |
1 | Metadata | `key: string`, `value: string` | Metadata key-value pair, e.g. Austin version, detected Python version, sampling metrics, etc. |
2 | Stack | `pid: varint`, `iid: varint`, `tid: string` | This signals the beginning of a frame stack. The event data includes the PID, the interpreter ID, and the thread ID. Every new stack event signals the end of the previous stack (if any) and the beginning of a new one. |
3 | Frame | `key: varint`, `filename_key: varint`, `scope_key: varint`, `line: varint`, `line_end: varint`, `column: varint`, `column_end: varint` | This event carries information about a frame. The event data consists of the frame key identifier, two string references for the file path and the function name respectively, and the location information. The location information consists of 4 numbers: the start and end line, and the start and end column. A value of 0 indicates that no information is available for that location value. |
4 | Invalid frame | | Emitted when an invalid frame is detected. |
5 | Frame reference | `frame_key: varint` | A reference to a frame by key identifier. These events define the actual frame content of frame stacks. |
6 | Kernel frame | `symbol: string` | Emitted by the austinp variant to report a kernel frame. The event data is a single string with the name of the kernel symbol. |
7 | Garbage collector | | Emitted if the garbage collector is running while sampling with the garbage collector option. |
8 | Idle stack | | Emitted if the stack is idle (only in full mode). |
9 | Time metric | `value: varint` | A time delta in microseconds. |
10 | Memory metric | `value: varint` | A memory delta in bytes. |
11 | String event | `key: varint`, `value: string` | This is a pair of key followed by a literal string. Used to provide a mapping between a string and a string reference, which is used to reduce redundancy. |
12 | String reference | `string_key: varint` | A reference to a string by key. |
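To illustrate how the event table maps onto a parser, here is a minimal sketch that reads events from a stream positioned just after the header. The `EVENTS` layout is transcribed from the table above; the helper names, the short event labels, and the sample bytes are made up for illustration and are not part of Austin. Strings are returned as raw bytes, since the format only defines them as null-terminated byte sequences.

```python
import io
from typing import BinaryIO, Iterator, List, Tuple

# Event ID -> (label, field layout); "v" is a varint, "s" a string.
EVENTS = {
    1: ("metadata", "ss"),
    2: ("stack", "vvs"),
    3: ("frame", "vvvvvvv"),
    4: ("invalid_frame", ""),
    5: ("frame_ref", "v"),
    6: ("kernel_frame", "s"),
    7: ("gc", ""),
    8: ("idle", ""),
    9: ("time", "v"),
    10: ("memory", "v"),
    11: ("string", "vs"),
    12: ("string_ref", "v"),
}


def read_byte(stream: BinaryIO) -> int:
    chunk = stream.read(1)
    if not chunk:
        raise EOFError("unexpected end of MOJO data")
    return chunk[0]


def read_varint(stream: BinaryIO) -> int:
    byte = read_byte(stream)
    negative = bool(byte & 0x40)
    value, shift = byte & 0x3F, 6
    while byte & 0x80:
        byte = read_byte(stream)
        value |= (byte & 0x7F) << shift
        shift += 7
    return -value if negative else value


def read_string(stream: BinaryIO) -> bytes:
    out = bytearray()
    while (byte := read_byte(stream)) != 0:  # stop at the null terminator
        out.append(byte)
    return bytes(out)


def read_events(stream: BinaryIO) -> Iterator[Tuple[str, List[object]]]:
    """Yield (label, fields) pairs until the stream is exhausted."""
    while True:
        head = stream.read(1)
        if not head:
            return  # clean end of stream between events
        event_id = head[0]
        if event_id not in EVENTS:
            raise ValueError(f"unknown or reserved event ID {event_id}")
        label, layout = EVENTS[event_id]
        fields = [
            read_varint(stream) if kind == "v" else read_string(stream)
            for kind in layout
        ]
        yield label, fields


# Made-up sample: a metadata pair, a stack, a frame reference, a time metric.
sample = b"\x01mode\x00wall\x00" + b"\x02\x01\x00a\x00" + b"\x05\x05" + b"\x09\xa4\x01"
events = list(read_events(io.BytesIO(sample)))
```

Driving the parser from a layout table keeps the code close to the specification: supporting a new event ID would only require a new entry in `EVENTS`.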
Reference events, like frame and string events, make use of varint keys. To further reduce the size of MOJO files, these keys are chosen so that their varint encoding takes at most 4 bytes, for a total of 2^27 (~134 M) possible values.
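The 2^27 figure follows from the varint layout: 6 value bits in the first byte plus 7 in each of the three remaining bytes. The short sketch below spells out that arithmetic; the names are hypothetical, and the range check assumes keys are non-negative, which the format description does not state explicitly.

```python
def varint_value_bits(n_bytes: int) -> int:
    """Number of value bits available in a varint of n_bytes bytes."""
    if n_bytes < 1:
        raise ValueError("a varint occupies at least one byte")
    # 6 bits in the first byte, 7 bits in every subsequent byte.
    return 6 + 7 * (n_bytes - 1)


MAX_KEY_BYTES = 4
KEY_SPACE = 1 << varint_value_bits(MAX_KEY_BYTES)  # 2**27 = 134_217_728


def key_fits(key: int) -> bool:
    """True if key encodes in at most 4 bytes (assumes non-negative keys)."""
    return 0 <= key < KEY_SPACE
```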
- The Stack event now carries sub-interpreter identification information.