Merge branch 'main' into pythongh-127411-add-cast
mpage committed Dec 1, 2024
2 parents 69dd507 + 04673d2 commit 9c49e70
Showing 25 changed files with 303 additions and 229 deletions.
2 changes: 1 addition & 1 deletion Doc/Makefile
@@ -144,7 +144,7 @@ pydoc-topics: build

.PHONY: gettext
gettext: BUILDER = gettext
gettext: SPHINXOPTS += -d build/doctrees-gettext
gettext: override SPHINXOPTS := -d build/doctrees-gettext $(SPHINXOPTS)
gettext: build

.PHONY: htmlview
48 changes: 23 additions & 25 deletions Doc/c-api/type.rst
@@ -529,19 +529,19 @@ The following functions and structs are used to create
The following “offset” fields cannot be set using :c:type:`PyType_Slot`:
* :c:member:`~PyTypeObject.tp_weaklistoffset`
(use :c:macro:`Py_TPFLAGS_MANAGED_WEAKREF` instead if possible)
* :c:member:`~PyTypeObject.tp_dictoffset`
(use :c:macro:`Py_TPFLAGS_MANAGED_DICT` instead if possible)
* :c:member:`~PyTypeObject.tp_vectorcall_offset`
(use ``"__vectorcalloffset__"`` in
:ref:`PyMemberDef <pymemberdef-offsets>`)
If it is not possible to switch to a ``MANAGED`` flag (for example,
for vectorcall or to support Python older than 3.12), specify the
offset in :c:member:`Py_tp_members <PyTypeObject.tp_members>`.
See :ref:`PyMemberDef documentation <pymemberdef-offsets>`
for details.
* :c:member:`~PyTypeObject.tp_weaklistoffset`
(use :c:macro:`Py_TPFLAGS_MANAGED_WEAKREF` instead if possible)
* :c:member:`~PyTypeObject.tp_dictoffset`
(use :c:macro:`Py_TPFLAGS_MANAGED_DICT` instead if possible)
* :c:member:`~PyTypeObject.tp_vectorcall_offset`
(use ``"__vectorcalloffset__"`` in
:ref:`PyMemberDef <pymemberdef-offsets>`)
If it is not possible to switch to a ``MANAGED`` flag (for example,
for vectorcall or to support Python older than 3.12), specify the
offset in :c:member:`Py_tp_members <PyTypeObject.tp_members>`.
See :ref:`PyMemberDef documentation <pymemberdef-offsets>`
for details.
The following internal fields cannot be set at all when creating a heap
type:
@@ -557,20 +557,18 @@ The following functions and structs are used to create
To avoid issues, use the *bases* argument of
:c:func:`PyType_FromSpecWithBases` instead.
.. versionchanged:: 3.9
Slots in :c:type:`PyBufferProcs` may be set in the unlimited API.
.. versionchanged:: 3.9
Slots in :c:type:`PyBufferProcs` may be set in the unlimited API.
.. versionchanged:: 3.11
:c:member:`~PyBufferProcs.bf_getbuffer` and
:c:member:`~PyBufferProcs.bf_releasebuffer` are now available
under the :ref:`limited API <limited-c-api>`.
.. versionchanged:: 3.11
:c:member:`~PyBufferProcs.bf_getbuffer` and
:c:member:`~PyBufferProcs.bf_releasebuffer` are now available
under the :ref:`limited API <limited-c-api>`.
.. versionchanged:: 3.14
The field :c:member:`~PyTypeObject.tp_vectorcall` can now be set
using ``Py_tp_vectorcall``. See the field's documentation
for details.
.. versionchanged:: 3.14
The field :c:member:`~PyTypeObject.tp_vectorcall` can now be set
using ``Py_tp_vectorcall``. See the field's documentation
for details.
.. c:member:: void *pfunc
1 change: 0 additions & 1 deletion InternalDocs/README.md
@@ -1,4 +1,3 @@

# CPython Internals Documentation

The documentation in this folder is intended for CPython maintainers.
8 changes: 6 additions & 2 deletions InternalDocs/adaptive.md
@@ -96,6 +96,7 @@ quality of specialization and keeping the overhead of specialization low.
Specialized instructions must be fast. In order to be fast,
specialized instructions should be tailored for a particular
set of values that allows them to:

1. Verify that the incoming value is part of that set with low overhead.
2. Perform the operation quickly.

@@ -107,9 +108,11 @@ For example, `LOAD_GLOBAL_MODULE` is specialized for `globals()`
dictionaries that have keys with the expected version.

This can be tested quickly:

* `globals->keys->dk_version == expected_version`

and the operation can be performed quickly:

* `value = entries[cache->index].me_value;`.

Because it is impossible to measure the performance of an instruction without
@@ -122,10 +125,11 @@ base instruction.
### Implementation of specialized instructions

In general, specialized instructions should be implemented in two parts:

1. A sequence of guards, each of the form
`DEOPT_IF(guard-condition-is-false, BASE_NAME)`.
`DEOPT_IF(guard-condition-is-false, BASE_NAME)`.
2. The operation, which should ideally have no branches and
a minimum number of dependent memory accesses.
a minimum number of dependent memory accesses.

In practice, the parts may overlap, as data required for guards
can be re-used in the operation.
4 changes: 2 additions & 2 deletions InternalDocs/changing_grammar.md
@@ -32,7 +32,7 @@ Below is a checklist of things that may need to change.
[`Include/internal/pycore_ast.h`](../Include/internal/pycore_ast.h) and
[`Python/Python-ast.c`](../Python/Python-ast.c).

* [`Parser/lexer/`](../Parser/lexer/) contains the tokenization code.
* [`Parser/lexer/`](../Parser/lexer) contains the tokenization code.
This is where you would add a new type of comment or string literal, for example.

* [`Python/ast.c`](../Python/ast.c) will need changes to validate AST objects
@@ -60,4 +60,4 @@ Below is a checklist of things that may need to change.
to the tokenizer.

* Documentation must be written! Specifically, one or more of the pages in
[`Doc/reference/`](../Doc/reference/) will need to be updated.
[`Doc/reference/`](../Doc/reference) will need to be updated.
87 changes: 61 additions & 26 deletions InternalDocs/code_objects.md
@@ -1,4 +1,3 @@

# Code objects

A `CodeObject` is a builtin Python type that represents a compiled executable,
@@ -43,7 +42,7 @@ so a compact format is very important.
Note that traceback objects don't store all this information -- they store the start line
number, for backward compatibility, and the "last instruction" value.
The rest can be computed from the last instruction (`tb_lasti`) with the help of the
locations table. For Python code, there is a convenience method
locations table. For Python code, there is a convenience method
[`codeobject.co_positions`](https://docs.python.org/dev/reference/datamodel.html#codeobject.co_positions)
which returns an iterator of `({line}, {endline}, {column}, {endcolumn})` tuples,
one per instruction.
@@ -75,9 +74,11 @@ returned by the `co_positions()` iterator.
> See [`Objects/lnotab_notes.txt`](../Objects/lnotab_notes.txt) for more details.
`co_linetable` consists of a sequence of location entries.
Each entry starts with a byte with the most significant bit set, followed by zero or more bytes with the most significant bit unset.
Each entry starts with a byte with the most significant bit set, followed by
zero or more bytes with the most significant bit unset.

Each entry contains the following information:

* The number of code units covered by this entry (length)
* The start line
* The end line
@@ -86,54 +87,88 @@ Each entry contains the following information:

The first byte has the following format:

Bit 7 | Bits 3-6 | Bits 0-2
---- | ---- | ----
1 | Code | Length (in code units) - 1
| Bit 7 | Bits 3-6 | Bits 0-2 |
|-------|----------|----------------------------|
| 1 | Code | Length (in code units) - 1 |

The codes are enumerated in the `_PyCodeLocationInfoKind` enum.
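
As an illustration of the first-byte layout described above, the following minimal sketch splits a first byte into its code and length. The name `parse_first_byte` is hypothetical and is not part of CPython.

```python
def parse_first_byte(b):
    """Split the first byte of a location entry into (code, length).

    Hypothetical helper for illustration only.
    """
    if not b & 0x80:
        raise ValueError("the first byte of an entry must have bit 7 set")
    code = (b >> 3) & 0x0F   # bits 3-6
    length = (b & 0x07) + 1  # bits 0-2 store the length minus one
    return code, length


# 0xF2 == 0b1_1110_010: bit 7 set, code 14, stored length field 2
print(parse_first_byte(0xF2))
```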

## Variable-length integer encodings
### Variable-length integer encodings

Integers are often encoded using a variable-length integer encoding
Integers are often encoded using a variable length integer encoding

### Unsigned integers (`varint`)
#### Unsigned integers (`varint`)

Unsigned integers are encoded in 6-bit chunks, least significant first.
Each chunk but the last has bit 6 set.
For example:

* 63 is encoded as `0x3f`
* 200 is encoded as `0x48`, `0x03`
* 200 is encoded as `0x48`, `0x03`: the low six bits are `0x08`, emitted as `0x48` with bit 6 set, and `200 = (0x03 << 6) | 0x08`.

The following helper can be used to convert an integer into a `varint`:

```py
def encode_varint(s):
    ret = []
    while s >= 64:
        # Low six bits, with bit 6 set to mark that more chunks follow.
        ret.append((s & 0x3F) | 0x40)
        s >>= 6
    # The last chunk has bit 6 clear.
    ret.append(s & 0x3F)
    return bytes(ret)
```

To convert a `varint` into an unsigned integer:

```py
def decode_varint(chunks):
    ret = 0
    for chunk in reversed(chunks):
        # Drop the continuation bit (bit 6) before accumulating.
        ret = (ret << 6) | (chunk & 0x3F)
    return ret
```
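
When a `varint` is embedded in a longer byte sequence, its end has to be found from the continuation bit. The sketch below, assuming the chunk format described above, reads one `varint` starting at a given offset and returns the value together with the offset of the next byte. The name `read_varint` is hypothetical.

```python
def read_varint(data, pos=0):
    """Read one varint from data starting at pos.

    Returns (value, next_pos). Hypothetical helper for illustration only.
    """
    value = 0
    shift = 0
    while True:
        chunk = data[pos]
        pos += 1
        value |= (chunk & 0x3F) << shift  # six payload bits per chunk
        shift += 6
        if not chunk & 0x40:              # bit 6 clear: this was the last chunk
            return value, pos


print(read_varint(bytes([0x48, 0x03, 0x3F])))  # reads only the first varint
```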

### Signed integers (`svarint`)
#### Signed integers (`svarint`)

Signed integers are encoded by converting them to unsigned integers, using the following function:
```Python
def convert(s):

```py
def svarint_to_varint(s):
if s < 0:
return ((-s)<<1) | 1
return ((-s) << 1) | 1
else:
return (s<<1)
return s << 1
```

To convert a `varint` into a signed integer:

```py
def varint_to_svarint(uval):
    return -(uval >> 1) if uval & 1 else (uval >> 1)
```
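
The mapping keeps the sign in bit 0 and the magnitude in the remaining bits. The following self-contained sketch repeats the two conversions so that it runs on its own, and checks that they are inverses for a few sample values.

```python
def svarint_to_varint(s):
    # The sign goes in bit 0; the magnitude is shifted up by one bit.
    if s < 0:
        return ((-s) << 1) | 1
    return s << 1


def varint_to_svarint(uval):
    magnitude = uval >> 1
    return -magnitude if uval & 1 else magnitude


for s in (0, 1, -1, 2, -2, 100, -100):
    u = svarint_to_varint(s)
    assert varint_to_svarint(u) == s  # round trip
    print(s, "->", u)
```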

*Location entries*
### Location entries

The meaning of the codes and the following bytes are as follows:

Code | Meaning | Start line | End line | Start column | End column
---- | ---- | ---- | ---- | ---- | ----
0-9 | Short form | Δ 0 | Δ 0 | See below | See below
10-12 | One line form | Δ (code - 10) | Δ 0 | unsigned byte | unsigned byte
13 | No column info | Δ svarint | Δ 0 | None | None
14 | Long form | Δ svarint | Δ varint | varint | varint
15 | No location | None | None | None | None
| Code | Meaning | Start line | End line | Start column | End column |
|-------|----------------|---------------|----------|---------------|---------------|
| 0-9 | Short form | Δ 0 | Δ 0 | See below | See below |
| 10-12 | One line form | Δ (code - 10) | Δ 0 | unsigned byte | unsigned byte |
| 13 | No column info | Δ svarint | Δ 0 | None | None |
| 14 | Long form | Δ svarint | Δ varint | varint | varint |
| 15 | No location | None | None | None | None |

The Δ means the value is encoded as a delta from another value:

* Start line: Delta from the previous start line, or `co_firstlineno` for the first entry.
* End line: Delta from the start line
* End line: Delta from the start line.

### The short forms

*The short forms*
Codes 0-9 are the short forms. The short form consists of two bytes,
the second byte holding additional column information. The code is the
start column divided by 8 (and rounded down).

Codes 0-9 are the short forms. The short form consists of two bytes, the second byte holding additional column information. The code is the start column divided by 8 (and rounded down).
* Start column: `(code*8) + ((second_byte>>4)&7)`
* End column: `start_column + (second_byte&15)`
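
The two column formulas above can be sketched as a small decoder for a short-form entry. This is an illustration under the layout described in this section, not the CPython implementation; the name `decode_short_form` is hypothetical.

```python
def decode_short_form(first_byte, second_byte):
    """Decode a two-byte short-form entry.

    Returns (length, start_column, end_column).
    Hypothetical helper for illustration only.
    """
    code = (first_byte >> 3) & 0x0F
    if code > 9:
        raise ValueError("codes 0-9 are the short forms")
    length = (first_byte & 0x07) + 1
    start_column = (code * 8) + ((second_byte >> 4) & 7)
    end_column = start_column + (second_byte & 15)
    return length, start_column, end_column


# Code 1, one code unit, second byte 0x35: column offset 3, width 5
print(decode_short_form(0x88, 0x35))
```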