bpo-40059: Add tomllib (PEP-680) (GH-31498)
This adds a new standard library module, `tomllib`, for parsing TOML.

The implementation is based on Tomli (https://github.com/hukkin/tomli).

## Steps taken (converting `tomli` to `tomllib`)

- Move everything in `tomli:src/tomli` to `Lib/tomllib`. Exclude `py.typed`.
- Remove `__version__ = ...` line from `Lib/tomllib/__init__.py`
- Move everything in `tomli:tests` to `Lib/test/test_tomllib`. Exclude the following test data dirs recursively:
  - `tomli:tests/data/invalid/_external/`
  - `tomli:tests/data/valid/_external/`
- Create `Lib/test/test_tomllib/__main__.py`:

  ```python
  import unittest

  from . import load_tests


  unittest.main()
  ```


- Add the following to `Lib/test/test_tomllib/__init__.py`:

  ```python
  import os
  from test.support import load_package_tests

  def load_tests(*args):
      return load_package_tests(os.path.dirname(__file__), *args)
  ```

  Also change `import tomli as tomllib` to `import tomllib`.

- In `cpython/Lib/tomllib/_parser.py` replace `__fp` with `fp` and `__s` with
  `s`. Add the `/` to `load` and `loads` function signatures.

- Run `make regen-stdlib-module-names`

- Create `Doc/library/tomllib.rst` and reference it in `Doc/library/fileformats.rst`
hukkin authored Mar 8, 2022
1 parent 4d95fa1 commit 591f675
Showing 90 changed files with 1,479 additions and 1 deletion.
2 changes: 2 additions & 0 deletions .github/CODEOWNERS
Original file line number Diff line number Diff line change
@@ -141,6 +141,8 @@ Lib/ast.py @isidentical
**/*cgi* @ethanfurman
**/*tarfile* @ethanfurman

**/*tomllib* @encukou

# macOS
/Mac/ @python/macos-team
**/*osx_support* @python/macos-team
1 change: 1 addition & 0 deletions Doc/library/fileformats.rst
@@ -12,6 +12,7 @@ that aren't markup languages and are not related to e-mail.

csv.rst
configparser.rst
tomllib.rst
netrc.rst
xdrlib.rst
plistlib.rst
117 changes: 117 additions & 0 deletions Doc/library/tomllib.rst
@@ -0,0 +1,117 @@
:mod:`tomllib` --- Parse TOML files
===================================

.. module:: tomllib
   :synopsis: Parse TOML files.

.. versionadded:: 3.11

.. moduleauthor:: Taneli Hukkinen
.. sectionauthor:: Taneli Hukkinen

**Source code:** :source:`Lib/tomllib`

--------------

This module provides an interface for parsing TOML (Tom's Obvious Minimal
Language, `https://toml.io <https://toml.io/en/>`_). This module does not
support writing TOML.

.. seealso::

    The `Tomli-W package <https://pypi.org/project/tomli-w/>`__
    is a TOML writer that can be used in conjunction with this module,
    providing a write API familiar to users of the standard library
    :mod:`marshal` and :mod:`pickle` modules.

.. seealso::

    The `TOML Kit package <https://pypi.org/project/tomlkit/>`__
    is a style-preserving TOML library with both read and write capability.
    It is a recommended replacement for this module for editing already
    existing TOML files.


This module defines the following functions:

.. function:: load(fp, /, *, parse_float=float)

   Read a TOML file. The first argument should be a readable and binary file object.
   Return a :class:`dict`. Convert TOML types to Python using this
   :ref:`conversion table <toml-to-py-table>`.

   *parse_float* will be called with the string of every TOML
   float to be decoded. By default, this is equivalent to ``float(num_str)``.
   This can be used to use another datatype or parser for TOML floats
   (e.g. :class:`decimal.Decimal`). The callable must not return a
   :class:`dict` or a :class:`list`, else a :exc:`ValueError` is raised.

   A :exc:`TOMLDecodeError` will be raised on an invalid TOML document.


.. function:: loads(s, /, *, parse_float=float)

   Load TOML from a :class:`str` object. Return a :class:`dict`. Convert TOML
   types to Python using this :ref:`conversion table <toml-to-py-table>`. The
   *parse_float* argument has the same meaning as in :func:`load`.

   A :exc:`TOMLDecodeError` will be raised on an invalid TOML document.


The following exceptions are available:

.. exception:: TOMLDecodeError

   Subclass of :exc:`ValueError`.


Examples
--------

Parsing a TOML file::

    import tomllib

    with open("pyproject.toml", "rb") as f:
        data = tomllib.load(f)

Parsing a TOML string::

    import tomllib

    toml_str = """
    python-version = "3.11.0"
    python-implementation = "CPython"
    """

    data = tomllib.loads(toml_str)


Conversion Table
----------------

.. _toml-to-py-table:

+------------------+--------------------------------------------------------------------------------------+
| TOML             | Python                                                                               |
+==================+======================================================================================+
| table            | dict                                                                                 |
+------------------+--------------------------------------------------------------------------------------+
| string           | str                                                                                  |
+------------------+--------------------------------------------------------------------------------------+
| integer          | int                                                                                  |
+------------------+--------------------------------------------------------------------------------------+
| float            | float (configurable with *parse_float*)                                              |
+------------------+--------------------------------------------------------------------------------------+
| boolean          | bool                                                                                 |
+------------------+--------------------------------------------------------------------------------------+
| offset date-time | datetime.datetime (``tzinfo`` attribute set to an instance of ``datetime.timezone``) |
+------------------+--------------------------------------------------------------------------------------+
| local date-time  | datetime.datetime (``tzinfo`` attribute set to ``None``)                             |
+------------------+--------------------------------------------------------------------------------------+
| local date       | datetime.date                                                                        |
+------------------+--------------------------------------------------------------------------------------+
| local time       | datetime.time                                                                        |
+------------------+--------------------------------------------------------------------------------------+
| array            | list                                                                                 |
+------------------+--------------------------------------------------------------------------------------+
3 changes: 2 additions & 1 deletion Doc/whatsnew/3.11.rst
@@ -214,7 +214,8 @@ Other CPython Implementation Changes
New Modules
===========

* None yet.
* A new module, :mod:`tomllib`, was added for parsing TOML.
  (Contributed by Taneli Hukkinen in :issue:`40059`.)


Improved Modules
15 changes: 15 additions & 0 deletions Lib/test/test_tomllib/__init__.py
@@ -0,0 +1,15 @@
# SPDX-License-Identifier: MIT
# SPDX-FileCopyrightText: 2021 Taneli Hukkinen
# Licensed to PSF under a Contributor Agreement.

__all__ = ("tomllib",)

# By changing this one line, we can run the tests against
# a different module name.
import tomllib

import os
from test.support import load_package_tests

def load_tests(*args):
    return load_package_tests(os.path.dirname(__file__), *args)
6 changes: 6 additions & 0 deletions Lib/test/test_tomllib/__main__.py
@@ -0,0 +1,6 @@
import unittest

from . import load_tests


unittest.main()
120 changes: 120 additions & 0 deletions Lib/test/test_tomllib/burntsushi.py
@@ -0,0 +1,120 @@
# SPDX-License-Identifier: MIT
# SPDX-FileCopyrightText: 2021 Taneli Hukkinen
# Licensed to PSF under a Contributor Agreement.

"""Utilities for tests that are in the "burntsushi" format."""

import datetime
from typing import Any

# Aliases for converting TOML compliance format [1] to BurntSushi format [2]
# [1] https://github.com/toml-lang/compliance/blob/db7c3211fda30ff9ddb10292f4aeda7e2e10abc4/docs/json-encoding.md # noqa: E501
# [2] https://github.com/BurntSushi/toml-test/blob/4634fdf3a6ecd6aaea5f4cdcd98b2733c2694993/README.md # noqa: E501
_aliases = {
    "boolean": "bool",
    "offset datetime": "datetime",
    "local datetime": "datetime-local",
    "local date": "date-local",
    "local time": "time-local",
}


def convert(obj):  # noqa: C901
    if isinstance(obj, str):
        return {"type": "string", "value": obj}
    elif isinstance(obj, bool):
        return {"type": "bool", "value": str(obj).lower()}
    elif isinstance(obj, int):
        return {"type": "integer", "value": str(obj)}
    elif isinstance(obj, float):
        return {"type": "float", "value": _normalize_float_str(str(obj))}
    elif isinstance(obj, datetime.datetime):
        val = _normalize_datetime_str(obj.isoformat())
        if obj.tzinfo:
            return {"type": "datetime", "value": val}
        return {"type": "datetime-local", "value": val}
    elif isinstance(obj, datetime.time):
        return {
            "type": "time-local",
            "value": _normalize_localtime_str(str(obj)),
        }
    elif isinstance(obj, datetime.date):
        return {
            "type": "date-local",
            "value": str(obj),
        }
    elif isinstance(obj, list):
        return [convert(i) for i in obj]
    elif isinstance(obj, dict):
        return {k: convert(v) for k, v in obj.items()}
    raise Exception("unsupported type")


def normalize(obj: Any) -> Any:
    """Normalize test objects.
    This normalizes primitive values (e.g. floats), and also converts from
    TOML compliance format [1] to BurntSushi format [2].
    [1] https://github.com/toml-lang/compliance/blob/db7c3211fda30ff9ddb10292f4aeda7e2e10abc4/docs/json-encoding.md  # noqa: E501
    [2] https://github.com/BurntSushi/toml-test/blob/4634fdf3a6ecd6aaea5f4cdcd98b2733c2694993/README.md  # noqa: E501
    """
    if isinstance(obj, list):
        return [normalize(item) for item in obj]
    if isinstance(obj, dict):
        if "type" in obj and "value" in obj:
            type_ = obj["type"]
            norm_type = _aliases.get(type_, type_)
            value = obj["value"]
            if norm_type == "float":
                norm_value = _normalize_float_str(value)
            elif norm_type in {"datetime", "datetime-local"}:
                norm_value = _normalize_datetime_str(value)
            elif norm_type == "time-local":
                norm_value = _normalize_localtime_str(value)
            else:
                norm_value = value

            if norm_type == "array":
                return [normalize(item) for item in value]
            return {"type": norm_type, "value": norm_value}
        return {k: normalize(v) for k, v in obj.items()}
    raise AssertionError("Burntsushi fixtures should be dicts/lists only")


def _normalize_datetime_str(dt_str: str) -> str:
    if dt_str[-1].lower() == "z":
        dt_str = dt_str[:-1] + "+00:00"

    date = dt_str[:10]
    rest = dt_str[11:]

    if "+" in rest:
        sign = "+"
    elif "-" in rest:
        sign = "-"
    else:
        sign = ""

    if sign:
        time, _, offset = rest.partition(sign)
    else:
        time = rest
        offset = ""

    time = time.rstrip("0") if "." in time else time
    return date + "T" + time + sign + offset


def _normalize_localtime_str(lt_str: str) -> str:
    return lt_str.rstrip("0") if "." in lt_str else lt_str


def _normalize_float_str(float_str: str) -> str:
    as_float = float(float_str)

    # Normalize "-0.0" and "+0.0"
    if as_float == 0:
        return "0"

    return str(as_float)
@@ -0,0 +1 @@
arrr = [true false]
@@ -0,0 +1,4 @@
[[parent-table.arr]]
[parent-table]
not-arr = 1
arr = 2
@@ -0,0 +1,2 @@
a=true
[[a]]
@@ -0,0 +1 @@
a=[1
@@ -0,0 +1 @@
v=[1,
@@ -0,0 +1 @@
v=[
@@ -0,0 +1 @@
"backslash is the last char\
@@ -0,0 +1 @@
val=falsE
@@ -0,0 +1 @@
val=trUe
@@ -0,0 +1 @@
"only 28 or 29 days in february" = 1988-02-30
@@ -0,0 +1,2 @@
a = false
a.b = true
@@ -0,0 +1,3 @@
[[tab.arr]]
[tab]
arr.val1=1
@@ -0,0 +1,4 @@
[a.b.c.d]
z = 9
[a]
b.c.d.k.t = 8
@@ -0,0 +1,4 @@
[a.b.c]
z = 9
[a]
b.c.t = 9
@@ -0,0 +1 @@
arrr = { comma-missing = true valid-toml = false }
@@ -0,0 +1 @@
table1 = { table2.dupe = 1, table2.dupe = 2 }
@@ -0,0 +1 @@
table = { dupe = 1, dupe = 2 }
@@ -0,0 +1 @@
a={b=1
2 changes: 2 additions & 0 deletions Lib/test/test_tomllib/data/invalid/inline-table/mutate.toml
@@ -0,0 +1,2 @@
a = { b = 1 }
a.b = 2
@@ -0,0 +1,5 @@
[tab.nested]
inline-t = { nest = {} }

[tab]
nested.inline-t.nest = 2
@@ -0,0 +1,3 @@
inline-t = { nest = {} }

[[inline-t.nest]]
@@ -0,0 +1,3 @@
inline-t = { nest = {} }

[inline-t.nest]
@@ -0,0 +1 @@
a = { b = 1, b.c = 2 }
@@ -0,0 +1 @@
tab = { inner.table = [{}], inner.table.val = "bad" }
@@ -0,0 +1 @@
tab = { inner = { dog = "best" }, inner.cat = "worst" }
@@ -0,0 +1 @@
a={
@@ -0,0 +1 @@
# form feed ( ) not allowed in comments
@@ -0,0 +1 @@
escaped-unicode = "\uabag"
1 change: 1 addition & 0 deletions Lib/test/test_tomllib/data/invalid/invalid-hex.toml
@@ -0,0 +1 @@
hex = 0xgabba00f1
@@ -0,0 +1 @@
[fwfw.wafw
@@ -0,0 +1 @@
fs.fw
@@ -0,0 +1 @@
why-no-value=
@@ -0,0 +1 @@
fs.
@@ -0,0 +1,2 @@
a=1
[a.b.c.d]