
PEP ???: TOML #1

Closed
wants to merge 23 commits into from

Conversation

hauntsaninja (Owner)

No description provided.

hukkin commented Jan 2, 2022

A general observation: If we don't end up using the toml name then it may be less relevant to discuss all the rejected features of the uiri/toml library in detail. If we do use that name, then yeah it may be worth discussing all the various ways in which we break existing users 😄

hukkin commented Jan 2, 2022

One concern I have regarding the Tomli-W library (whether it goes into the standard library or not): it currently only dumps to writable binary file objects. This is great because it means users cannot use an encoding other than UTF-8, and users of some exotic platforms where os.linesep == "\r" cannot accidentally write invalid newline sequences.

It also means that we always use "\n" as the newline sequence, even on Windows, where "\r\n" is typically used instead. This is great IMO because it is consistent, but I'm not sure it's what users expect. We could:

  • change nothing (if you really want to write "\r\n"s, you can always use dumps and write the file separately with universal newlines enabled)
  • add a keyword argument for using "\r\n"
  • check if os.linesep == "\r\n" and, if yes, use "\r\n", else "\n" (a magical platform-dependent side effect; I'm not a fan of this one)
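A minimal sketch of the workaround in the first option, using only the standard library. The tomli_w.dumps call is stubbed out with a plain string here; the point is that a text stream opened with an explicit newline argument translates the "\n"s on write, the same way open(path, "w", newline="\r\n") would for a real file.

```python
import io

doc = "a = 1\nb = 2\n"  # stands in for the "\n"-joined output of tomli_w.dumps(data)

# A text stream with newline="\r\n" translates every "\n" written to it.
buf = io.StringIO(newline="\r\n")
buf.write(doc)
assert buf.getvalue() == "a = 1\r\nb = 2\r\n"
```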

hauntsaninja (Owner, Author) commented Jan 3, 2022

Yup, makes sense. The draft currently proposes tomllib, but if we change to toml, we'll definitely need to go into all the gory details.

As it stands, I think we just need:
a) Enough discussion of the design space to convince the powers that be that the current choices are well considered and all the due diligence has been done (which is much of what I'm attempting to do in "Rejected Ideas")
b) A quick overview of the incompatibilities that rule out a takeover of the toml name (and linking to your discuss post or somewhere for more details). I'll add TOMLDecodeError to that list.

I don't feel strongly about newline sequences used in the document. I'm happy with no change, and as you say, there's a fine workaround.

Somewhat related: given we always use \n as the newline sequence and normalise all input newlines to \n, one option I was considering is having dumps always escape \r\n and always output a lone \n as a literal newline in multiline strings. This would sit between the C2 and C3 options you mention in hukkin/tomli#141 (comment) and could be a sane default without opening the door to arbitrary amounts of formatting options. Although maybe I'm missing something and this is a bad idea.

hukkin commented Jan 3, 2022

Somewhat related: given we always use \n as the newline sequence and normalise all input newlines to \n, one option I was considering is having dumps always escape \r\n and always output a lone \n as a literal newline in multiline strings. This would sit between the C2 and C3 options you mention in hukkin/tomli#141 (comment) and could be a sane default without opening the door to arbitrary amounts of formatting options. Although maybe I'm missing something and this is a bad idea.

So if, say, we have the following TOML

s = "foo\nbar"

a load/dump roundtrip will then write

s = """
foo
bar"""

Even if this sort of worked when using this particular parser/writer exclusively, at this point TOML semantics have changed, and any other parser is free to translate the newline sequence to "\r\n".

hauntsaninja (Owner, Author) commented Jan 3, 2022

Yeah, "works with the same parser / writer & TOML and Python are a little fast and loose when it comes to carriage return anyway" is a pretty different promise from "always get the bytes you want". So scratch that :-)

hukkin left a comment

Amazing work once again!

Here's a quick drive-by review. I'll try to do a more in-depth one in the future, maybe filling in some of the TODOs.

hauntsaninja and others added 3 commits January 4, 2022 11:54
Co-authored-by: Taneli Hukkinen <3275109+hukkin@users.noreply.github.com>
hauntsaninja (Owner, Author)

Thanks, made the updates!

Co-authored-by: Taneli Hukkinen <3275109+hukkin@users.noreply.github.com>
pep-9999.rst Outdated

The TOML format is the format of choice for Python packaging, as evidenced by
:pep:`517`, :pep:`518` and :pep:`621`. Including TOML support in the standard
library helps avoid bootstrapping problems for Python build tools. Currently

I wonder if we should unfold what the bootstrapping problems are and why vendoring is necessary:

A PEP 517 build requires reading TOML metadata, but the standard library is unable to parse it. How do you build the first TOML parser if building it requires a TOML parser? How do you build a build frontend for building the TOML parser if building it also requires TOML? And so on.

encukou commented Jan 5, 2022

Overall, this looks great! Thank you for writing it up!
I've sent my proposed changes as a PR: #3

encukou and others added 2 commits January 6, 2022 18:30
Co-authored-by: Petr Viktorin <encukou@gmail.com>
Co-authored-by: Taneli Hukkinen <3275109+hukkin@users.noreply.github.com>
pep-9999.rst Outdated
Comment on lines 69 to 70
def load(fp: SupportsRead[bytes], /, *, parse_float: Callable[[str], Any] = float) -> dict[str, Any]: ...
def loads(s: str, /, *, parse_float: Callable[[str], Any] = float) -> dict[str, Any]: ...

Suggested change
def load(fp: SupportsRead[bytes], /, *, parse_float: Callable[[str], Any] = float) -> dict[str, Any]: ...
def loads(s: str, /, *, parse_float: Callable[[str], Any] = float) -> dict[str, Any]: ...
def load(fp: SupportsRead[bytes], /, *, parse_float: Callable[[str], Any] = ...) -> dict[str, Any]: ...
def loads(s: str, /, *, parse_float: Callable[[str], Any] = ...) -> dict[str, Any]: ...

I think the default argument is not relevant, should be considered an implementation detail, and it seems to be a convention to elide it.

Reply:

I actually think it's more useful than the type annotation :)
If removed, the default behavior should be mentioned in the prose (even though it's straightforward).

Reply:

If removed, the default behavior should be mentioned in the prose (even though it's straightforward).

Yeah I think I'd prefer doing this.

The reason why I'd want to consider this a non-annotated implementation detail is that I fear one day we may find out there is some special case where float(<insert-special-TOML-float-here>) does not return the appropriate result, or even raises a ValueError. If that happens, it'd be best to set the default to None and make None correspond to a slightly customized float callable.

Reply:

Ah, that makes sense!

Reply:

Hm, seeing the change to the text... The same argument can be made for the recommendation to use Decimal, can't it?

hukkin commented Jan 7, 2022

Do you mean that decimal.Decimal could in theory fail to parse a TOML float? Yeah agreed. But at least that's only a recommendation, not baked into the API specification. Do you think we should change the wording? E.g. "a callable that returns a decimal.Decimal can be used in cases where precision is important".

hauntsaninja (Owner, Author):

Made this change and adjusted the wording of the decimal.Decimal bit!

hukkin commented Jan 8, 2022

@encukou @hauntsaninja mostly unrelated to this PEP, but since you are both here: I tried to give you both access to Tomli's GitHub and PyPI to address some recent concerns about me getting hit by a bus. 😄 It would be great if at least one of you (optimally both) accepted the invites, but if not, no problem, I'll find someone else.

I have no expectation of you having to do any work whatsoever. Tomli's scope is small, and I enjoy maintaining it and can manage it alone. But if I ever disappear for, say, 6+ months, feel free to take over and start making commits and releases to prevent what happened to uiri/toml.

hauntsaninja (Owner, Author)
Thanks! I hope you don't get hit by a bus 😄

Since we've had a couple rounds of editing, I went ahead and opened the PR against python/peps: python#2218
