Add syntax for tuples and re-enforce homogenous arrays. #154

mojombo · 2013-03-01T07:44:57Z

Hurray for tuples! This PR also brings back homogenous arrays, which I would prefer, and the addition of tuples solves any decent use cases for mixed types.

(1, "tom", "staff")  # Perhaps some UNIX account info.
(2.48, 9.37, 28.81)  # Or maybe x/y/z coordinates.

BurntSushi · 2013-03-01T14:14:01Z

Awesome. 👍

One clarification I'd like to suggest: are tuples smaller than length 2 allowed? I propose that they not be allowed. Tuples of length 1 carry no more information than a bare value, and tuples of length 0 are a bit weird. (A unit type.)

Also, I think it might be worth adding an invalid example based on the size of tuples:

[ (1, 2), (1, 2, 3) ]    # NOPE

Which shows that tuples have length encoded into their type.

tnm · 2013-03-01T18:36:06Z

I'd probably not allow tuples of length 1 for sanity reasons, although I suspect implementations would resolve the tuples fine.

That said, incidentally, Python (which has first-class tuples) resolves an (apparently) syntactic 1-tuple (which is in fact, not a tuple) like this —

>>> (1,2).__class__
<type 'tuple'>

>>> (1).__class__
<type 'int'>

>>> ("math").__class__
<type 'str'>

rossipedia · 2013-03-01T18:52:56Z

That's because that's not a tuple actually. In Python (1) is a parenthesized expression evaluating to the expression contained in the parentheses. If you want a single element tuple, you need (1,):

>>> (1).__class__
<type 'int'>
>>> (1,).__class__
<type 'tuple'>

There are advantages to allowing single-length tuples, the same as there are for single-length arrays, mainly that in code you can treat the configuration value as a sequence, regardless of how many elements it has.

tnm · 2013-03-01T18:54:14Z

@rossipedia — Aye, that's what I was getting at regarding the sanity reasons (clarified the comment).

rossipedia · 2013-03-01T18:57:27Z

I'm not sure I understand what sanity reasons you mean. I'd think that requiring the type of a configuration element to change based on how many elements it has would make code working with that configuration element more complicated, and that's a Bad Thing™.

tnm · 2013-03-01T19:02:05Z

Hm. Yeah the more I think about it, I suppose I don't see the harm in unit tuples, although I don't really see much usefulness in the config format. We might need to clarify the actual text since we say "They are represented by a comma separated list inside of parentheses."

BurntSushi · 2013-03-01T19:05:21Z

Don't forget about the empty tuple (). Personally, I think it's simpler to just say, "Tuples of length less than 2 are not allowed." But I truthfully don't have a strong opinion either way---a clarification is sufficient IMO.

rossipedia · 2013-03-01T19:15:09Z

Well, since this isn't actually an execution language, we don't need to worry about ( and ) denoting expressions, we can just go ahead and reserve them for tuples, with items separated by commas (with whitespace/eol and trailing comma ala arrays allowed).

+1

pygy · 2013-03-02T23:50:14Z

Also, I think it might be worth adding an invalid example based on the size of tuples:
[ (1, 2), (1, 2, 3) ]    # NOPE
Which shows that tuples have length encoded into their type.

In some languages, the type of a tuple depends not only on the number of elements, but also on their types. For example, in Julia:

julia> isa((1,"e",3.4), (Int64,ASCIIString,Float64))
true

julia> isa((1,"e",3.4), (Any, Any, Any))
true

julia> isa((1,"e",3.4), Tuple)
true

As you can see, it offers some leeway, and it also does for arrays:

julia> [(1,"e",3.4)]
1-element (Int64,ASCIIString,Float64) Array:
 (1,"e",3.4)

julia> ar = Array((Any,Any,Any),0)
0-element (Any,Any,Any) Array

julia> push!(ar,(2,"ER",4.5))
1-element (Any,Any,Any) Array:
 (2,"ER",4.5)

How strict do you want to be regarding type homogeneity?

BurntSushi · 2013-03-03T00:30:51Z

@pygy Indeed. This is clarified in the commit provided by @mojombo :

[ (1, "red"), (2.1, 5.9) ] # NOPE

See #131 for more details. Maybe the spec should be clearer, but I thought the above example was enough.

How strict do you want to be regarding type homogeneity?

The point is to make arrays completely homogeneous with respect to the type of value they contain. The point of adding tuples is to provide a way to create arrays with non-homogeneous data, i.e., an array of (Int, String). The point of doing all of this is to provide a nice way to write well-typed structured data in TOML that is easily handled by a wide variety of languages, particularly of the strong and static variety.

Note that TOML has no explicit Any type, although a pure polymorphic type is implied with the use of [] with arrays as defined in this PR.

But TOML doesn't have to be strict like this for all types. Of particular note are anonymous hashes in #50.

pygy · 2013-03-04T10:31:01Z

Indeed, I had missed that part.

pygy · 2013-03-04T16:18:00Z

Should tuples behave like arrays regarding white space and comments?

BurntSushi · 2013-03-04T16:19:03Z

@pygy Absolutely.

ambv · 2013-03-08T10:18:34Z

Yay for yet another type in the minimal language spec. </sarcasm>

Basically you want homogeneous lists.
Then you realize there are valid use cases for heterogeneous lists.
You add a new type to a minimal and obvious language. Wat?

Don't add tuples, this complicates things. Instead, minimize. Proclaim that homogeneous lists is a thing which application should enforce, not TOML. Parsers can implement it as a "strict" variant, whatever.

Tuples complicate things because you have to decide on all sorts of corner cases which will confuse the users:

1. Is [("a", [1, 2]), ("b", ["c", "d"])] valid?
2. Is [("a", [1, 2]), ("b", [1])] valid?
3. What if a user converts a list to a tuple? Havoc and breakage? E.g. explain to people why (1, 2, 3) is incompatible with [1, 2, 3].
4. Can a tuple hold a tuple? If so, do we check type compatibility recursively or stay flat and ignore what's inside? This is what TOML currently suggests for lists of lists.
I could go on.

And there's this peculiar idea to forbid tuples shorter than 2 elements. This makes auto-generating values tricky and will confuse users. There is also no need for it since parentheses aren't valid in other contexts. And people will hack around it by using lists instead. But they won't be able to since a list is homogeneous. Before long you'll start seeing [[1], ["tom"], ["staff"]] in the wild as a way to circumvent type checks.

I would minimize instead. Keep it simple. And obvious.

pygy · 2013-03-08T10:32:12Z

I agree with @ambv on this one.

Config files are unlikely to be ported as-is from app to app (and thus from language to language), and type strictness will confuse the least technically inclined, and annoy others.

BurntSushi · 2013-03-08T13:18:39Z

Don't add tuples, this complicates things. Instead, minimize. Proclaim that homogeneous lists is a thing which application should enforce, not TOML. Parsers can implement it as a "strict" variant, whatever.

Then those parsers are not compliant with the spec.

Tuples complicate things because you have to decide on all sorts of corner cases which will confuse the users:

None of the things you listed are "corner" cases. 1 is invalid because the types of tuples are different. 2 is valid because the length of a list does not affect its type. I don't understand 3; tuples and arrays are two different kinds of data. The whole point is that they serve two different purposes: arrays for homogeneous data and tuples for heterogeneous data. 4 is clarified in this proposal; tuples are ordered types, which means they are typed by the order of their component types.
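The typing rule described in this reply can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the behavior of any real TOML parser: `toml_type` and `is_well_typed` are hypothetical helpers, Python tuples stand in for the proposed `()` syntax, and Python lists stand in for `[]` arrays.

```python
def toml_type(value):
    """Return a hashable structural type for a parsed value (hypothetical helper)."""
    if isinstance(value, tuple):
        # A tuple is typed by the ordered types of its components,
        # so its length is part of its type.
        return ("tuple",) + tuple(toml_type(item) for item in value)
    if isinstance(value, list):
        # An array is typed by its single element type; length is ignored.
        element_types = {toml_type(item) for item in value}
        if len(element_types) > 1:
            found = ", ".join(sorted(str(t) for t in element_types))
            raise TypeError(f"array mixes element types: {found}")
        # Assumption: an empty array gets a placeholder element type here.
        return ("array", element_types.pop() if element_types else None)
    return type(value).__name__


def is_well_typed(value):
    """True when every array inside value is homogeneous."""
    try:
        toml_type(value)
    except TypeError:
        return False
    return True


print(is_well_typed([(1, 2), (1, 2, 3)]))                    # False: lengths differ
print(is_well_typed([("a", [1, 2]), ("b", ["c", "d"])]))     # False: case 1
print(is_well_typed([("a", [1, 2]), ("b", [1])]))            # True: case 2
```

Because the tuple type is built from its components in order, nested tuples and arrays are checked recursively without any extra rule.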

And there's this peculiar idea to forbid tuples shorter than 2 elements. This makes auto-generating values tricky and will confuse users. There is also no need for it since parentheses aren't valid in other contexts. And people will hack around it by using lists instead.

Tuples shorter than 2 elements don't have to be banned, but I suggested it because they are peculiar things. And there's no reason to hack around such things. Tuples of 1 element have the same utility as just the bare element from the point of view of the type.

Before long you'll start seeing [[1], ["tom"], ["staff"]] in the wild as a way to circumvent type checks.

Did you read this proposal? That is allowed in the spec right now. If this proposal is accepted, then that thing won't be allowed. The whole point of this proposal is to type arrays by the type of value they contain. Similarly for tuples.
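The difference between the two readings of "homogeneous" can be shown with a short sketch. It assumes that the flat check looks only at whether each element is an array, and that the proposed check also compares what the nested arrays contain; `kind`, `deep_type`, and `is_homogeneous` are hypothetical names, not part of any parser.

```python
def kind(value):
    """Flat view: every array has the same kind, whatever it contains."""
    return "array" if isinstance(value, list) else type(value).__name__


def deep_type(value):
    """Recursive view: an array's type includes the types of its elements."""
    if isinstance(value, list):
        return ("array", frozenset(deep_type(item) for item in value))
    return type(value).__name__


def is_homogeneous(array, deep):
    describe = deep_type if deep else kind
    if len({describe(item) for item in array}) > 1:
        return False
    if deep:
        # Nested arrays must be homogeneous as well.
        return all(is_homogeneous(item, True) for item in array if isinstance(item, list))
    return True


wrapped = [[1], ["tom"], ["staff"]]
print(is_homogeneous(wrapped, deep=False))  # True: three arrays
print(is_homogeneous(wrapped, deep=True))   # False: array of int vs array of str
```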

You talk about confusing the user, but such things can be avoided by a parser that gives helpful error messages. A parser that doesn't give helpful error messages will confuse the user regardless of the spec.

Moreover, your comment doesn't really address the problem trying to be solved by this proposal: provide a way to write well-typed structured data and to allow static languages to easily play along. Making static languages use parsers that only support fully homogeneous arrays makes them non-compliant with the spec and strictly less useful, since tuples won't exist as a way to make heterogeneous data. This is consistent with one of the objectives of the spec:

TOML should be easy to parse into data structures in a wide variety of languages.

ambv · 2013-03-08T13:55:57Z

Then those parsers are not compliant with the spec.

TOML is a configuration file format. Your application will be using it to hold domain-specific information. What I'm saying is that array homogeneity is a domain-specific need. A parser might provide a strict option which enables that. Just as you'd validate whether TCP ports are between 1 and 65535. Such validation doesn't make your application non-compliant with the TOML spec.

The whole point of this proposal is to type arrays by the type of value they contain.

Yes, I get that. The confusion I referred to is user-side, precisely because you need to inform non-programmers that there's a difference between [] and (). This is a valid question to address: "My deployment failed because I used the wrong kind of parentheses. Why couldn't you make it so that there's only one kind?"

Moreover, your comment doesn't really address the problem trying to be solved by this proposal: [...] to allow static languages to easily play along.

I must have failed to read the proposal because there's nothing in it that suggests that. It would also help to explicitly name those languages. Do those languages provide a parser for JSON or XML? If a language can specify a tuple, it can also specify a non-typed array.

BurntSushi · 2013-03-08T14:28:53Z

Yes, I get that. The confusion I referred to is user-side, precisely because you need to inform non-programmers that there's a difference between [] and (). This is a valid question to address: "My deployment failed because I used the wrong kind of parentheses. Why couldn't you make it so that there's only one kind?"

I agree that's a valid question. Truthfully, I don't know whether people will be confused by such things or not. As I said, one hopes that if your application needs to support such users, then it will have appropriate error messages.

I must have failed to read the proposal because there's nothing in it that suggests that. It would also help to explicitly name those languages.

Any static and strongly typed language.

Do those languages provide a parser for JSON or XML? If a language can specify a tuple, it can also specify a non-typed array.

Of course they do. Look at the TOML implementation list right now. There are loads of parsers for strong and static languages. I didn't say that static languages can't handle TOML without this proposal; I said it would be easier.

I will also re-emphasize that it is nice to have well-typed structured data in a configuration file. You may claim that this leads to user confusion, but it can also lead to preventing the user from typing malformed data. (This can be provided by the application, as you say, but I think it is a worthy enough goal for it to be included in the spec itself.)

dahu · 2013-03-09T02:39:40Z

I agree with ambv here. Adding tuples is a sign that the original decision of enforcing homogenous arrays was wrong. I agree that this should be left as an application check after the toml file has been parsed. If your static language has difficulty parsing a mixed type array then it will be equally difficult to parse a tuple. This is a non-argument. Reduce the complexity in this minimal data interchange format by losing the tuple idea and allowing heterogenous arrays.

BurntSushi · 2013-03-09T02:55:16Z

If your static language has difficulty parsing a mixed type array then it will be equally difficult to parse a tuple.

Not at all. Static languages can represent tuples as an appropriate type (like, say, any particular construction of a product type), which is typically distinct from an array. It's not a non-argument, because an assumption of homogeneity or heterogeneity buys you stuff in a static language. It allows the programmer to choose an appropriate data type to represent the TOML data. If TOML only provides heterogeneous arrays, then you never get that homogeneity assumption which restricts your choices in a static language.

The idea here is to push those assumptions into the spec. The result does not benefit dynamic languages, but it does not harm them either. (e.g., A dynamic language with heterogeneous arrays could represent arrays and tuples in TOML in precisely the same way.) The result does benefit static languages. It can also benefit the user by catching malformed data. And as ambv pointed out, it can also detract from the user by having both the [] and () syntactic categories.

dahu · 2013-03-09T03:05:07Z

And my argument is, the decision to represent something as an array should be made by the application. Parse it as your most forgiving type, and allow it to be cast out as the more restrictive type on processing. Your language's toml-parser can have convenient methods to make this simple for the app dev. The application says: I want this section of data to be a homogenous array of values, toml-parser. Make it so. The parser slurps up the data permissively and then the non-parsing side of the toml-parser (the renderer... I'm lacking terminology here) re-casts the data as the type requested. My description here is vague and hand-wavy because I haven't fully considered the call interface. However, I still believe it is better to reduce the file-format complexity and make your tools smarter. For languages that natively support mixed type arrays, they have less smarts they need to build into their toml-parsers. I guess you could still implement the type-checking API into all toml-parsers, if you wanted to.

BurntSushi · 2013-03-09T03:09:53Z

My description here is vague and hand-wavy because I haven't fully considered the call interface.

I know how to do what you ask. In fact, I've already done it for the current spec. It works great. (It's a decent demonstration of Go's reflection facilities IMO.)

But this isn't a win-win situation. Allowing the user to enforce homogeneous arrays means they lose out on the ability for mixed data in a well-typed manner. It's all still doable, but like I've said, inconvenient and possibly less safe depending on the language used.

However, I still believe it is better to reduce the file-format complexity and make your tools smarter.

I like simplicity. I've been an advocate for it on this issue tracker. But I also like safety and convenience.

For languages that natively support mixed type arrays, they have less smarts they need to build into their toml-parsers. I guess you could still implement the type-checking API into all toml-parsers, if you wanted to.

The type checking does add a bit more complexity to the parser. But not too much IMO. It took me a couple extra hours to add it in on a separate branch. (But I had these grand plans from the beginning.)

dahu · 2013-03-09T03:23:38Z

You misunderstood one piece of my thinking - I wasn't suggesting that homogeneity be an all or nothing affair. I was suggesting that, per key, the app dev could get the toml-parser to validate that a collection was indeed homogenous and return it in the most efficient data structure (for the language/situation) accordingly.

hand waving ahead:

config = toml.parse(file)
config['ports'].validate_as('int').to_array

or however it is you might achieve that in the real world.
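One way the per-key check might look in runnable Python is sketched here. Everything in it is an assumption made for illustration: `validate_as` is a hypothetical free function rather than a method on a parser result, and a plain dict stands in for the output of a permissive TOML parser.

```python
def validate_as(values, expected_type):
    """Return values as a list if every element has exactly expected_type."""
    for index, value in enumerate(values):
        # bool is a subclass of int in Python, so compare exact types.
        if type(value) is not expected_type:
            raise TypeError(
                f"element {index} is {type(value).__name__}, "
                f"expected {expected_type.__name__}"
            )
    return list(values)


# Stand-in for a permissively parsed config (assumed data).
config = {"ports": [8001, 8001, 8002], "account": [1, "tom", "staff"]}

ports = validate_as(config["ports"], int)
print(ports)  # [8001, 8001, 8002]

try:
    validate_as(config["account"], int)
except TypeError as error:
    print(error)  # element 1 is str, expected int
```

The check runs only for the keys the application asks about, which is the trade-off debated in the replies that follow.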

BurntSushi · 2013-03-09T03:31:26Z

You misunderstood one piece of my thinking - I wasn't suggesting that homogeneity be an all or nothing affair. I was suggesting that, per key, the app dev could get the toml-parser to validate that a collection was indeed homogenous and return it in the most efficient data structure (for the language/situation) accordingly.

I didn't misunderstand. That's precisely how my parser operates. :-)

config['ports'].validate_as('int').to_array

This is exactly the kind of thing I'd classify as inconvenient. Instead of type safety being baked into a parser (one-time effort), the type safety has to be redone in every client use of it.

We are having the classic argument of where safety should live. I'm advocating for pushing some of it into the spec. It also makes working in a static language more convenient.

dahu · 2013-03-09T03:34:32Z

But the problem with pushing it into the spec is that it complicates it and therefore the config files written in it. Those files can be touched by non-coders. My thinking is, leave those files as simple and intuitive as possible. Give your toml-parser an API that makes coercion and validation as simple as possible. Your coder knows about these things and is the right person for owning this responsibility.

BurntSushi · 2013-03-09T03:39:35Z

We are in agreement that adding another syntactic category will add complexity. But it must be evaluated as a trade off. The pros are more safety for all parsers, more safety against users typing malformed data and more convenience in static languages. The cons are more complexity in the spec/parser and user confusion.

dahu · 2013-03-09T03:45:01Z

I really don't see the convenience argument. Surely it's okay for the app dev to explicitly enquire/demand of the toml-parser that a set of keys be homogenous. That homogeneity is an aspect of his specific application. Indeed, it is an aspect that may change over time. Version 2.0 might see the need to make some of those collections heterogenous.

I look at this as a pyramid. Parser writers are generally more fastidious than app devs and they in turn more so than users. Put the responsibility of making the parser flexible, smart and a pleasure to use on the parser writers. Put the responsibility of ensuring type correctness and data validation on the app dev. Let the user blunder along with good error messages to guide them to safety and correctness.

BurntSushi · 2013-03-09T03:50:23Z

I really don't see the convenience argument.

Imagine that TOML had only three data types: hashes, arrays and strings. Do you see how it is convenient to add integers, floats, bools and datetimes?

In the same sense, but not the same magnitude, having real arrays and tuples is more convenient. It pushes type information and safety into the spec, and therefore doesn't require the client to have to verify the types themselves.

I look at this as a pyramid. Parser writers are generally more fastidious than app devs and they in turn more so than users. Put the responsibility of making the parser flexible, smart and a pleasure to use on the parser writers. Put the responsibility of ensuring type correctness and data validation on the app dev. Let the user blunder along with good error messages to guide them to safety and correctness.

But you're missing trade-offs. Taken to the extreme, your only primitive data type in TOML would be a string. TOML includes some types because it moves safety and convenience into the parser.

dahu · 2013-03-09T04:03:54Z

I remain unconvinced by this argument, and you're straw-manning by suggesting those extremes. I was not suggesting that we remove types from the toml spec. I was suggesting that we don't add the tuple type. All up, I favour a single heterogenous array type. I think we've articulated our opinions and perspectives well enough here for now. Let's see how it turns out.

BurntSushi · 2013-03-09T04:50:06Z

and you're straw-manning by suggesting those extremes.

No. I'm not misrepresenting your argument. I'm merely trying to show that the decision is about a trade-off of safety, convenience and complexity, and not one of some pyramid of responsibility. Because invariably, the spec takes responsibility for some types. (Implying that app writers don't have full responsibility of types.)

pygy · 2013-03-09T07:24:58Z

Edit: This was written offline (I'm on the road), and sent before I could check if it was still relevant,... and I don't have the time to read the whole thread right now. Sorry for the noise if it has been covered meanwhile.

Moreover, your comment doesn't really address the problem trying to be solved by this proposal: provide a way to write well-typed structured data and to allow static languages to easily play along.

On the other hand, enforcing type homogeneity in a dynamic language means
that you have to write a whole type checker.

I will also re-emphasize that it is nice to have well-typed structured data in a configuration file. You may claim that this leads to user confusion, but it can also lead to preventing the user from typing malformed data.

The real solution to this kind of problem is a schema validator. Type
strictness still allows fudging the configuration. I've added a lightweight
proposal in #116, assuming no type checking in arrays.

Rather than type strictness, I'd require compliant parsers to support some
form of schema validation (to be defined).

-- Pierre-Yves

dahu · 2013-03-09T09:49:48Z

@pygy, I had considered a schema validator but thought it might be laughed off. I prefer this approach. hetero arrays (and no tuples) with validating schemas (permissive, as described in #116 so that non-specified members are valid by default)

BurntSushi · 2013-03-09T16:52:48Z

@pygy - The type checker is more work, but as I mentioned, I don't think it's much more work. It took me an hour or two to add myself (but I had grand plans from the beginning).

The real solution to this kind of problem is a schema validator. Type
strictness still allows fudging the configuration. I've added a lightweight
proposal in #116, assuming no type checking in arrays.

I think schema validation is a great idea. But as I've mentioned, sometimes safety is worth pushing into the spec. Also, enforcing homogeneity in arrays without tuples is much less expressive (which is why tuples go hand-in-hand with homogeneous arrays).

dahu · 2013-03-10T00:32:45Z

With a schema validator, how about we lose arrays altogether and just have tuples?

ambv · 2013-03-10T10:21:57Z

@dahu, and call them arrays? Voila! ;-)

dahu · 2013-03-10T10:25:30Z

You got the sarcasm there then. ;-)

Ghoughpteighbteau · 2013-03-10T16:45:17Z

Personally I fall on the side of strongly typed languages. I feel the comparison between tuples and arrays to be laughable.

Here are two things to keep in mind. @dahu is right, there is definitely some enforcement that is the application's responsibility. At the same time, these kinds of explicit structures prevent accidental input errors, and I can back this up with a real world example.

StarSector uses JSON to define its ship files. Here's an example:

  "bounds": [
    -60,26,
    -60,-26,
    -14,-31,
    -2,-45,
    40,-46,
    57,-1,
    59,17,
    47,40,
    35,34,
    -49,35
  ],

Before the tools were developed to place these bounds with a GUI, someone was damn fool enough to do this:

  "bounds": [
    -33.0,
    15.0,
    -15.0,
    -2.0,
    3.0,
    43.0,
    43.0,
    9.0,
    -35.0,
    -34.0,
    -18.0,
    -21.0,
    -34.0,
    -14.0,
    -17.0,
    -29.0,
    -40.0,
    -26.0,
    -26.0,
    15.0,
    19.0,
    23.0,
    15.0,
    13.0,
    -2.0,
    -4.0
  ],

Which of course led to the world's first trans-dimensional clam!

For the record, this error went undetected for 8 months.

Two insertion errors blew things up in an undetectable way. So why didn't the author of the ship format enforce something like this?:

  "bounds": [
    [-33.0,15.0],
    [-15.0,-2.0],
    [3.0,43.0],
    [43.0,9.0],
    [-35.0,-34.0],
    [-18.0,-21.0],
    [-34.0,-14.0],
    [-17.0,-29.0],
    [-40.0,-26.0],
    [-26.0,15.0],
    [19.0,23.0],
    [15.0,13.0],
    [-2.0,-4.0]
  ],

because he wrote starsector in java of course! (the tragedy keeps on coming right?) His parser simply didn't provide him an easy way to translate those arrays into simple points. So he did what was simplest for him.

The proposal has its advantages, most notably in how the markup would look:

   bounds = [(-33.0,15.0)
            ,(-15.0,-2.0)
            ,(3.0,43.0)
            ,(43.0,9.0)
            ,(-35.0,-34.0)
            ,(-18.0,-21.0)
            ,(-34.0,-14.0)
            ,(-17.0,-29.0)
            ,(-40.0,-26.0)
            ,(-26.0,15.0)
            ,(19.0,23.0)
            ,(15.0,13.0)
            ,(-2.0,-4.0)
            ]

And how simply it would parse into native structures or POJOs.
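Why a flat coordinate list hides this kind of mistake can be sketched as follows. The data and the helper names (`Point`, `points_from_flat`, `points_from_pairs`) are invented for illustration and are not taken from StarSector; Python tuples stand in for the proposed tuple syntax.

```python
from collections import namedtuple

Point = namedtuple("Point", ["x", "y"])


def points_from_flat(values):
    """Pair up a flat list; only an odd count can be detected."""
    if len(values) % 2:
        raise ValueError("odd number of coordinates")
    return [Point(values[i], values[i + 1]) for i in range(0, len(values), 2)]


def points_from_pairs(pairs):
    """Each entry must hold exactly two values, so a stray value is caught."""
    points = []
    for index, pair in enumerate(pairs):
        if len(pair) != 2:
            raise ValueError(f"entry {index} has {len(pair)} values, expected 2")
        points.append(Point(*pair))
    return points


# Two stray 9.9 values keep the count even, so the flat form is accepted
# and every point after the first is silently shifted.
flat = [0.0, 0.0, 9.9, 4.0, 0.0, 9.9, 4.0, 3.0]
print(len(points_from_flat(flat)))  # 4

# The same stray values inside explicit pairs are rejected.
try:
    points_from_pairs([(0.0, 0.0), (9.9, 4.0, 0.0), (9.9, 4.0, 3.0)])
except ValueError as error:
    print(error)  # entry 1 has 3 values, expected 2
```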

dahu · 2013-03-10T21:45:41Z

Interesting example, @Ghoughpteighbteau :-)

However, nothing here convinces me that we can't just keep a single simple syntax in the toml files, and provide the app dev with tools necessary to get where she wants to be.

So, consider the equivalent data structure:

bounds = [[-33.0,15.0]
,[-15.0,-2.0]
,[3.0,43.0]
,[43.0,9.0]
,[-35.0,-34.0]
,[-18.0,-21.0]
,[-34.0,-14.0]
,[-17.0,-29.0]
,[-40.0,-26.0]
,[-26.0,15.0]
,[19.0,23.0]
,[15.0,13.0]
,[-2.0,-4.0]
]

With an appropriate schema validator that ensured it was an array of 2-value arrays of floats. Or however you want to describe that. Imaginably, the schema might even have specs in it to control the casting... No, I don't like that, on reflection. That would not be as language neutral. Issues of casting should reside in the toml-parser for each language. Dynamic languages may not need any casting at all whereas the strongly typed languages might need a bit of a helping hand. So, to my original thinking:

Let's assume Java (but that is not a strength, so I will hand-wave). The toml-parser there would slurp up (internally) the toml file into an array-of-arrays (or whatever best suits Java as the internal representation - perhaps that's a tuple type?) while checking the validating schema for correctness. Then the app-dev says, give me the bounds as an array of Points (or whatever actual type they should be).

I believe it's the job of the toml-parser and the app-dev to cast the parsed form into the required form. At the file-parsing face, the toml-parser for all implementations would look fairly similar. It's only at the app-dev facing side of the toml-parser that extra casting code would be required for strongly typed languages.
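A rough Python sketch of this two-step idea (validate against a schema, then cast) is given below. The schema notation is invented for this example and is not the format proposed in #116; `conforms`, `BOUNDS_SCHEMA`, and `Point` are hypothetical names.

```python
from collections import namedtuple

Point = namedtuple("Point", ["x", "y"])

# ("array", s): any length, every element matches s.
# ("fixed", s1, s2, ...): exact length, element i matches schema i.
# ("scalar", t): a value whose exact Python type is t.
BOUNDS_SCHEMA = ("array", ("fixed", ("scalar", float), ("scalar", float)))


def conforms(value, schema):
    """Check a permissively parsed value against the toy schema above."""
    tag = schema[0]
    if tag == "array":
        return isinstance(value, list) and all(conforms(v, schema[1]) for v in value)
    if tag == "fixed":
        parts = schema[1:]
        return (
            isinstance(value, list)
            and len(value) == len(parts)
            and all(conforms(v, s) for v, s in zip(value, parts))
        )
    if tag == "scalar":
        return type(value) is schema[1]
    raise ValueError(f"unknown schema tag: {tag}")


bounds = [[-33.0, 15.0], [-15.0, -2.0], [3.0, 43.0]]
if conforms(bounds, BOUNDS_SCHEMA):
    # The cast happens on the application-facing side, after validation.
    points = [Point(x, y) for x, y in bounds]
    print(points[0])  # Point(x=-33.0, y=15.0)

print(conforms([[-33.0, 15.0, -15.0]], BOUNDS_SCHEMA))  # False: wrong length
```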

BurntSushi · 2013-03-10T22:40:46Z

With an appropriate schema validator that ensured it was an array of 2-value arrays of floats.

The utility of schema validators is not in question. Everyone knows that a schema can be used to artificially restrict values based on a number of criteria. Integer/float/datetime ranges, enumerations, array lengths, sum types, etc. The list goes on.

Saying that "it can be pushed into a schema validator" isn't relevant here. What's relevant is the balance we all want to strike between complexity, safety and convenience in the specification.

I believe it's the job of the toml-parser and the app-dev to cast the parsed form into the required form. At the file-parsing face, the toml-parser for all implementations would look fairly similar. It's only at the app-dev facing side of the toml-parser that extra casting code would be required for strongly typed languages.

There's more to the story, since you support a variety of types included in the TOML specification, which means that the client of a TOML parser is not completely responsible for casting types. The parser takes some responsibility from the spec.

At this point, the typing implications of this proposal have been made clear. I believe that further discussion should be based on the trade offs that I and others have described. The ones that I can currently think of are:

1. How much will the user be confused by having both [] and () syntax? (Will users of TOML data be able to distinguish two syntactic categories based on the type of their data?)
2. How much of a burden is it on parser writers to implement the proposal? (A bit more advanced type checker is required to make sure arrays and tuples are well-typed. This is mostly due to the fact this proposal makes arrays and tuples composite types.)
3. How much convenience do clients of TOML parsers get by expecting structured well-typed data? (e.g., Avoiding supremely general data types in static languages, since static languages usually do not support native heterogeneous arrays.)
4. How much benefit do users get by being warned of typing malformed data by definition of TOML as opposed to a particular client deciding to be nice to their user?

Some of these points have already been discussed, and I'm sure there are others I've missed. But I believe an evaluation of these trade offs is the appropriate way to decide whether this proposal should be accepted or not. Therefore, the discussion should be focused on those points. Things like "it can be in the schema validator" are responses to any proposal that adds safety or convenience via types into the specification, including already existing types.

pygy · 2013-03-11T08:56:44Z

I just realized that parsers have to handle mixed content in hashes, so having mixed type arrays isn't much more work.

The point of schema validators is that they let you enforce type strictness, if that's your thing, with much more control than what the TOML spec provides or could provide if this proposal were accepted (because you have domain-specific knowledge), plus a whole bunch of other things, like value constraints.

BurntSushi · 2013-03-11T16:07:57Z

I just realized that parsers have to handle mixed content in hashes, so having mixed type arrays isn't much more work.

They are indeed cumbersome in a static language, but it's worth it. Homogeneous maps put severe restrictions on the ability to express data concisely.

Ghoughpteighbteau · 2013-03-11T20:05:18Z

1. I think there is a percentage that will be confused, but syntax is pretty easy to glean from seeing it used elsewhere. This is only an issue when authoring new configuration files, not editing existing ones.
2. I don't know. I haven't written one, so I'd love for someone else to fill in.
3. Can't really say, myself.
4. Depends on how expressive and explicit it is. The thing about configuration files is that they get manipulated a lot, by non-technical people. Programmers using a TOML parser are likely to give generic errors

(this config file is malformed... somewhere...)

while the toml parsers are likely to give explicit errors accompanied by line numbers

(Unexpected '[' on line 34, are you missing a comma?)

(element 5 '("linear", 50, 57,0)' at line 37 does not match expected pattern (String, Integer, Integer).)
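How a parser could produce a message in that style is sketched below, assuming the expected tuple pattern is already known. `check_elements` and `TYPE_NAMES` are hypothetical, the element index is zero-based, and the line number from the example message is left out because this sketch works on already-parsed values rather than on source text.

```python
# Display names for the scalar types used in the example message.
TYPE_NAMES = {str: "String", int: "Integer", float: "Float", bool: "Boolean"}


def check_elements(elements, pattern):
    """Raise ValueError for the first tuple that does not match pattern."""
    expected = "(" + ", ".join(TYPE_NAMES[t] for t in pattern) + ")"
    for index, element in enumerate(elements):
        actual = tuple(type(item) for item in element)
        if actual != tuple(pattern):
            raise ValueError(
                f"element {index} {element!r} does not match "
                f"expected pattern {expected}"
            )


rows = [("linear", 50, 57), ("linear", 50, 57, 0)]
try:
    check_elements(rows, (str, int, int))
except ValueError as error:
    print(error)
```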

This is a bit off topic, but I think there is an underlying worry that TOML will not be able to express some concepts if arrays are not heterogeneous. I understand that worry, but I think all use-cases for that are covered here: #153

RichardHightower · 2014-04-17T19:02:55Z

#213

I guess I will add my comment here.

Add to table, int, float, date, one more type... tuple.

You have table which is a k/v map.
You have array, in which all items have to be of one type.

Add a list.

(1, "foo", 1979-05-27T07:32:00Z)

Sometimes things make sense in tuples. Disparate types are good for expressing many concepts.

Boon will have the 6th TOML implementation for Java.
I am writing a lot of config files, and I find JSON aggravating.

And I agree YAML has jumped the shark.

Off topic:
Boon will have tuple.
I tend to marshal JSON arrays instead of JSON objects to reduce the footprint of the JSON feed, which matters when you have a 10,000,000 user app.
I can see using toml in places where I normally might use JSON, not just config.

So in short... I agree you need Tuple. I really like that arrays are homogeneous. I also really like that my browser has spell correct because apparently I do not know how to spell homogeneous.

http://rick-hightower.blogspot.com/2014/04/toml-what-if-plist-json-and-windows-ini.html

mojombo · 2014-07-16T00:40:11Z

Closing in favor of Inline Tables. Check them out on #235.

Add syntax for tuples and re-enforce homogenous arrays.

e5e388f

This was referenced Mar 1, 2013

add a new syntactic category: tuples #131

Closed

arrays can mix multiple data types #28

Closed

BurntSushi mentioned this pull request Mar 4, 2013

not a compliant parser uiri/toml#4

Closed

BurntSushi mentioned this pull request Mar 6, 2013

Added a Julia implementation #168

Merged

BurntSushi mentioned this pull request Mar 9, 2013

Added Schema validator specification #116

Open

3 tasks

BurntSushi mentioned this pull request Apr 17, 2014

add one more type a tuple with parens. Pretty please. #213

Closed

wycats mentioned this pull request Jun 24, 2014

Tuples and Inline Tables: A Motivation #219

Closed

mojombo closed this Jul 16, 2014

mojombo deleted the tuples branch February 5, 2018 00:33

rmunn mentioned this pull request Aug 2, 2018

Mixed-type arrays still legal, even though the intent appears to have been to forbid them #553

Closed

workingjubilee mentioned this pull request Sep 11, 2019

Make arrays heterogeneous #665

Closed

yyny mentioned this pull request Feb 17, 2020

Optionally supported implementation-defined values #707

Closed

rbutoi mentioned this pull request May 21, 2020

feat(directory): Add directory substitutions starship/starship#1183

Merged

8 tasks
