Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

I think TOML should support file inclusion… #81

Closed
wants to merge 2 commits into from
Closed

I think TOML should support file inclusion… #81

wants to merge 2 commits into from

Conversation

xorgy
Copy link

@xorgy xorgy commented Feb 24, 2013

…but I'm not sure how best to format the statements to maximize awesomeness and minimize parser complexity.

Any ideas?

@davidcelis
Copy link

This doesn't seem minimal to me

@xorgy
Copy link
Author

xorgy commented Feb 24, 2013

It's a fairly simple thing to add, although it would compromise the "minimal" factor to some extent, it is essential for config files.

@DouweM
Copy link

DouweM commented Feb 24, 2013

I like the idea. In line with my proposal for references (#77), what do you think of this syntax?

[poultry]
@poultry.toml

That way we could override or add specific keys, plus it'd work alongside references:

## included.toml
key1 = "value1"
## main.toml
[referenced]
key2 = "value2"

[main]
@included.toml
&referenced
key3 = "value3"

This would result in the main hash having keys key1, key2 and key3.

@amadawn
Copy link

amadawn commented Feb 24, 2013

+1 for the idea. What about using "%include filename" (as in mercurial's ini-like configuration files)?

@ocxo
Copy link

ocxo commented Feb 24, 2013

I like templating in theory but I'd like to see this feature tabled for the 1.0 spec. Maybe this is something to reopen later but I think it adds unneeded complication to the structure.

@xorgy
Copy link
Author

xorgy commented Feb 24, 2013

The @poultry.toml seems decent, especially if the &reference stuff is actually being rolled in with that format.

Also, inclusion is arguably easier to support than a lot of things, it simply involves calling the parser recursively on the included file, and putting the structure in where the inclusion statement was.

@xorgy
Copy link
Author

xorgy commented Feb 24, 2013

Of course, I'm not trying to be noisy, I realize that there are almost as many forks on mojombo/toml as there are commits, feel free to ignore this until you're looking for trouble.

@liuggio
Copy link

liuggio commented Feb 25, 2013

+1 for @ as import
+1 for & as reference

@mrflip
Copy link

mrflip commented Feb 25, 2013

Agree with @davidcelis, this doesn't seem minimal.

  • &referenced is exactly what is most dangerous about YAML, and what causes the most parser headaches.
  • Inclusion means my config files have raw filesystem access in the control path. Config files should be dumb cargo.
  • Inclusion also presents much ambiguity. What if the file was read over the network? Is '..' supported? (hope not.) What do I do on Windows?
  • The proposals you showed have the inclusion in a keygroup, which also opens up room for ambiguity.

-1 to any notion of reference, and to any notion of file inclusion

@liuggio
Copy link

liuggio commented Feb 25, 2013

@mrflip I don't agree
risk vs benefit?
IMHO import is quite important and reference is something that TOML must have,
duplication is not a good practice, and definition of parameters in the correct place is another good practice.

in a real world example (I used %% for referece another key)

[statsd]
hostname = "127.0.0.1"
port = 123

[logger_manager]
argument = [%statsd%, "another arg here"]

[event_manager]
argument = [%logger_manager%]

@ambv
Copy link
Contributor

ambv commented Feb 25, 2013

@mrflip I disagree as well. Without file inclusion the format will have limited for real world structured configuration.

Imagine a web server configured with TOML without being able to say: this file is the default configuration file (managed by your operating system), your customizations should be put in a separate directory.

Inclusion could be done on the application-side too but ISTM that the whole point of TOML is specifying a minimal robust format that's reusable for a large number of use cases.

That being said, inclusion should also:

  • let me specify wildcards as well
  • let me update the default hash too

Example nginx.toml based on actual configuration:

user = "www-data"
worker_processes = 4
pid = "/var/run/nginx.pid"

[events]          
worker_connections = 768
multi_accept = true

@/etc/nginx/conf.d/*.conf
@/etc/nginx/sites-enabled/*

Note that those two includes should actually affect the main hash context so that the user can override default configuration.

As for sites-enabled/, a file could look like

[server.lukasz_langa_pl]
server_name = "lukasz.langa.pl"
server_tokens = off

[server.lukasz_langa_pl.root]
@/etc/nginx/proxy_params
location = "/"
proxy_redirect = false
proxy_pass = "http://langapl"

These kinds of use cases are real and common for server-side software.

On a completely unrelated note, on Windows we'll either:

@ghost
Copy link

ghost commented Feb 25, 2013

From a security perspective, need to be very careful with this one - this takes a simple file and turns it into a potential attack vector. Really don't want to see something like this working in the wild:

[test]
@/etc/passwd

@xorgy
Copy link
Author

xorgy commented Feb 25, 2013

@adamcaudill If you don't trust the people writing your configs, don't you have bigger security issues to worry about?

Why not just cat /etc/passwd, rather than trying to overflow it into some config variable that will be read by some network process?

Also, there's nothing sensitive in /etc/passwd, unless you're running something straight out of the eighties.

@ocxo
Copy link

ocxo commented Feb 25, 2013

Reading arbitrary files is a real concern. The potential for exploit outweighs the benefits in my mind.

@ghost
Copy link

ghost commented Feb 25, 2013

@aaronhamilton I don't trust anyone :)

I wouldn't assume that TOML will only be used from trusted sources - that may be the intention, but we all know people will find other uses. File inclusion is asking for security issues - and it's something that will be easy for implementers to get wrong. It's a small feature that adds a lot of attack surface.

@tnm
Copy link
Contributor

tnm commented Feb 25, 2013

I'm against file inclusion — it is just begging for security vulnerabilities in implementations. The TOML spec should be aggressively designed to minimize any attack vectors.

@xorgy
Copy link
Author

xorgy commented Feb 25, 2013

I don't see any real security implications, granted you may want to have a mechanism for your library or integrated implementation to limit the scope of inclusion(I have this handled by mandatory access control anyway).

Leaving out essentials for the sake of simplicity or cowardice will limit the adoption of the standard.

@ocxo
Copy link

ocxo commented Feb 25, 2013

:trollface:

@ghost
Copy link

ghost commented Feb 25, 2013

@aaronhamilton What you call cowardice, I call wisdom. Adding features that increase attack surface should be done only after ensuring adequate mitigations are in place; in a spec, great care should be taken to lead implementers to a secure path - this does not. Blinding adding features without a full understanding of the implications is not wise.

I'm paid to find and fix security issues, this is what I do. I can promise you, implementers will get the security controls of this feature wrong. When that happens, systems will be compromised.

I'm not saying that it's impossible to add something like this in a secure form - just that people will get it wrong.

@xorgy
Copy link
Author

xorgy commented Feb 25, 2013

@adamcaudill Assuming file inclusion is a necessity, what steps could be taken to mitigate security risk without introducing even more undue complexity?

@jprichardson
Copy link

At first I was really for this. Then I thought more about it more and thought for my JSON config files that I have never needed to include another file. The only time that I've ever needed to include another file is in some Redis config files.

But there is no reason the application couldn't provide that functionality of including another file.

[data.settings]
somedata = "@myfile.toml" #application responsible for loading other TOML file, not the TOML library

Let's keep this simple. I'm against it.

@davidcelis
Copy link

@aaronhamilton But is this really a necessity? I don't think I've ever had to include other files into my JSON or YAML configuration files. Is this really common practice?

@xorgy
Copy link
Author

xorgy commented Feb 26, 2013

@davidcelis Yes, unfortunately there are a lot of circumstances under which inclusion is a major asset, just look at a nanorc, a postfix configuration, or any number of other places.

I'd rather not have to run a build system on my configs.

Also, it may be pertinent to note that in an include statement, only toml files may be included, furthermore those toml files need to be in the permissions of the parsing process, this doesn't seem like such a big deal.

@millermedeiros
Copy link

I'm against includes. It can be done by another layer of abstraction and won't work on all environments (eg. loading config files with Ajax). I'd rather have something that works everywhere (like JSON).

@xorgy
Copy link
Author

xorgy commented Feb 26, 2013

Perhaps I misunderstood, TOML is a configuration language, a language to be written and read by humans for the express purpose of configuring software. If we needed a data interchange format, we would use JSON.

Why is TOML being compared to JSON as though it even makes sense to parse your configurations directly in a browser(why not just send an IV in JSON)?

Configuration «files» exist on filesystems.

@BurntSushi
Copy link
Member

@aaronhamilton Your characterization lacks one important trait: minimalism.

File inclusion is not minimal. It necessitates a larger spec:

  • What path separators are supported?
  • What are the scoping rules of data included?
  • Do keys in included files get to override keys in the parent file?

Some of these questions have obvious answers, but I submit that some of them don't. All them need to be addressed in the spec, which increases the complexity of TOML and detracts from its minimalism.

@xorgy
Copy link
Author

xorgy commented Feb 26, 2013

It's very simple,

You open the file with the path selected on the line beginning with the @, you run the parser on it and if no errors are present you put a map in its place.

In a simple implementation, this is done with three lines of code.

  • I suppose it would support the path separators supported by the host's implementation of the open() syscall or whatever method is being used to address the files(This is perhaps a showstopper if people don't like undefined behaviour, or want to use include statements but don't know all the platforms for which their specific config file will be used, something to discuss).
  • The scope should be the current scope, anything defined in the included file is included as though it were inline(anything wrong with this?)
  • If keys are overridden by the included file, and the inclusion statement is after their definition in the host file, then it acts like any other definition, the latter definition will override the current one.

I don't think this is sufficiently complex that people should worry, the idea is that the inclusion acts like any other assignment operation.

@BurntSushi
Copy link
Member

@aaronhamilton

The path separators supported by the host's implementation of the open() syscall or whatever method is being used to address the files.

I don't think the spec should be defined by operating system specific separators. Otherwise, a TOML file that works on one OS will fail spectacularly on another.

If keys are overridden by the included file, and the inclusion statement is after their definition in the host file, then it acts like any other definition, the latter definition will override the current one.

As per the spec: Be careful not to overwrite previous keys. That's dumb. And should produce an error.

If included files can override keys, then that's divergent from the current spec.

@millermedeiros
Copy link

@aaronhamilton I write mostly front-end applications, and I do require a lot of configuration for my apps. Not requiring a build step during development is a huge pro for me. If you want includes you can just use some other kind of file that implements this feature, or create your own pre-processor that does the inclusion. Including other files is beyond the scope of a simple configuration file and will reduce portability of parsers (different VMs/OS will require different APIs).

@xorgy
Copy link
Author

xorgy commented Feb 26, 2013

Perhaps it's best to keep everything lowest-common-denominator so that nobody complains about the format not implementing a feature in the way their system usually does.

It's just rather smelly that I have to maintain some nonstandard version just to implement a feature I would expect in any configuration format.

@xorgy
Copy link
Author

xorgy commented Feb 26, 2013

@BurntSushi Thanks for that last bit, I'd forgotten that bit.
After wrestling with default configs a lot, I personally think overwriting values should be tolerated... but I guess that's not the style here.

This gets tougher each time somebody jabs at it.

@BurntSushi
Copy link
Member

After wrestling with default configs a lot, I personally think overwriting values should be tolerated... but I guess that's not the style here.

This also conflicts with minimalism. Overwriting values might be reasonable for keys that aren't keygroups, but overwriting a key with a keygroup or vice versa is almost never what you want. The spec could handle each case independently, but you know where that leads. :-)

Not being able to overwrite keys prevents a configuration style that naturally supports inheritance, but it also prevents the user from from making a bonehead move and not knowing about it.

@xorgy
Copy link
Author

xorgy commented Feb 26, 2013

I guess it makes too much sense to me, as a programmer, that you might overwrite a key with a new hash...
Perhaps there could be some sort of exception mechanism in an implementation which would give a host program a chance to react to said boneheadedness..

@mojombo
Copy link
Member

mojombo commented Feb 26, 2013

I think file inclusion is outside the scope of TOML (from a minimalism standpoint). It's also a huge security risk, and it would be nice for users of TOML to know that they're not going to get anything but the TOML file they've requested.

As an alternative solution, I'd suggest that applications that need more robust config handling implement it themselves with a pattern like:

include = [ "a.toml", "b.toml" ]

At this point, the application knows that include files will be specified with that key. It can then use whatever precautions are warranted (restriction to specific directories, etc) to go fetch those TOML files and read them in. In this case, you also have the benefit of handling merging of those config files, so you could either allow or disallow overwriting of keys as fits your use case.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.