Skip to content

Explore supporting option arguments with commas.#7863

Closed
marler8997 wants to merge 1 commit intodlang:masterfrom
marler8997:optionCommas
Closed

Explore supporting option arguments with commas.#7863
marler8997 wants to merge 1 commit intodlang:masterfrom
marler8997:optionCommas

Conversation

@marler8997
Copy link
Contributor

@marler8997 marler8997 commented Feb 10, 2018

An exploration on what it would take to support commas in command-line options that take multiple arguments.

The current proposed syntax is to replace the initial = character with a comma ,, and then seperate additional values with more commas, i.e.

dmd -i=foo -i=bar -i=baz
OR
dmd -i,foo,bar,baz

By having a separate syntax for "comma mode" and "one value mode", there is no need to support "escaping commas" in the case that the option value has a comma.

All the logic is in the OptionRange struct. It takes an option's argument as a null-terminated string and returns a range of null-terminated strings representing each individual argument. So the following code:

dosomething(myoption);

becomes this:

foreach(option; optionRange(myoption))
    dosomething(option);

The OptionRange struct takes a modest amount of code. Each option that uses it will require an additional line of code for the foreach loop header and possibly an extra level of nesting for the loop body.

Runtime overhead when using normal -i= is virtually none. In the "comma case" strdup is called once on the option arguments so that each comma can be replaced by a terminating '\0' so that each individual argument will be null-terminated.

Note that OptionRange can be modified to NOT GUARANTEE null-termination, in which case no allocation is necessary, however, I haven't identified any command-line options that could use this right now.

The current implementation has applied this new optionRange to -I, -L, -J and -i for demonstration, i.e.

dmd -I,foo,bar -L,-a,-b,-c -J,somepath,anotherpath -i,foo,bar

@dlang-bot
Copy link
Contributor

dlang-bot commented Feb 10, 2018

Thanks for your pull request and interest in making D better, @marler8997! We are looking forward to reviewing it, and you should be hearing from a maintainer soon.
Please verify that your PR follows this checklist:

  • My PR is fully covered with tests (you can see the annotated coverage diff directly on GitHub with CodeCov's browser extension
  • My PR is as minimal as possible (smaller, focused PRs are easier to review than big ones)
  • I have provided a detailed rationale explaining my changes
  • New or modified functions have Ddoc comments (with Params: and Returns:)

Please see CONTRIBUTING.md for more information.


If you have addressed all reviews or aren't sure how to proceed, don't hesitate to ping us with a simple comment.

Bugzilla references

Your PR doesn't reference any Bugzilla issue.

If your PR contains non-trivial changes, please reference a Bugzilla issue or create a manual changelog.

@timotheecour
Copy link
Contributor

timotheecour commented Feb 10, 2018

Arguments for supporting commas

  1. other compilers do it to, eg clang++ (-Wl,) and ldc2:
-mattr=<a1,+a2,-a3,...>
-debuglib=<lib1,lib2,...>
-defaultlib=<lib1,lib2,...>
  1. makes copying arguments between tools easier (eg from linker to dmd):
# suppose this is used for linking (eg separate linking vs compilation)
clang++ -Wl,-Ldir,-lbar,-lld args2... 
dmd      -L=-Ldir,-lbar,-lld args... # with this PR: just change `-Wl,` to `-L=`
dmd -L-Ldir -L-lbar -L-ld args... # before PR: each linker argument must be prefixed with -L
  1. grouping arguments together is easier and results in fewer character for humans to type (matters when prototyping on cmd line)

src/dmd/mars.d Outdated
}

// make a copy of args so we can replace commas ',' with '\0'
auto argsCopy = strcpy(null, args);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

strcpy will seg fault here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I meant to do strdup...

@marler8997 marler8997 force-pushed the optionCommas branch 2 times, most recently from 690f0a5 to 8d37bfe Compare February 11, 2018 05:10
@marler8997
Copy link
Contributor Author

marler8997 commented Feb 12, 2018

I find the best way for me to evaluate whether a feature like this is justified is to list the pros/cons. @timotheecour has listed some pros, I'll go ahead and provide a couple cons:

  1. Option Arguments that can have commas

This PR has chosen the comma , character as the delimiter. However, if there is an argument that itself can contain commas, then this would break its current syntax since the comma would now have a new meaning. An analysis of the command-line options that support multiple arguments should be done to determine if this is OK, or if some other approach needs to be taken (i.e. different delimiters/escape mechanisms).

i.e. -i and -version arguments cannot contain commas, but what about -I or -J? Should we support commas in filenames? Maybe filename options should use a different delimiter, or maybe filename options shouldn't support this feature at all. Things to consider.

  1. More code to maintain

This feature doesn't require much code but it's something to consider in the evaluation.

@JinShil
Copy link
Contributor

JinShil commented Feb 22, 2018

I suppose you could implement escaping logic to deal with significance of a , in the command line arguments. For example, two consecutive commas would translate to a single comma significant to the argument, but would not be an argument delimiter.

For example: -L=dir1,"Doe,, John",dir2 would be equivalent to -L=dir1 -L="Doe, John" -L=dir2. I believe this would be extremely rare, but at least we'd have an out if the situation presented itself.

@marler8997
Copy link
Contributor Author

marler8997 commented Feb 22, 2018

Yeah supporting double commas as an escape would work. I don't really like it but I can't think of anything better. I suppose that if we do support commas, we could say that the comma interface is only for "hand-typing" arguments, and that tools should split their arguments into separate options so they don't have to sanitize their arguments to escape commas. Either that or come up with a better idea.

IMO, this is the biggest argument "against" this idea. It breaks backwards compatibility for anything that happened to have commas in their arguments. However, I don't know how often, if ever this happens.

@timotheecour
Copy link
Contributor

How about a leading comma to indicate multi args?
-L,foo,bar
analog to clang -Wl,foo,bar

@marler8997
Copy link
Contributor Author

This "start wth a comma" idea is backwards compatible. This is the best idea I've seen so far.

@marler8997
Copy link
Contributor Author

Ok, I've updated the PR and the description to use -Option,value,value,value,... instead of -Option=value,value,value.

@marler8997 marler8997 force-pushed the optionCommas branch 4 times, most recently from 7666a23 to a57cfe9 Compare February 22, 2018 22:07
@JinShil
Copy link
Contributor

JinShil commented Mar 4, 2018

It is my understanding that @WalterBright is the keeper of the command line interface. I'm marking this as "Need Approval" and assigning it to him to invoke an up or down decision on this. I, personally, am indifferent.

@WalterBright
Copy link
Member

I generally oppose this, for reasons pointed out previously and outlined in the n.g.:

Implementation is trivial

It's not trivial. First, it's 100 lines of code (with no comments) just for that, and there are templates and behaviors with memory allocation. Moving along that path, we're gradually reinventing std.getopt which is 1814 lines (with comments).

Generally, people will be driving dmd with a makefile, dmd.conf, or other response file. It's complicated enough already, and gets constantly worse. Once in, we're stuck with it forever. I just don't feel it is worthwhile spending time on this.

@marler8997
Copy link
Contributor Author

It's not trivial. First, it's 100 lines of code (with no comments) just for that, and there are templates and behaviors with memory allocation. Moving along that path, we're gradually reinventing std.getopt which is 1814 lines (with comments).

These are very weak arguments that are not being applied anywhere else in the compiler. It seems like you are "doubling-down" on your original reason for rejecting the idea instead of admitting the implementation is actually very simple.

Generally, people will be driving dmd with a makefile, dmd.conf, or other response file. It's complicated enough already, and gets constantly worse. Once in, we're stuck with it forever. I just don't feel it is worthwhile spending time on this.

This took me 30 minutes to implement, please stop using "time" and "complexity" as an argument against this when it's already been demonstrated these are not issues here.

The argument I do understand is that the feature is not "useful". Reading between the lines of your response, this seems the most likely reason for you rejecting the idea, in which case I accept your rejection since I'm not sure myself how useful it really is. I can personally say that for me it would be "somewhat useful", but definitely not necessary. Closing.

@marler8997 marler8997 closed this Mar 4, 2018
@WalterBright
Copy link
Member

I understand you're annoyed at my response. I probably would be, too. But I did want to stop this before too much time was invested in it. I'm glad it's only 30 minutes. Your time is valuable. I also value that you took the initiative to do this sort of thing.

This took me 30 minutes to implement, please stop using "time" and "complexity" as an argument against this when it's already been demonstrated these are not issues here.

That's only the start of the time it takes. The code needs work, meaning it'll take time to review it and write things up, more time for you to implement it, back again for review. Once in, it consumes the time of everyone who maintains it. It consumes time needed to document this functionality in the changelog and the web site. As the old joke goes, the last 10% takes 90% of the time.

Meanwhile, we have a huge backlog of PRs needing review and bugzilla issues that are blocking people.

@marler8997
Copy link
Contributor Author

marler8997 commented Mar 4, 2018

Hey, it only took me a half hour because I'm using a brilliant language!

I think the reason I got annoyed is that I was under the impression that the main argument against this change was that it would be too complex. This argument seemed weak to me so I put in work to demonstrate that it wasn't complex thinking this would tip the scale. However, I think now it's clear this wasn't the main concern that needed to be addressed making it a moot point and a waste of time.

This is a common pattern I've seen in technical debate. People will push the discussion towards their opponent's weak arguments (for obvious reasons). What's odd is that I find people will strongly defend their own weak arguments, not because they're "defendable" but because those are the ONLY ones being attacked. This makes each side look absurd to the other when in reality there are strong arguments for each side.

Rejecting ideas as soon as possible saves everyone time. However, rejecting an idea with weak arguments ends up creating unhelpful debate. Even if you provide both the strong and weak arguments, the opposition will always focus on the weak ones so most times its better not to include them. Intuitively we think it makes our case stronger, but in a debate it's just more surface area for the opposition to attack.

NOTE: weak arguments are good to provide when there is no one else to argue against you. Great for things like documentaries where only one side is presented ;)

I've had many a times where I know something is not a good idea but I fail to clearly state the main reasons which creates unnecessary debate and dissension. Sometimes you just have a "gut feeling" about something that is probably based on valid reasons put together by your subconscious, but it's not always easy to identify those important reasons. We're humans who have to use heuristics when it comes to making decisions since we can't keep track of every single experience/argument in our heads all at once.

If there's some value to salvage from this, I think it's that it helps to take some time to identify the main reasons behind our decisions so we waste less time debating over the less important things. If we can't identify good reasons for a decision, or it will take too much time and effort to do so, it's better to leave out the minor reasons so you don't encourage people to waste their time/effort to argue against points that don't matter in the end.

I've include some examples of what I believe to be good general canned responses to ideas to shut down ideas without creating unhelpful debate:

I see some value in this idea but not enough to warrant the cost to maintain it

This is an interesting idea, however, I don't think it has enough value. We could spend time debating the pros and cons but I'm fairly confident from my initial impression that it won't come out on top. I'm not saying it's a bad idea but I do think our time would be better spent elsewhere.

@WalterBright
Copy link
Member

Some good ideas. Thanks!

@MartinNowak
Copy link
Member

Well, add phobos and use splitter(','), just trolling ;).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants