String Interpolation #165
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,161 @@ | ||
# String Interpolation | ||
|
||
| Field | Value | | ||
|-----------------|-----------------------------------------------------------------| | ||
| DIP: | (number/id -- assigned by DIP Manager) | | ||
| Review Count: | 0 (edited by DIP Manager) | | ||
| Author: | Walter Bright walter@digitalmars.com | | ||
| Implementation: | (links to implementation PR if any) | | ||
| Status: | Will be set by the DIP manager (e.g. "Approved" or "Rejected") | | ||
|
||
## Abstract | ||
|
||
Instead of a format string followed by an argument list, string interpolation enables | ||
embedding the arguments in the string itself. | ||
|
||
|
||
## Contents | ||
* [Rationale](#rationale) | ||
* [Prior Work](#prior-work) | ||
* [Description](#description) | ||
* [Breaking Changes and Deprecations](#breaking-changes-and-deprecations) | ||
* [Reference](#reference) | ||
* [Copyright & License](#copyright--license) | ||
* [Reviews](#reviews) | ||
|
||
## Rationale | ||
|
||
While the conventional format string followed by the argument list is perfectly fine for | ||
short strings and a small number of arguments, it tends to break down with longer strings | ||
with many arguments. Omitting an argument, having an extra argument, and having a mismatch | ||
between a format specifier and its corresponding argument are common errors. | ||
Embedding the arguments in the format string tends to eliminate these errors. The result is easier | ||
to read and visually easier to review for correctness. | ||
|
||
## Prior Work | ||
|
||
* Interpolated strings have been implemented and well-received in many languages. | ||
For many such examples, see [String Interpolation](https://en.wikipedia.org/wiki/String_interpolation). | ||
* Jason Helson has submitted a DIP [String Syntax for Compile-Time Sequences](https://github.com/dlang/DIPs/pull/140). | ||
* [Adam's string interpolation proposal](http://dpldocs.info/this-week-in-d/Blog.Posted_2019_05_13.html) | ||
|
||
## Description | ||
|
||
``` | ||
writefln(i"I ate %apples and %{d}bananas totalling %(apples + bananas) fruit."); | ||
``` | ||
gets rewritten as: | ||
``` | ||
writefln("I ate %s and %d totalling %s fruit.", apples, bananas, apples + bananas); | ||
``` | ||
It will also work with printf: | ||
|
||
``` | ||
printf(i"I ate %{d}apples and %{d}bananas totalling %{d}(apples + bananas) fruit.\n"); | ||
``` | ||
becomes: | ||
``` | ||
printf("I ate %d and %d totalling %d fruit.\n", apples, bananas, apples + bananas); | ||
``` | ||
|
||
The `{d}` syntax is for when the format specifier needs to be anything other than `s`, | ||
which is the default. What goes between the `{` `}` is not specified so this capability | ||
can work with foreseeable format specification improvements without needing to update | ||
the core language. It also makes interpolated strings agnostic about what the format | ||
specifications are, as long as they start with `%`. | ||
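
To make the default concrete, here is a small sketch in today's D of the call that an interpolated string such as `i"%count items, mask %{x}mask"` would lower to under the rules described here. The helper name `describe` and its parameters are made up for illustration; `format` is the existing `std.format` function.

```d
import std.format : format;

/// Hypothetical helper: the hand-written equivalent of
/// i"%count items, mask %{x}mask" after lowering.
/// `%count` has no braces, so it gets the default `%s`;
/// `%{x}mask` carries its own specifier and becomes `%x`.
string describe(int count, int mask)
{
    return format("%s items, mask %x", count, mask);
}

void main()
{
    assert(describe(3, 255) == "3 items, mask ff");
}
```

Because the text between the braces is copied through unchanged, the same mechanism would apply to any specifier the formatting function understands.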
|
||
|
||
The interpolated string starts as a special string, `InterpolatedString`, which is the same as a | ||
`DoubleQuotedString` but with an `i` prefix and no `StringPostfix`. This appears in the grammar | ||
as an `InterpolatedExpression` which is under `PrimaryExpression`. | ||
|
||
`InterpolatedExpression`s undergo semantic analysis similar to `MixinExpression`. | ||
The string is scanned from left to right, according to the following grammar: | ||
|
||
``` | ||
Elements: | ||
Element | ||
Element Elements | ||
|
||
Element: | ||
Character | ||
'%%' | ||
'%' Argument | ||
'%' FormatString Argument | ||
|
||
FormatString: | ||
'{' FormatString '}' | ||
CharacterNoBraces | ||
|
||
CharacterNoBraces: | ||
CharacterNoBrace | ||
CharacterNoBrace CharacterNoBraces | ||
|
||
CharacterNoBrace: | ||
characters excluding '{' and '}' | ||
|
||
|
||
Argument: | ||
Identifier | ||
Expression | ||
|
||
Expression: | ||
'(' Expression ')' | ||
CharacterNoParens | ||
|
||
CharacterNoParens: | ||
CharacterNoParen | ||
CharacterNoParen CharacterNoParens | ||
|
||
CharacterNoParen: | ||
characters excluding '(' and ')' | ||
``` | ||
|
||
The `InterpolatedExpression` is converted to a tuple expression, where the first element | ||
is the transformed string literal, and the `Argument`s form the rest of the elements. | ||
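
The effect of the tuple expression can be approximated in current D with the library type `std.typecons.Tuple`, whose `.expand` property spreads its fields into an argument list. This is only an analogy: the proposal describes a compiler-level tuple expression, not a `Tuple` struct, and the function name `fruitReport` is invented for the example.

```d
import std.format : format;
import std.typecons : tuple;

/// Hypothetical illustration: the first tuple element is the transformed
/// string literal and the remaining elements are the arguments.
string fruitReport(int apples, int bananas)
{
    auto t = tuple("I ate %s and %d totalling %s fruit.",
                   apples, bananas, apples + bananas);
    // .expand passes the fields as separate arguments, so this is the
    // same call as format("I ate %s ...", apples, bananas, apples + bananas).
    return format(t.expand);
}

void main()
{
    assert(fruitReport(2, 3) == "I ate 2 and 3 totalling 5 fruit.");
}
```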
|
||
The transformed string literal is constructed as follows: | ||
|
||
If the `Element` is: | ||
|
||
* `Character`, it is written to the output string. | ||
* `'%%'`, a '%' is written to the output string. | ||
Review comment: Wouldn't it be better for `%%` to stay as `%%` in the resulting format string? Otherwise one would need to write `%%%%` in the interpolated string to get a `%` in the output of `writef` or `printf`.
Review comment: How would you write a
Review comment: If `%%` becomes `%` during transformation of the interpolated string, the format string will contain an isolated `%`, which is an error in a format string. A double percent in the interpolated string has to stay a double percent in the format string, or else you would have to write four `%` characters.
|
||
* `'%' Argument`, '%s' is written to the output string. | ||
* `'%' '{' FormatString '}' Argument`, '%' followed by the `FormatString` is written to the output string. | ||
|
||
If the `Argument` is an `Identifier` it is inserted in the tuple as an `IdentifierExpression`. | ||
If the `Argument` is an `Expression` it is lexed and parsed (including the surrounding parentheses) | ||
like a `MixinExpression` and inserted in the tuple as an `Expression`. | ||
|
||
Compile-time errors will be generated if the `Elements` do not fit the grammar. | ||
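
The scanning rules above can be sketched as an ordinary D function. This is a rough approximation, not the proposed compiler implementation: the names `Lowered`, `skipBalanced` and `lowerInterpolated` are hypothetical, the arguments are returned as source text rather than parsed expressions, identifiers are assumed to be ASCII, and malformed input raises a run-time exception where the proposal calls for a compile-time error.

```d
import std.ascii : isAlpha, isAlphaNum;
import std.exception : enforce;

/// Hypothetical result type: the format string plus the source text of each argument.
struct Lowered
{
    string format;
    string[] args;
}

/// Returns the index just past the delimiter that balances the one at s[i].
size_t skipBalanced(string s, size_t i, char open, char close)
{
    size_t depth = 0;
    do
    {
        enforce(i < s.length, "unbalanced delimiter in interpolated string");
        if (s[i] == open) ++depth;
        else if (s[i] == close) --depth;
        ++i;
    } while (depth > 0);
    return i;
}

/// Lowers the body of an i"..." literal (without the prefix and quotes).
Lowered lowerInterpolated(string s)
{
    Lowered r;
    size_t i = 0;
    while (i < s.length)
    {
        if (s[i] != '%')
        {
            r.format ~= s[i++];          // Character: copied unchanged
            continue;
        }
        ++i;                             // skip the '%'
        enforce(i < s.length, "'%' at end of interpolated string");
        if (s[i] == '%')
        {
            r.format ~= '%';             // '%%': one '%' is written, as the text specifies
            ++i;
            continue;
        }
        string spec = "s";               // default format specifier
        if (s[i] == '{')
        {
            immutable end = skipBalanced(s, i, '{', '}');
            spec = s[i + 1 .. end - 1];  // text between the braces, copied as-is
            i = end;
            enforce(i < s.length, "format specification without an argument");
        }
        immutable begin = i;
        if (s[i] == '(')
            i = skipBalanced(s, i, '(', ')');   // Expression, parentheses included
        else
        {
            enforce(isAlpha(s[i]) || s[i] == '_', "expected identifier or '('");
            while (i < s.length && (isAlphaNum(s[i]) || s[i] == '_'))
                ++i;
        }
        r.format ~= "%" ~ spec;
        r.args ~= s[begin .. i];
    }
    return r;
}

void main()
{
    auto r = lowerInterpolated(
        "I ate %apples and %{d}bananas totalling %(apples + bananas) fruit.");
    assert(r.format == "I ate %s and %d totalling %s fruit.");
    assert(r.args == ["apples", "bananas", "(apples + bananas)"]);
}
```

Keeping the brace contents opaque mirrors the design choice stated earlier: the scanner never needs to know which specifiers the formatting function accepts.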
|
||
### Limitations | ||
|
||
Interpolated string formats cannot be mixed with conventional elements: | ||
|
||
``` | ||
writefln(i"making %bread using %d ingredients", 6); // error, %d is not a valid element | ||
``` | ||
|
||
Interpolated strings won't work with `*` format specifications that require extra arguments. | ||
This will produce a runtime error with `writefln` and undefined behavior with | ||
`printf`, because the arguments won't line up with the formats. The compiler does not check | ||
the formats for validity. | ||
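
The `*` limitation can be seen with `std.format.format` alone. In the sketch below, `padded` is a hypothetical helper; the second call in `main` is what an interpolated string like `i"%{*d}value"` would lower to under the rules above, with only one argument available for a specification that consumes two.

```d
import std.exception : assertThrown;
import std.format : format;

/// Hypothetical helper: '*' takes the field width from an extra argument
/// that precedes the value being formatted.
string padded(int width, int value)
{
    return format("%*d", width, value);
}

void main()
{
    // Conventional call: width and value are both supplied.
    assert(padded(5, 42) == "   42");

    // Lowered form of i"%{*d}value" with value == 42: the single argument
    // is consumed as the width and nothing is left to format, so format
    // throws at run time.
    assertThrown(format("%*d", 42));
}
```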
|
||
No attempt is made to check that the format specification is compatible with the argument type. | ||
Making such checks would require that detailed knowledge of `printf` and `writef` be hardwired | ||
into the core language, as well as knowledge of which formatting function is being called. | ||
|
||
|
||
## Breaking Changes and Deprecations | ||
|
||
Since the interpolated string is a new token, no existing code is broken. | ||
|
||
## Reference | ||
|
||
## Copyright & License | ||
Copyright (c) 2019 by the D Language Foundation | ||
|
||
Licensed under [Creative Commons Zero 1.0](https://creativecommons.org/publicdomain/zero/1.0/legalcode.txt) | ||
|
||
## Reviews |
Review comment: Should they all be `d`s?