[SUGGESTION] String interpolation in Syntax 2 #159
Comments
I do agree that
Additionally, I think that string formatting should be opt-in, for example via a string literal prefix. Such as
IMO, string literals should always be a compile-time constant, and never a dynamic type by default. As such, formatting should be opt-in to indicate the extra cost of formatting, and to indicate that the resulting literal types are different. |
The opt-in (e.g., |
Thanks! I'm open to changing this syntax, but note that it's essential not to look at string interpolation in isolation. I view it as one place we do "capture" which to me means "take an expression in context and store a copy of its value for use later": I was writing such a long answer that I turned it into a Design note: Capture. I'll keep this open for now because of the suggestion of having a string literal prefix to opt into interpolation. However, I think that suggestion is also related to raw string literals, which are currently an opt-out in the other direction (strings already allow special handling and if you want to disable it you opt out by saying "no, instead of the default I want a non-preprocessed raw string here" whereas this suggestion is to do the opposite for interpolation and opt into enabling it by saying "no, instead of the default I want an interpolated non-raw string here"). |
This seems a little weird to me, but I'm having a hard time describing exactly why that is. I guess it's because of years of history for us veterans that normal C++1 string literals just by default have compile-time processing for some escape sequences, simple termination at the first ", and no embedded newlines. It's thus an opt-in with extra syntax to get a more complicated user-defined termination condition, and embedded newlines. It's a bit of an effort for us to view it as "everything from the start sequence to the end sequence is raw". My thoughts right now are that one way to view the opt-in request for variable interpretation in string literals is that it should be "don't pay for what you don't use", and thus you need to opt into the extra expense of runtime string interpretation. I'm not sure which method is best for that. Perhaps it's just a simple conversion of the C# syntax to a postfix syntax: On the other hand, maybe selecting string literal behavior is one place where a prefix is required because it affects parsing. If you need a
The |
IMO, Cpp2 should still follow the "don't pay for what you don't use" philosophy. To me, it does not really make sense to make an expensive, non compile-time, less used option - the default, and the more used, compile-time, zero-cost - the opt-in. Why are we forcing the user to opt-out of a feature to get the more convenient and non-allocating, non-copying version? Additionally, this would require the user to worry about the lifetime of a literal (i.e. you cannot treat literals as static constants), making it hard to use with old compile-time APIs that expect If we want to prevent the use of As for |
I've been mulling this feedback over, and here's where I landed as a path to pursue for now:
Right, and the current implementation does follow the zero-overhead "don't pay for what you don't use" principle: If you don't write an interpolation, there is no overhead. That is, The main advantage I can see for allowing a prefix or suffix outside the string to enable interpolation is to make it a bit more visible, but I do think the And I worry that if we allow interpolation only inside a I'll keep thinking about this, and I'll especially be on the lookout for experience with the current design as I and others write more code with it. As always with experiments, the decision to pursue a particular design path is always tentative, to see where it leads and be open to new information discovered by trying it out; if the path turns out to lead to issues, we backtrack with that new information and try a different branch in the design space tree walk (i.e., the design space tree walk is typically depth-first). For now, I'll keep pressing down this path to see where it leads as a reasonable direction to experiment with, so I'll close this for now and we can reopen it when there's new data. Thanks for understanding, and again for the input! |
There is a minor overhead on the compile side of looking for replacements in the string. That's probably small enough not to worry about, but I guess we'll see.
I'm still trying to wrap my head around differences in the types between "This is a string with no replacements." and "This is a string with (replacements)$". The first is a compile time constant, and the second is not. I'm not sure if the contents of the string itself is enough of an indicator of the type and performance difference. I guess we'll see as the project goes along. |
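To make the type difference in question concrete, here is a minimal C++ sketch. It does not show what cppfront emits: `interpolated` is a hypothetical hand-written stand-in for a string with a replacement, assumed here to be built by concatenation into a `std::string`.

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <type_traits>

// A plain literal is an array of const char with static storage duration,
// so it can be used in constant expressions without any allocation.
static_assert(std::is_array_v<std::remove_reference_t<decltype("no replacements")>>);
constexpr std::string_view plain = "This is a string with no replacements.";
static_assert(!plain.empty());

// Hypothetical stand-in for an interpolated string such as "count = (count)$".
// The result is assembled at run time and owned by a std::string.
std::string interpolated(int count) {
    return "count = " + std::to_string(count);
}
static_assert(std::is_same_v<decltype(interpolated(0)), std::string>);
```

Under these assumptions, `interpolated(42)` yields `"count = 42"`, while `plain` never leaves the read-only data of the program. The two expressions look alike at the call site but differ in type, ownership, and cost.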
That reminds me of something I was thinking but forgot to write down: We already have a similar situation with capture in C++ today with lambdas, where a no-capture lambda Does that help? |
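For readers less familiar with the lambda situation being compared to, a small sketch in today's C++ follows. The names `fn_ptr` and `lambda_demo` are made up for illustration. The language rule it relies on is standard: only a lambda with no captures converts to a plain function pointer.

```cpp
#include <cassert>
#include <type_traits>

using fn_ptr = int (*)(int);

int lambda_demo() {
    int offset = 10;
    auto no_capture   = [](int v) { return v + 1; };
    auto with_capture = [offset](int v) { return v + offset; };

    // Whether anything is captured changes what the closure can convert to.
    static_assert(std::is_convertible_v<decltype(no_capture), fn_ptr>);
    static_assert(!std::is_convertible_v<decltype(with_capture), fn_ptr>);

    fn_ptr p = no_capture;       // fine: no state to carry around
    // fn_ptr q = with_capture;  // would not compile: the closure has state
    return p(1) + with_capture(1);  // 2 + 11
}
```

The analogy drawn in the thread is that a string with no interpolation behaves like `no_capture`, and a string with an interpolation behaves like `with_capture`.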
It's simply that whether or not a "string literal" is a compile-time constant string or a runtime-assembled variable string is based on which characters are used in the string, which has never been the case in the past. It's not necessarily bad, but it's definitely different.
It's definitely a pain point based on questions I've seen about how to make it work. However, it's not a security or correctness or even really an unexpected error pain point. Trying to assign a capturing lambda to a function pointer is an error that will be caught every time. I think as I've been writing this I've narrowed down my primary concerns in this area to ownership and performance issues. As long as these runtime generated strings can only be assigned to something that owns them and is going to destroy them properly when they go out of scope, such as a std::string, then it's not a safety issue, and just perhaps a performance issue. As long as both
FYI, I noticed while compiling this that currently
|
@hsutter I think that it isn't an issue with lambdas since the captures are placed at the start (i.e. they are a prefix), and as such are easy to see, so you do not need to examine the body of a lambda to see if it captures anything. Examining strings for interpolation syntax may require more effort than that, especially for lengthy strings. Considering that parentheses are fairly common in text, confusing But I do agree, it would be better to experiment with both the implicit and the explicit approaches to see if one is better than the other. If we do pursue the implicit approach though, I would advocate to look at the |
…comment in #159 Don't add `+ ""` or `"" + ` when interpolations are at the very beginning or end of the string, or adjacent.
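The commit note above suggests that an interpolated literal is lowered to a concatenation of literal pieces and converted values. The following is only a rough sketch of that idea, not cppfront's actual code: `lower_interpolation` and the emitted `to_string(...)` name are assumptions for illustration, and the scanner ignores parentheses that appear inside nested string literals.

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <vector>

// Turn the body of a literal (the text between the quotes) into the text of
// a C++ concatenation expression, skipping empty literal pieces.
std::string lower_interpolation(std::string_view body) {
    std::vector<std::string> parts;
    std::size_t literal_start = 0;
    std::size_t search_from = 0;

    while (true) {
        std::size_t close = body.find(")$", search_from);
        if (close == std::string_view::npos) break;

        // Walk backwards from ')' to find its matching '('.
        int depth = 0;
        std::size_t open = std::string_view::npos;
        for (std::size_t i = close + 1; i-- > literal_start;) {
            if (body[i] == ')') ++depth;
            else if (body[i] == '(' && --depth == 0) { open = i; break; }
        }
        if (open == std::string_view::npos) {  // unbalanced: treat as plain text
            search_from = close + 2;
            continue;
        }

        std::string_view literal = body.substr(literal_start, open - literal_start);
        if (!literal.empty()) parts.push_back('"' + std::string(literal) + '"');
        parts.push_back("to_string(" + std::string(body.substr(open + 1, close - open - 1)) + ")");
        literal_start = search_from = close + 2;
    }

    std::string_view tail = body.substr(literal_start);
    if (!tail.empty() || parts.empty()) parts.push_back('"' + std::string(tail) + '"');

    std::string result;
    for (std::size_t i = 0; i < parts.size(); ++i) {
        if (i != 0) result += " + ";
        result += parts[i];
    }
    return result;
}
```

With this sketch, the body `x = (x)$` becomes `"x = " + to_string(x)` with no trailing `+ ""`, and two adjacent interpolations such as `(a)$(b)$` become `to_string(a) + to_string(b)` with no empty literal in between.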
@gregmarr Yes, I'd long noticed the @switch-blade-stuff Understood, but again when considering an alternate interpolation syntax, just remind people to also look at the other three places where capture happens (expression-scope functions, postconditions, and in the future source generation) and make sure it works well for all of them, not just strings. :) String interpolation shouldn't be that special, I think. Thanks! |
You're welcome. However, you need some whitespace before your
results in
which will end up being replaced to
I had an interesting thought. What if an interpolated string was a shortcut for
Taking some examples from cppreference
|
This is the same idea I had earlier, and imo it would be a good thing to use |
So it is. Maybe that's why I had been thinking about it. :) |
Fixed, thanks! |
@switch-blade-stuff Hmm, as long as the format-spec was a suffix and only legal in a string interpolation (not other captures, and this way capture would still be the same everywhere just a subset would be legal in the other contexts) that could be interesting. I'll put it in the queue of future things to look at. Thanks! |
Apologies if this is obvious, but is string interpolation even capturing? My understanding is that string interpolation occurs at the definition, not at some later point, so no "capture" should be occurring because the inputted values are used immediately. It's like a special function that takes a variadic number of arguments, not a lambda where you need to specify if you want to copy a value as it is at definition rather than what it is when called. If my understanding is incorrect and it is indeed capturing somehow, I'd also propose then that there are technically two concepts here that don't need to be bundled into one:
I think that the argument around whether
Where we use the I'm also personally much more inclined towards using braces than parens; it's the standard that C++1 and all other modern languages have adhered to. If we're talking about metrics of "reducing the amount that programmers have to learn", by using braces, we have one less quirk that programmers moving to-or-from C++2 have to remember. String interpolation is already a special-use syntax. |
Quick ack: I try to explain this in the ~3 minutes of the talk starting at 1:30:53 -- maybe the way I say it there could be helpful?
I think so, it's capturing by value where the string/lambda is declared and then using the value later. The main difference is that as a convenience it converts the captured thing to a string as needed (so it's like capturing an integer Example:
Right. Same as lambda capture occurs at the lambda's definition.
In string capture the values are read immediately where the string is defined, and captured (copies are stored) for later use when the string is used.
In lambda capture the values are read immediately where the lambda is defined, and captured (copies are stored) for later use when the lambda is used. Does that help? |
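The parallel can be sketched in plain C++. This assumes, for illustration only, that an interpolated string behaves like the hand-written concatenation below. `Snapshot` and `capture_demo` are hypothetical names.

```cpp
#include <cassert>
#include <string>

struct Snapshot {
    std::string text;  // what the string holds when it is used later
    int from_lambda;   // what the lambda returns when it is called later
};

Snapshot capture_demo() {
    int x = 1;

    // Stand-in for "x = (x)$": the value of x is read here, at the definition.
    std::string s = "x = " + std::to_string(x);

    // By-value lambda capture: a copy of x is stored here, at the definition.
    auto f = [x] { return x; };

    x = 2;  // later changes affect neither the string nor the lambda

    return {s, f()};
}
```

Both results reflect the value `1` that `x` had at the point of definition: `capture_demo().text` is `"x = 1"` and `capture_demo().from_lambda` is `1`.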
Currently, `()$` is used for string interpolation in Syntax 2, while `{}` is already available in C++20 and, without introducing any new symbols, `{}` can be extended to support expressions for string interpolation in a way similar to f-strings in Python. `{}` for string interpolations is available in C# and Python (two popular programming languages) besides C++20.

Also, `()$` is not an operator, it is an expression block inside a string. `()$` is more like a language construct than a postfix unary operator, and it shouldn't be treated as a postfix unary operator. Therefore it could be `$()` instead of `()$`, and it could be just `()` or, in a better way, it could be just `{}` because `{` and `}` have less usage in strings than `(` and `)`.