Proposal: Support "raw" string literals #89
-
Copied from dotnet/roslyn#2239

Problem: Verbatim string literals are often used to copy in chunks of text such as SQL or XML. It's not uncommon for this text to contain embedded double-quotes, which then must be doubled-up in C# in order to be escaped. This makes it obnoxious if you have to copy and paste the text back and forth since it would have to be fixed each time.

string xml = @"<?xml version=""1.0""?>
<root>
<item attr=""foo"" />
</root>";

Solution: By supporting a custom notation for starting and ending a string literal, the compiler can treat double quotes as part of the string so that they do not have to be escaped. The programmer would use a pattern at the beginning of the string which then must be matched to denote the end of the string.

Proposed syntax would be based on verbatim string syntax:

raw-string-literal:
raw-string-literal-characters:

The following example is a raw string without a custom delimiter. It is terminated simply by the sequence ") (a double quote followed by a closing parenthesis).

string xml = @("<?xml version="1.0"?>
<root>
<item attr="foo" />
</root>");

The following example is a raw string with a custom delimiter. The custom delimiter is "foo", which sets the terminating sequence to "foo) (a double quote, the delimiter, then a closing parenthesis). Note that the attribute value "foo" is not considered the end of the string. However, if it were followed by a parenthesis it would be, and a different custom delimiter might have to be chosen.

string xml = @(foo"<?xml version="1.0"?>
<root>
<item attr="foo" />
</root>"foo);

Here are examples of raw strings containing the syntax for other raw strings:

string raw1 = @(foo"string s = @("Hello!");"foo);
string raw2 = @(bar"string raw1 = @(foo"string s = @("Hello!");"foo);"bar);
string raw3 = @(baz"string raw2 = @(bar"string raw1 = @(foo"string s = @("Hello!");"foo);"bar);"baz);

This syntax is loosely based on C++11 raw string literals, although I personally don't care for the parenthesis or delimiter appearing within the body of the string. I'm definitely not married to this syntax and variations are welcome.

Existing UserVoice suggestion, using C++11-like syntax: Allow to have custom delimiters in raw string literal
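
The termination rule described in this proposal can be sketched as a small scanner. This is only an illustration under stated assumptions: `RawStringSketch` and `TryReadRawString` are hypothetical names, not part of Roslyn, and the sketch assumes the delimiter is whatever text sits between `@(` and the first double quote.

```csharp
using System;

static class RawStringSketch
{
    // Hypothetical helper (not a compiler API): reads a proposed raw literal that
    // starts at 'start' and returns its content via 'content'.
    public static bool TryReadRawString(string source, int start, out string content)
    {
        content = "";
        if (start + 2 > source.Length || source.Substring(start, 2) != "@(")
            return false;

        // Assumption: the optional delimiter is everything between "@(" and the first quote.
        int openQuote = source.IndexOf('"', start + 2);
        if (openQuote < 0) return false;
        string delimiter = source.Substring(start + 2, openQuote - (start + 2));

        // The literal ends at the first occurrence of: quote + delimiter + ")".
        string terminator = "\"" + delimiter + ")";
        int end = source.IndexOf(terminator, openQuote + 1, StringComparison.Ordinal);
        if (end < 0) return false;

        content = source.Substring(openQuote + 1, end - (openQuote + 1));
        return true;
    }

    public static void Main()
    {
        // Written as an ordinary C# string; the raw form is: @(foo"<item attr="foo" />"foo)
        string source = "@(foo\"<item attr=\"foo\" />\"foo)";
        if (TryReadRawString(source, 0, out string content))
            Console.WriteLine(content); // <item attr="foo" />
    }
}
```

Because the scan only stops at the full terminator, the embedded attribute value "foo" is passed over; it would end the literal only if a closing parenthesis followed it directly.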
Replies: 48 comments 20 replies
-
I like this, although I'm not sure about the syntax. It seems similar in nature to VB.NET's XML literals.
-
The syntax is largely borrowed from C++ with some tweaks. C++ raw strings look like this:

const char* xml = R"foo(<?xml version="1.0"?>
<root>
<item attr="foo" />
</root>)foo";

My personal opinion is that it's bizarre that the raw string termination identifier appears within the string. While containing unescaped XML would certainly be a goal, I wouldn't want to limit these strings to any specific form of content, such as HTML, SQL or JSON, each of which has similar issues.
-
Absolutely. I was really just mentioning that this is similar in nature to VB's XML literals, but those literals don't require any special delimiters or structuring. What about something like this:

string xml = @("foo(<?xml version="1.0"?>
<root>
<item attr="foo" />
</root>")@;

So it's the pair of
-
My proposal covers that syntax. But that would mean that the sequence

Note that the following two examples would produce the exact same string, the same string produced by the C++ example:

string xml1 = @("<?xml version="1.0"?>
<root>
<item attr="foo" />
</root>");
string xml2 = @(foo"<?xml version="1.0"?>
<root>
<item attr="foo" />
</root>"foo);

That
-
With the example:

string xml = @(foo"<?xml version="1.0"?>
<root>
<item attr="foo" />
</root>"foo);

You appear to be making the job of the compiler harder than it need be as, upon arriving at

string xml = @(bar"<?xml version="1.0"?>
<root>
<item attr="foo" />
</root>"bar);
-
Why not use here documents for that? Instead of a new delimiter you could use any custom delimiter that doesn't appear in the string. Something like:

string xml = @@bar
<?xml version="1.0"?>
<root>
<item attr="foo" />
</root>
bar;
-
I think having to ensure that the string can never contain the delimiter would make it harder on the developer than it should need to be. If you're copying/pasting a relatively large blob of formatted text you wouldn't want to have to scan the entire text to find out what would be a legal delimiter.

That parsing requirement is exactly what has been imposed on C++ and it doesn't seem to have any issues with it. It also seems to be the common theme with heredocs.

Can't say I'm a fan of a string format that doesn't use double-quotes or is whitespace sensitive. I see that D has a syntax that is somewhat between other heredocs and C++ raw strings by using a

string xml = q"foo
<?xml version="1.0"?>
<root>
<item attr="foo" />
</root>
foo";

Of the heredoc syntaxes, I'd probably prefer something along those lines:

string xml = @@"foo
<?xml version="1.0"?>
<root>
<item attr="foo" />
</root>
foo";
-
@CyrusNajmabadi has suggested a different string literal syntax. It's not quite "raw" strings since there is still an escape sequence involved but the goal is to solve similar issues in avoiding having to use escape sequences:
That would allow the following to be string literals:

var s1 = @`This is a string that contains a double quote: " `;
// with an escape sequence
var s2 = @`This is a string that contains a double quote: " and a backtick: `` `;
// change char to no longer need escape sequence
var s3 = @-This is a string that contains a double quote: " and a backtick: ` -;

One advantage to this syntax is that it would allow interpolation:

var name = "Cyrus";
var json = $@`{{ "name": "{name}" }}`;

Which could potentially also be supported in my proposed syntax:

var name = "Halo";
var json = $@("{{ "name": "{name}" }}");

Although in both cases it is a tad awkward since you have to escape the curly brackets. So between the two the major differences are:
-
I'd like to additionally propose Swift-like handling of the initial whitespace: if we have a multiline raw literal, we want to allow the spaces at the beginning of each line to be ignored. Swift resolves this in the following way:
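
For reference, Swift's multiline string literals remove from every line the whitespace that precedes the closing delimiter, and reject lines that are indented less than that. A minimal C# sketch of that rule follows; `IndentSketch` and `StripIndent` are hypothetical names used only for illustration, and the handling of blank lines is a simplifying assumption.

```csharp
using System;
using System.Linq;

static class IndentSketch
{
    // Hypothetical helper: 'lines' are the content lines of a multiline literal and
    // 'closingIndent' is the whitespace found before the closing delimiter.
    public static string StripIndent(string[] lines, string closingIndent)
    {
        var stripped = lines.Select(line =>
        {
            if (line.Length == 0) return line; // assumption: empty lines are kept as-is
            if (!line.StartsWith(closingIndent, StringComparison.Ordinal))
                throw new FormatException("Line is indented less than the closing delimiter.");
            return line.Substring(closingIndent.Length);
        });
        return string.Join("\n", stripped);
    }

    public static void Main()
    {
        string[] lines = { "    <root>", "      <item />", "    </root>" };
        // The closing delimiter is assumed to sit after four spaces.
        Console.WriteLine(StripIndent(lines, "    "));
    }
}
```

Indentation beyond the closing delimiter's column is preserved, so the nested `<item />` line keeps its two extra spaces.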
-
If you need to escape all braces, this is no longer a "raw string". It's probably better to escape interpolations instead.
-
Agreed. I honestly don't care for trying to make interpolation and "raw" strings work. You still end up having to escape something. The same is true in Swift. You still have to escape
-
I think strings with custom delimiters would need custom interpolators, too. Something like:
-
doesn't have to be a different character,
so the interpolation is started via also I don't think
-
I feel like Python has solved this pretty eloquently with their triple quoted strings and support for both
I can't think of a time where I ran into a problem trying to store something with this approach.
-
Is there any progress on this? Being able to paste in raw JSON content instead of either quoting all quotes or putting the content in separate files would be an immense time/confusion saver. Python's """ or some variation of heredoc would be fine with me.
-
You can already use resources to accomplish something similar. But that moves the content into another file that you have to navigate to in order to view the string. At this point most languages aside from C# seem to have added this feature, including C++ and Java. Its inclusion would be far from controversial.
-
There's plenty of content in the 5-10 LOC range where splitting it into a separate file makes little sense and only makes it more difficult to track what is going on. Think SQL statements and the like. People do this already using C#'s verbatim strings, but you have to manually deal with escaping any quotes that might exist within the string. That's why this feature has been added to so many languages. C++ and Java have identical concerns to C# in this area. Yes, tooling could improve that experience within the IDE, but it doesn't improve the situation in source control where a reviewer now has to manually bounce between completely separate parts of the PR to determine what is going on.
-
I fail to see how having SQL in separate files would make it harder to track. In my opinion it would make it easier. If you see a
-
Well, we'll have to agree to disagree. Having tons of little
-
@ZacharyPatten Personally I would prefer my SQL to remain inside my function (and typically the only function it relates to), otherwise it feels similar to breaking up functions into their own files. The SQL execution and the SQL query go hand in hand in my view. As far as feeling the actual frustration, try using Postgres as a database with .NET (or as a custom store for Identity Framework) and you will feel the pain! Here is an example from a Microsoft MVP on the issue. Not to say this should be the only reason for such a change, but I don't see the downside to it.
-
@ZacharyPatten Just to add, I see nothing wrong with your idea also, we are probably just seeing it from a different use case.
-
I'd also suggest that if you wanted to organize those files separately but not have to deal with loading them as files or as resources at runtime that a source generator would probably be a decent solution. You'd define a partial class annotated with an attribute with the directory and file pattern and the generator could embed the contents of each file as a string member or constant. That'd probably work better than the approach with project resource files today (where the tooling is not well suited to multiline content). But it still leaves the content as separate files. To each their own. I know from my experience that people are already using verbatim strings to embed this content into their source. It works alright but requires escaping double quotes and doesn't round-trip particularly well.
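
The shape of that source-generator idea, seen from the consuming code, might look roughly like the following. Everything here is hypothetical: `EmbedFilesAttribute`, the `Queries` class, the directory and file names, and the SQL text are invented for illustration, and the second half of the partial class is written by hand as a stand-in for what a generator could emit. No actual generator is shown.

```csharp
using System;

// Hypothetical marker attribute that a source generator would look for at compile time.
[AttributeUsage(AttributeTargets.Class)]
sealed class EmbedFilesAttribute : Attribute
{
    public EmbedFilesAttribute(string directory, string pattern)
    {
        Directory = directory;
        Pattern = pattern;
    }

    public string Directory { get; }
    public string Pattern { get; }
}

// What the developer would write: a partial class pointing at a directory and file pattern.
[EmbedFiles("Queries", "*.sql")]
static partial class Queries { }

// Hand-written stand-in for generated code: one constant per matching file,
// here for an assumed file named Queries/GetUser.sql.
static partial class Queries
{
    public const string GetUser = "SELECT \"id\", \"name\" FROM \"users\" WHERE \"id\" = @id";
}

static class Program
{
    static void Main() => Console.WriteLine(Queries.GetUser);
}
```

The escaped quotes appear only in the generated half, so the `.sql` file itself could be pasted back and forth unchanged, at the cost of keeping the content in separate files.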
-
I think the triple-quote method looks the cleanest. It's one way forward that works with both @ and $. This would solve the biggest issue, being double-quotes. Once that's implemented, we could move onto more edge cases later. (I absolutely disagree with the use of resource strings. Some may be happy with that, but this is about adding a simple feature that others would like.)
-
Out of interest, does anyone have any creative (hacky) solutions for dealing with this today? I'm working with a lot of multiline strings with embedded JSON in test libraries; it's super frustrating with having to escape
I was specifically wondering about embedding the strings in multiline comments and using reflection to read them back, but couldn't find a way to do this cleanly...
-
TypeScript uses ` for this, so something as concise as the following would be really nice to have:

const example = @`{ "json": 42 }`;
-
F# adopts the Python-style triple quotes, and I'm really happy with it. I would love for C# to adopt the same. I'm forever cursing under my breath every time I try to write XML, JSON, etc. in a C# string literal today.
-
It's great that C# has got so many boundary-pushing features, but it would be great if the C# team could see the value in this feature. So much time is wasted worldwide on quote escaping for HTML/CSS/JavaScript snippets.
-
I'll champion this.
-
I need to add Perl's
It's kinda similar to the author's:
But IMO more readable:
PS: I'm no Perl evangelist by any means, but I always found this syntax quite compelling.
Here's my proposal: #4304