Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Raw strings proposal #1027

Closed
4 tasks done
andrevidela opened this issue Feb 4, 2021 · 1 comment
Closed
4 tasks done

Raw strings proposal #1027

andrevidela opened this issue Feb 4, 2021 · 1 comment

Comments

@andrevidela
Copy link
Collaborator

andrevidela commented Feb 4, 2021

Current status

This issue keeps track of the progress regarding the raw strings proposal. It comes from #1004. Here is the todo list:

Summary

While writing string literals one might be inconvenienced by the heavy amount of escaping needed when displaying characters such as \ or ". Raw strings provide syntax to dynamically change the escaping sequence of a string literal such that those character can be written plainly.

Motivation

Raw strings is a feature that allow programmers to write strings between special delimiters such that they do not have to escape anything and the string is interpreted as-is without any transformation from the compiler. This allows for the following use cases:

• Writing regex expression while avoiding escape characters
• Writing strings which are heavy in " which need escape characters
• Writing strings which contains special character that need to be rendered as is, specially characters like\n

Without raw strings, this is how the given example would be expressed in Idris:

let regex = "(\\w\\d)+"
let documentation = "the token for multiline string is \"\"\""
let multilines = "in order to go to the next line, use \\n \n like so"

Those example get particularly bad when mixed together:

let regex = "\"\"\"\\n(\\s*\\w+\\n)*\\s*\"\"\""

Additionally, raw strings are useful when pasting code from other sources, typically one might want to insert a bash script as a literal string as part of a CI jobs:

ci_job : String 
ci_job = "rm -rf build/ttc # clean the ttc files to avoid reusing old build\n" ++
"make all # Compile everything \n"++
"make test | tee results.back # save the results as a backup file `results.back` is always the current run\n"++
"cp results.back ../previous\\ runs/test\\ results:\\ $(DATE).back # Save the current run for reference later\n"++
"                                                          # useful to compare a previous run with results.back\n"++
"make clean"

The design goal for this proposal is thefore as follow:

• Allow any string to be represented within raw string, including line breaks
• Allow dynamic escape sequences with \#

The proposal

Reuse the syntax and semantics of Swift's raw strings #"…"#. It comes with the following benefits:

• Already familiar for Swift programmers
• Handles to escape characters using dynamic delimiters for the string
• The changes from normal strings are minimal

This means the previous examples can be rewritten like this:

let regex = #"(\w\d)+"#
let documentation = #"the token for multiline string is """"#
let regex = #""""\n(\s*\w+\n)*\\s*""""#

This allows programmers to copy paste code safely without having character be accidentally escaped or white space trimmed like happens with string interpolation. This is different from the Swift implementation which require to use raw-multiline strings.

Additionally, the Swift implementation allows to declare exactly what the closing delimiter should be by using the opening delimiter. This allows to express every string witout restriction on its content. Take this HTML example

let link = #"<a href="#C4">Jump to Chapter 4</a>"#

This would not parse because the sequence "# appears in the content of the string. However we can change the delimiters for raw strings to be "## instead in order to insert "# while avoiding escapes:

let link = r##"<a href="#C4">Jump to Chapter 4</a>"## -- valid

FInally, the bash example can be pasted without risk of errors:

ci_job : String 
ci_job = r##"rm -rf build/ttc # clean the ttc files to avoid reusing old build
make all # Compile everything
make test | tee results.back # save the results as a backup file `results.back` is always the current run
cp results.back ../previous\ runs/test\ results:\ $(DATE).back # Save the current run for reference later
                                                          # useful to compare a previous run with results.back
make clean"##

Raw strings can be used for ascii art and the custom delimiters allow to escape sequences as to allow string interpolation (#555)

banner : String
banner = #"
         ____    __     _         ___                                           
        /  _/___/ /____(_)____   |__ \                                       
        / // __  / ___/ / ___/   __/ /     Version \#{showVersion True version}
      _/ // /_/ / /  / (__  )   / __/      https://www.idris-lang.org           
     /___/\__,_/_/  /_/____/   /____/      Type :? for help                     
    
    Welcome to Idris 2.  Enjoy yourself!"#

Alternatives considered

This design has been borrowed from the Swift and Rust discussions, you can read their respective discussion here and here

Conclusion

Combined with multi-line strings and string interpolation raw strings offer a powerful way to insert larges string literals without worrying that the compiler might mis-parse some parts of it. It also allows commonplace features like regex matching to be written with a lot more confidence since raw strings are wysiwyg.

@gallais
Copy link
Member

gallais commented Nov 24, 2021

I think this is done.

@gallais gallais closed this as completed Nov 24, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants