The language reference doesn't explain anything about string literals containing newlines

```
"ab\ncd"
```

denotes a Unicode string U+0061 U+0062 U+000a U+0063 U+0064.

```
"ab
cd"
```

also denotes the same Unicode string.

On the other hand,

```
"ab\
cd"
```

and

```
"ab\
    cd"
```

both denote the Unicode string U+0061 U+0062 U+0063 U+0064.  The Rust lexer ignores an "escaped newline" (a backslash immediately followed by a line break), optionally followed by a sequence of "whitespace" characters. (Update: the following complaint about the lack of a Rust definition of "whitespace" was incorrect and I retract it. ~~defined by the below function in `libsyntax/parser/lexer/mod.rs`)~~

``` rust
pub fn is_whitespace(c: Option<char>) -> bool {
    match c.unwrap_or('\x00') { // None can be null for now... it's not whitespace
        ' ' | '\n' | '\t' | '\r' => true,
        _ => false
    }
}
```

~~This predicate doesn't follow the traditional definition of "space" [(by the C language)](http://www.cplusplus.com/reference/cctype/isspace/) or [Unicode's definition of "whitespace"](http://en.wikipedia.org/wiki/Whitespace_character#Unicode).  So if we use the Unicode ideographic space (colloquially known in Japanese as "full-width space"), the space-munching logic doesn't work.  For example~~

```
"ab\
　cd"
```

~~denotes a Unicode string U+0061 U+0062 U+3000 U+0063 U+0064.  Of course, such a decision is totally up to language designers, but it is desirable to give a clear explanation about it.~~
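
The string-literal cases described above can be checked directly. The following is only a minimal sketch, not something taken from the reference or the compiler sources: the function names (`escaped`, `literal_newline`, `continued`, `continued_indented`, `code_points`) are made up for illustration, and it assumes a compiler that behaves as this issue describes.

``` rust
/// Newline written with the `\n` escape.
pub fn escaped() -> &'static str {
    "ab\ncd"
}

/// Newline written literally inside the literal.
pub fn literal_newline() -> &'static str {
    "ab
cd"
}

/// Backslash immediately before the line break: the break is dropped.
pub fn continued() -> &'static str {
    "ab\
cd"
}

/// Same as above, but the next line is indented with ASCII spaces:
/// the indentation is dropped as well.
pub fn continued_indented() -> &'static str {
    "ab\
    cd"
}

/// Hypothetical helper: the code points of a string as plain integers.
pub fn code_points(s: &str) -> Vec<u32> {
    s.chars().map(|c| c as u32).collect()
}
```

Comparing `code_points(escaped())` with `code_points(literal_newline())`, and `continued()` with `continued_indented()`, shows which spellings denote the same string.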

As for character literals, it's interesting that the lexer rejects some kinds of "space" characters when they appear unescaped:

``` rust
            '\t' | '\n' | '\r' | '\'' if delim == '\'' => {
                let last_pos = self.last_pos;
                self.err_span_char(
                    start, last_pos,
                    if ascii_only { "byte constant must be escaped" }
                    else { "character constant must be escaped" },
                    first_source_char);
                return false;
            }
```

For example, this Rust code is rejected:

``` rust
println!("{}", '
');
```
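
For contrast, the same characters are accepted when written with escapes. This is a small sketch under that assumption; `escaped_chars` and `escaped_bytes` are hypothetical names used only for this example.

``` rust
/// The four characters the lexer arm above rejects when written raw
/// between single quotes, spelled here with escapes instead.
pub fn escaped_chars() -> Vec<char> {
    vec!['\t', '\n', '\r', '\'']
}

/// The same four characters as byte literals (the `ascii_only` case).
pub fn escaped_bytes() -> Vec<u8> {
    vec![b'\t', b'\n', b'\r', b'\'']
}
```

So the rejected `println!` above would be written as `println!("{}", '\n');` instead.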


The language reference doesn't explain anything about string literals containing newlines #19399
