Strings are not optimized as vector of characters #75618

Open
bugadani opened this issue Aug 17, 2020 · 7 comments
Labels
A-mir-opt Area: MIR optimizations A-str Area: str and String

Comments

@bugadani
Contributor

bugadani commented Aug 17, 2020

This might be a long shot, but it would be really nice if Rust could do this. I'm working on an embedded application, where I draw some text. For a single symbol, I need to use a different font than I use to draw the rest of the text. Right now, Rust doesn't know that I use a single character and the binary contains all of the unnecessary text processing code to map that character to an index and to draw it.

It would be really cool if Rust could recognize that my string is a constant instead of a black box value.

I've reduced an example: https://rust.godbolt.org/z/GrnE79

@tesuji
Contributor

tesuji commented Aug 17, 2020

Use str::bytes if you want the compiler to optimize it out.

@bugadani
Contributor Author

But I need characters, not bytes :)
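To make the bytes-versus-characters distinction concrete, here is a minimal sketch. The function names `sum_bytes` and `sum_chars` are hypothetical and only serve the illustration. For ASCII-only text the two agree, but for anything else `bytes()` yields the raw UTF-8 encoding while `chars()` has to decode each sequence into a Unicode scalar value.

```rust
/// Sums the raw UTF-8 bytes of a string; no decoding is involved.
pub fn sum_bytes(s: &str) -> u32 {
    s.bytes().map(u32::from).sum()
}

/// Sums the Unicode scalar values; every step decodes one UTF-8 sequence.
pub fn sum_chars(s: &str) -> u32 {
    s.chars().map(|c| c as u32).sum()
}
```

For a string such as "abc" both functions return the same value; for "é" they differ, because the single character is stored as two bytes.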

@tesuji
Contributor

tesuji commented Aug 17, 2020

I don't know whether rustc is able to decode UTF-8 at compile time.
At the very least, the UTF-8 decoding functions are not const fn.
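As an illustration of what compile-time decoding could look like, the sketch below hand-writes a `const fn` that decodes only the first code point of a string. `first_code_point` and `FIRST` are hypothetical names; this is not the standard library's decoder. It assumes a non-empty input and relies on `&str` always holding valid UTF-8.

```rust
/// Hypothetical helper: decodes the first code point of `s` as a `u32`.
/// Assumes `s` is non-empty; indexing panics (or fails const evaluation) otherwise.
pub const fn first_code_point(s: &str) -> u32 {
    let b = s.as_bytes();
    let b0 = b[0] as u32;
    if b0 < 0x80 {
        // One-byte (ASCII) sequence.
        b0
    } else if b0 < 0xE0 {
        // Two-byte sequence: 5 payload bits, then 6.
        ((b0 & 0x1F) << 6) | ((b[1] as u32) & 0x3F)
    } else if b0 < 0xF0 {
        // Three-byte sequence: 4 payload bits, then 6 and 6.
        ((b0 & 0x0F) << 12) | (((b[1] as u32) & 0x3F) << 6) | ((b[2] as u32) & 0x3F)
    } else {
        // Four-byte sequence: 3 payload bits, then 6, 6 and 6.
        ((b0 & 0x07) << 18)
            | (((b[1] as u32) & 0x3F) << 12)
            | (((b[2] as u32) & 0x3F) << 6)
            | ((b[3] as u32) & 0x3F)
    }
}

/// Evaluated at compile time because it initializes a `const` item.
pub const FIRST: u32 = first_code_point("中文");
```

Because `FIRST` is a `const` item, the decoding must happen during compilation, so no decoding code for it is needed at run time.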

@workingjubilee
Member

workingjubilee commented Aug 17, 2020

A String conceptually wraps a Vec of u8s, a char wraps a u32, so a Vec of chars would be a non-optimization for most purposes, pretty sure. Can you not render a char literal? I assume the answer is "no" and that this is more complicated than it sounds, here.
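The size argument can be sketched as follows. The helper names `utf8_bytes` and `char_array_bytes` are hypothetical; they only compare how much storage the same text needs as UTF-8 and as a sequence of `char` values.

```rust
use std::mem::size_of;

/// Storage needed for the text as UTF-8 (what `str` and `String` hold).
pub fn utf8_bytes(s: &str) -> usize {
    s.len()
}

/// Storage needed for the same text as a sequence of `char`,
/// where every element occupies `size_of::<char>()` bytes.
pub fn char_array_bytes(s: &str) -> usize {
    s.chars().count() * size_of::<char>()
}
```

For text that is mostly ASCII, the `char` representation is close to four times larger, which is why it would not be an optimization as a general storage format.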

@bugadani
Contributor Author

bugadani commented Aug 17, 2020

Oh, I was in no way suggesting that a str should be represented as a Vec; I see that putting it like that was unfortunate. Instead, what I want is for rustc to understand that a "literal".chars() could be treated similarly. Iterating over the characters of a string and over the elements of a vector are optimized radically differently, and as I understand it (and as @lzutao said), this is probably because the decoding is not const.

I was kind of expecting this to work, since rustc/LLVM inlines and optimizes a lot of non-const code, so I was a bit surprised when this one didn't.
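The two shapes being compared look roughly like this. This is a sketch with hypothetical function names, not the reduced example from the godbolt link. Both functions compute the same value; the difference discussed in this thread is only in how well the compiler folds each one into a constant, which may vary with the compiler version and optimization level.

```rust
/// Iterates over a string literal; each step decodes UTF-8.
pub fn sum_from_str() -> u32 {
    "Rust".chars().map(|c| c as u32).sum()
}

/// Iterates over an array of `char`; no decoding is needed.
pub fn sum_from_array() -> u32 {
    ['R', 'u', 's', 't'].iter().map(|&c| c as u32).sum()
}
```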

@bugadani
Contributor Author

bugadani commented Aug 26, 2020

Looks like this is caused by next_code_point. As long as it's not part of a loop (or the loop can be unrolled), it's fine:

https://rust.godbolt.org/z/778zvW
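For the single-symbol use case, the loop-free shape can be sketched like this. `first_char` and `count_chars` are hypothetical names; whether either one is folded to a constant for a literal argument depends on the compiler version and optimization level, so treat this as an illustration of the two shapes rather than a guarantee.

```rust
/// Decodes exactly one character: a single decoding step, no loop.
pub fn first_char(s: &str) -> Option<char> {
    s.chars().next()
}

/// Decodes every character in a loop, for comparison.
pub fn count_chars(s: &str) -> usize {
    let mut n = 0;
    for _ in s.chars() {
        n += 1;
    }
    n
}
```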

camelid added the A-mir-opt (Area: MIR optimizations) and A-str (Area: str and String) labels on Oct 20, 2020
@Nugine
Contributor

Nugine commented Oct 27, 2022

You can convert a string to an array of chars through const evaluation. See the const_str::to_char_array! macro:

pub fn sum(chars: &[char]) -> u32 {
    chars.iter().map(|&c| c as u32).sum()
}

const S: &str = "中文English🤣";

pub fn s() -> u32 {
    sum(&const_str::to_char_array!(S))
}

#[test]
fn test() {
    let chars = S.chars().collect::<Vec<_>>();
    assert_eq!(s(), sum(&chars))
}
