Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle non-ascii character at boundary #5089

Merged
merged 2 commits into from
Feb 2, 2022

Conversation

scampi
Copy link
Contributor

@scampi scampi commented Nov 17, 2021

Close #5023

@scampi
Copy link
Contributor Author

scampi commented Nov 18, 2021

With the non-ascii character at the boundary, the value of max_width_index_in_input is 0. The proposed fix avoids the overflow.
However, I wonder if I should instead amend the algorithm that updates max_width_index_in_input so that it doesn't return 0.

rustfmt/src/string.rs

Lines 269 to 280 in 76d626f

let max_width_index_in_input = {
let mut cur_width = 0;
let mut cur_index = 0;
for (i, grapheme) in input.iter().enumerate() {
cur_width += unicode_str_width(grapheme);
cur_index = i;
if cur_width > max_width {
break;
}
}
cur_index
};

@calebcartwright
Copy link
Member

Yeah I was thinking that this should address the reported issue but was wondering if these types of characters appeared in different positions if it could still make us split in the wrong location.

I'd be okay moving ahead with this given the immediate relief it would provide, and consider a broader fix in a follow up PR. Happy to park on this though if you want to experiment with a different approach though

@scampi
Copy link
Contributor Author

scampi commented Jan 30, 2022

@calebcartwright I don't find a way to fix the computation of max_width_index_in_input, but I have replaced the previous fix with an early termination check. I think that makes the rest cleaner, knowing that max_width_index_in_input is at least 1.

Copy link
Member

@calebcartwright calebcartwright left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, seems like a reasonable approach for now and certainly addresses the pressing issue at hand 👍

@calebcartwright calebcartwright merged commit 368a9b7 into rust-lang:master Feb 2, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Panic with wrap_comments and non-ascii character at comment wrap boundary
3 participants