Comment indentation/alignment changed in Rust 1.81 #6351
Comments
Thanks for the report. I took a look at the 1.7.1 CHANGELOG, but it's unclear to me which of those changes, if any, caused this. Also, I'm unable to reproduce this when building the v1.7.1 tag and running rustfmt on the input.
That said, I am able to reproduce this when I run with the
There's a chance that this change in behavior is related to changes in rustc (it has happened before). I think the best way to figure out what's going on here is to bisect between the
searched nightlies: from nightly-2024-06-07 to nightly-2024-07-20
bisected with cargo-bisect-rustc v0.6.9
Host triple: x86_64-unknown-linux-gnu
cargo bisect-rustc --start=1.80.1 --end=1.81.0 -c rustfmt -- fmt --check
I've tracked down the regression to the This patch reverses the behaviour and fixes the regression:
diff --git a/src/utils.rs b/src/utils.rs
index d1cfc6ac..31957e97 100644
--- a/src/utils.rs
+++ b/src/utils.rs
@@ -690,7 +690,7 @@ impl NodeIdExt for NodeId {
 }
 
 pub(crate) fn unicode_str_width(s: &str) -> usize {
-    s.width()
+    s.replace('\t', "").width()
 }
 
 #[cfg(test)]
@@ -713,4 +713,9 @@ mod test {
             Some("aaa\n bbb\n ccc".to_string())
         );
     }
+
+    #[test]
+    fn tab_width() {
+        assert_eq!(unicode_str_width("\t"), 0);
+    }
 }
Thanks for digging into this! That helped me remember that #6203 mentioned there would be an impact to non-ascii unicode chars, but I don't think we expected this to impact @calebcartwright do you have any thoughts on how we should handle this breakage? If we don't run this with
enum Enum12 {
    Fn,
    NotEquals,
    Backslash,
}
fn parse_symbol2(ch: char) -> Enum12 {
    match ch {
        '=' => Enum12::Fn,
        '\u{2260}' => Enum12::NotEquals, // unicode not equal to symbol
        '\\' | '\u{3bb}' => Enum12::Backslash, // lambda symbol
        _ => todo!(),
    }
}
Maybe that suggests that
I don't think either Here's an example that's broken with Rust 1.81's version of rustfmt (i.e. unicode-width v0.1.13):
Agreed that neither
That's not correct: only the behaviour for newlines was reverted. The behaviour for tabs and other control characters has not been reverted. Regardless, passing any ASCII character <=31 to unicode-width is a bug. Other users of the crate such as rustc use code like this, which rustfmt also needs to do:
No
Caveat: I'm not looking at the code, so this is just brainstorming. If all the call sites have the relevant info (presumably the config, maybe the shape?) in context, would it be conceivable to pass those along, swap each tab's width for the configured tab spaces that will replace it later, and use the unicode crate only for the width of the other characters?
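To make the brainstormed idea concrete, here is a minimal sketch, not rustfmt's or rustc's actual code. The names `width_with_tabs` and `placeholder_char_width` are hypothetical, and `placeholder_char_width` is only a stand-in for a real Unicode width lookup so the example builds with the standard library alone. Counting other ASCII control characters as zero columns is also an assumption made for illustration.

```rust
/// Stand-in for a real Unicode width lookup (an assumption, NOT the
/// unicode-width API): every non-control character is one column wide.
fn placeholder_char_width(c: char) -> usize {
    if c.is_control() { 0 } else { 1 }
}

/// Hypothetical helper: measure `s`, counting each tab as `tab_spaces`
/// columns and never handing ASCII control characters to `char_width`.
fn width_with_tabs(s: &str, tab_spaces: usize, char_width: impl Fn(char) -> usize) -> usize {
    s.chars()
        .map(|c| match c {
            // A tab is charged the configured number of spaces.
            '\t' => tab_spaces,
            // Assumption: other ASCII control characters take no columns.
            c if c.is_ascii_control() => 0,
            // Everything else is delegated to the width lookup.
            c => char_width(c),
        })
        .sum()
}

fn main() {
    let width = width_with_tabs("\tfoo", 4, placeholder_char_width);
    println!("width = {width}");
}
```

This charges every tab a fixed `tab_spaces` columns rather than advancing to the next tab stop, which may or may not match what the call sites need.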
When I updated to Rust 1.81 I noticed that the comment indentation changed. Not sure if this is an intentional breakage or a bug.
I have the following code (formatted with Rust 1.80.1, rustfmt 1.7.0):
.rustfmt.toml:
When I updated to Rust 1.81 (rustfmt 1.7.1) I got this diff:
Strangely, the bug seems to be very dependent on the length of some of my identifiers. If I change the enum name to Enum or E, the formatting difference goes away. Are comments like this meant to be aligned or not?