[`refurb`] Preserve digit separators in `Decimal` constructor (`FURB157`) #20588

danparizher · 2025-09-26T03:11:46Z

Summary

Fixes #20572

github-actions · 2025-09-26T03:20:36Z

`ruff-ecosystem` results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.

ntBre

Thanks! But I think we need to be more careful here. At least one of the new snapshots is incorrect.

crates/ruff_linter/src/rules/refurb/rules/verbose_decimal_constructor.rs

amyreese · 2025-10-01T01:30:04Z

crates/ruff_linter/src/rules/refurb/rules/verbose_decimal_constructor.rs

+                let numeric_value = format!("{unary}{rest}");
+                add_thousand_separators(&numeric_value)


Would it be better to just strip leading underscores, and otherwise preserve the placement and frequency, rather than assume that thousands are the only use of underscores in numeric literals?

Leading underscores aren't the only problem. Duplicates in the middle cause a syntax error too, but not in the Decimal constructor:

>>> from decimal import Decimal
>>> 1__2
  File "<python-input-2>", line 1
    1__2
     ^
SyntaxError: invalid decimal literal
>>> Decimal("1__2")
Decimal('12')

But I agree in principle, add_thousand_separators does not seem like the right approach to me.
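The mismatch above comes from Python's integer-literal grammar being stricter than the `Decimal` string parser, which discards underscores. Below is a minimal sketch of the literal-side check, assuming only ASCII digits and underscores matter. `is_valid_decimal_int_literal` is a hypothetical helper written for illustration and is not part of ruff:

```rust
/// Hypothetical helper (not ruff code): returns true if `s` would be accepted
/// by Python as an unsigned decimal integer literal such as `1_000`.
fn is_valid_decimal_int_literal(s: &str) -> bool {
    let bytes = s.as_bytes();

    // The literal must start and end with a digit.
    let (Some(first), Some(last)) = (bytes.first(), bytes.last()) else {
        return false;
    };
    if !first.is_ascii_digit() || !last.is_ascii_digit() {
        return false;
    }

    // Only digits and single (non-adjacent) underscores are allowed.
    let mut prev_underscore = false;
    for &b in bytes {
        match b {
            b'_' if prev_underscore => return false,
            b'_' => prev_underscore = true,
            b'0'..=b'9' => prev_underscore = false,
            _ => return false,
        }
    }

    // A leading zero is only allowed when every digit is zero (`0`, `0_0`).
    *first != b'0' || bytes.iter().all(|&b| b == b'0' || b == b'_')
}
```

Under these assumptions `"1__2"` is rejected even though `Decimal("1__2")` parses, which is exactly the gap a fix has to bridge.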

I'm starting to think we should just mark the fix as unsafe if the value we're replacing the original with is different. The edge cases here seem pretty tricky.

Thanks for catching that; I agree that marking it unsafe is probably the way to go to not break too much.

Wouldn't stripping leading underscores and replacing duplicates with a single underscore be enough?

In my opinion keeping Decimal("10_000_000") -> Decimal(10_000_000) transition as safe would be very convenient.

It would be enough to strip leading underscores, strip trailing underscores, strip leading zeros (unless all the digits are zeros), and collapse medial underscore sequences: Decimal(" _-_0_01__2_ ") would become Decimal(-1_2).
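The four steps listed above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code that was merged: `normalize_decimal_int_string` is a hypothetical name, only plain integer strings are handled, and the sign of zero is not treated specially.

```rust
/// Hypothetical helper (not ruff's implementation): turns the string passed to
/// `Decimal(...)` into text that is a valid Python integer literal, keeping
/// the user's underscore placement where possible.
fn normalize_decimal_int_string(input: &str) -> Option<String> {
    // `Decimal` ignores surrounding whitespace and every underscore, so
    // underscores may even precede the sign.
    let trimmed = input.trim().trim_start_matches('_');

    // Split off an optional sign and keep it as written.
    let (sign, rest) = if let Some(rest) = trimmed.strip_prefix('-') {
        ("-", rest)
    } else if let Some(rest) = trimmed.strip_prefix('+') {
        ("+", rest)
    } else {
        ("", trimmed)
    };

    // Assumption: anything other than digits and underscores is out of scope.
    let only_digits_and_underscores = rest.chars().all(|c| c.is_ascii_digit() || c == '_');
    let has_digit = rest.chars().any(|c| c.is_ascii_digit());
    if !only_digits_and_underscores || !has_digit {
        return None;
    }

    // Strip leading zeros and underscores, then trailing underscores.
    let body = rest
        .trim_start_matches(|c: char| c == '0' || c == '_')
        .trim_end_matches('_');

    // Every digit was a zero: keep a single zero.
    if body.is_empty() {
        return Some(format!("{sign}0"));
    }

    // Collapse each run of underscores into one.
    let mut out = String::from(sign);
    let mut prev_underscore = false;
    for c in body.chars() {
        if c == '_' {
            if !prev_underscore {
                out.push('_');
            }
            prev_underscore = true;
        } else {
            out.push(c);
            prev_underscore = false;
        }
    }
    Some(out)
}
```

With this sketch, `" _-_0_01__2_ "` yields `-1_2`, matching the example in the comment, while `"10_000_000"` passes through unchanged.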

That's the behavior I fight for ❤️ 🚀 :)

That sounds reasonable to me, I gave up a bit too soon :) @danparizher do you want to give this a shot? We can still revert to marking the fix unsafe if the code gets too complicated, but this sounds feasible with the transformations we're already doing.

amyreese

lgtm

dscorbett · 2025-10-02T01:34:27Z

The current state of this PR is that underscores are normalized to be thousands separators, but cf. @amyreese’s earlier comment. If a user writes Decimal("1__23_45_000"), that formatting was probably intentional, so it should become Decimal(1_23_45_000). I think normalizing the underscores beyond the minimum needed to avoid a syntax error should be left for a different rule, as in #18221.

MichaReiser · 2025-10-21T07:27:42Z

What's the status of this PR? Is there something left that needs addressing or is it good to go?

ntBre · 2025-10-21T13:06:38Z

I don't think we should normalize the digit separators to thousands separators, as @dscorbett pointed out.

Update normalize_digit_separators to retain the original pattern of digit separators instead of forcing thousands separators. This change ensures that only syntax errors are fixed while the user's formatting is preserved.

amyreese · 2025-10-24T19:56:09Z

crates/ruff_linter/src/rules/refurb/rules/verbose_decimal_constructor.rs

+    // Extract only digits from the trimmed string
+    let digits: String = trimmed.chars().filter(char::is_ascii_digit).collect();
+
+    // If no digits found, return 0
+    if digits.is_empty() {
+        return format!("{unary}0");
+    }
+
+    let mut result = String::new();
+    let chars: Vec<char> = trimmed.chars().collect();
+    let mut i = 0;
+
+    while i < chars.len() {
+        if chars[i].is_ascii_digit() {
+            result.push(chars[i]);
+        } else if chars[i] == '_' {
+            // Only add underscore if the previous character was a digit
+            // and we haven't already added an underscore
+            if !result.is_empty() && !result.ends_with('_') {
+                result.push('_');
+            }
+        }
+        i += 1;
+    }
+
+    // Remove trailing underscores
+    result = result.trim_end_matches('_').to_string();


I think this ends up being overly verbose and defensive, and lines 225-251 could be replaced with just:

let result = trimmed
    .chars()
    .dedup_by(|a, b| {*a == '_' && a == b})
    .collect::<String>();

Leading/trailing underscores were already trimmed by line 218, so there shouldn't be a need to both check that there are some number of digits and the result after trimming/deduplication is not empty.
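The iterator-level `dedup_by` used in the suggestion comes from the `itertools` crate rather than the standard library. For comparison, here is a minimal std-only sketch of the same collapsing step using `Vec::dedup_by`. `collapse_underscores` is a hypothetical name, and the sketch assumes leading and trailing underscores have already been trimmed by the caller:

```rust
/// Hypothetical helper: collapses each run of consecutive underscores into a
/// single underscore and leaves every other character untouched.
fn collapse_underscores(trimmed: &str) -> String {
    let mut chars: Vec<char> = trimmed.chars().collect();
    // Drop an element only when it and its predecessor are both underscores,
    // so repeated digits such as "11" are preserved.
    chars.dedup_by(|a, b| *a == '_' && *b == '_');
    chars.into_iter().collect()
}
```

The predicate checks both elements for `'_'` explicitly so that ordinary repeated digits are never merged.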

amyreese · 2025-10-24T19:57:42Z

crates/ruff_linter/resources/test/fixtures/refurb/FURB157.py

+Decimal("15_000_000")  # Safe fix: normalizes separators, becomes Decimal(15_000_000)
+Decimal("1_234_567")   # Safe fix: normalizes separators, becomes Decimal(1_234_567)
+Decimal("-5_000")      # Safe fix: normalizes separators, becomes Decimal(-5_000)
+Decimal("+9_999")      # Safe fix: normalizes separators, becomes Decimal(+9_999)


Would be nice to include test cases showing that non-thousands separators are maintained accordingly, eg "12_34_56_78" and "1234_5678".

ntBre · 2025-10-24T20:45:18Z

crates/ruff_linter/src/rules/refurb/rules/verbose_decimal_constructor.rs

+            if has_digit_separators {
+                diagnostic.set_fix(Fix::unsafe_edit(Edit::range_replacement(
+                    replacement,
+                    value.range(),
+                )));
+            } else {
+                diagnostic.set_fix(Fix::safe_edit(Edit::range_replacement(
+                    replacement,
+                    value.range(),
+                )));
+            }


We should use Fix::applicable_edit here.

Or wait, if we're going to preserve the original formatting, should it just be safe again?
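The first suggestion amounts to computing the applicability once and building the fix in a single call, instead of duplicating the edit in both branches. The sketch below shows that shape with simplified stand-in types: `Applicability`, `Fix`, and `build_fix` here are mock definitions written for illustration, not the real `ruff_diagnostics` types, and the replacement is reduced to a plain `String`.

```rust
/// Simplified stand-in for an applicability level (mock, not ruff's type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Applicability {
    Safe,
    Unsafe,
}

/// Simplified stand-in for a fix: just the replacement text and its safety.
#[derive(Debug, PartialEq, Eq)]
struct Fix {
    replacement: String,
    applicability: Applicability,
}

impl Fix {
    /// Mock constructor mirroring the idea of passing applicability as data.
    fn applicable_edit(replacement: String, applicability: Applicability) -> Self {
        Fix { replacement, applicability }
    }
}

/// Hypothetical helper: decide the applicability first, then build one fix.
fn build_fix(replacement: String, has_digit_separators: bool) -> Fix {
    let applicability = if has_digit_separators {
        Applicability::Unsafe
    } else {
        Applicability::Safe
    };
    Fix::applicable_edit(replacement, applicability)
}
```

If the original separator pattern is preserved, as the follow-up question suggests, the branch disappears entirely and the fix can always be built as safe.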

ntBre · 2025-10-24T20:48:23Z

crates/ruff_linter/src/rules/refurb/rules/verbose_decimal_constructor.rs

+/// - Stripping leading and trailing underscores
+/// - Collapsing medial underscore sequences to single underscores
+/// - Do not force thousands separators
+fn normalize_digit_separators(original_str: &str, unary: &str, _numeric_part: &str) -> String {


We should either remove _numeric_part if it's unused or rename it to numeric_part, if it's used.

ntBre · 2025-10-24T20:51:17Z

crates/ruff_linter/src/rules/refurb/rules/verbose_decimal_constructor.rs

+    // If no digits found, return 0
+    if digits.is_empty() {
+        return format!("{unary}0");
+    }


These is_empty checks don't seem right to me at all. I don't think we should return an arbitrary value for invalid input, and I think we've already validated the input in verbose_decimal_constructor, but I could also be missing something.

ntBre · 2025-10-24T20:52:02Z

crates/ruff_linter/src/rules/refurb/rules/verbose_decimal_constructor.rs

+    let without_unary = if original_str.starts_with('+') || original_str.starts_with('-') {
+        &original_str[1..]
+    } else {
+        original_str
+    };


We already stripped the prefix once on line 99. Can we reuse more of the work from the parent function?

Updates the verbose_decimal_constructor rule to always mark fixes for digit separators in Decimal constructors as safe, removing the previous unsafe applicability. Adds test cases for non-thousands separators and updates related snapshots.

ntBre

I pushed a few commits simplifying things slightly to reuse the earlier unary prefix removal, but this looks good to me.

ntBre · 2025-10-28T21:35:18Z

After a bit more thought, I came up with some additional edge cases with separators and leading zeros. Hopefully everything is covered now.

* origin/main: (21 commits)
  [ty] Update "constraint implication" relation to work on constraints between two typevars (#21068)
  [`flake8-type-checking`] Fix `TC003` false positive with `future-annotations` (#21125)
  [ty] Fix lookup of `__new__` on instances (#21147)
  Fix syntax error false positive on nested alternative patterns (#21104)
  [`pyupgrade`] Fix false positive for `TypeVar` with default on Python <3.13 (`UP046`,`UP047`) (#21045)
  [ty] Reachability and narrowing for enum methods (#21130)
  [ty] Use `range` instead of custom `IntIterable` (#21138)
  [`ruff`] Add support for additional eager conversion patterns (`RUF065`) (#20657)
  [`ruff-ecosystem`] Fix CLI crash on Python 3.14 (#21092)
  [ty] Infer type of `self` for decorated methods and properties (#21123)
  [`flake8-bandit`] Fix correct example for `S308` (#21128)
  [ty] Dont provide goto definition for definitions which are not reexported in builtins (#21127)
  [`airflow`] warning `airflow....DAG.create_dagrun` has been removed (`AIR301`) (#21093)
  [ty] follow the breaking API changes made in salsa-rs/salsa#1015 (#21117)
  [ty] Rename `Type::into_nominal_instance` (#21124)
  [ty] Filter out "unimported" from the current module
  [ty] Add evaluation test for auto-import including symbols in current module
  [ty] Refactor `ty_ide` completion tests
  [ty] Render `import <...>` in completions when "label details" isn't supported
  [`refurb`] Preserve digit separators in `Decimal` constructor (`FURB157`) (#20588)
  ...

fix-20572

050931a

amyreese approved these changes Sep 30, 2025

amyreese requested a review from ntBre September 30, 2025 19:52

amyreese added the `rule` (Implementing or modifying a lint rule) and `fixes` (Related to suggested fixes for violations) labels Sep 30, 2025

ntBre requested changes Sep 30, 2025

crates/ruff_linter/src/rules/refurb/rules/verbose_decimal_constructor.rs (outdated)

apply feedback

b0e9958

danparizher requested a review from ntBre September 30, 2025 23:50

amyreese reviewed Oct 1, 2025

danparizher added 4 commits September 30, 2025 22:04

revert changes; mark fix unsafe

0d2612f

remove newline; fix test language

cec8391

Preserve digit separators in Decimal constructor fixes

62a2c7e

Update test cases to safe fixes

356ceac

amyreese approved these changes Oct 1, 2025

Preserve original digit separator formatting in normalization

11b1e4e

amyreese reviewed Oct 24, 2025

ntBre reviewed Oct 24, 2025

Mark Decimal digit separator fixes as safe

c36376b

danparizher requested a review from ntBre October 24, 2025 22:22

ntBre added 4 commits October 28, 2025 16:14

reuse earlier prefix removal

92d5f46

back to safe fix

fc3c55f

remove leftover comment

96e3c02

share unary formatting

7e21fb3

ntBre approved these changes Oct 28, 2025

fix issues with separators and zeros

e814694

ntBre merged commit 3490611 into astral-sh:main Oct 28, 2025
37 checks passed

danparizher deleted the fix-20572 branch October 28, 2025 21:55

BrewTestBot mentioned this pull request Oct 31, 2025

ruff 0.14.3 Homebrew/homebrew-core#252101

Merged
