Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Appending an optional string / array causes the entire expression to be reformatted #228

Open
CyberShadow opened this issue Jul 27, 2024 · 18 comments

Comments

@CyberShadow
Copy link

Description

I don't really mind either way, but filing this in case someone might find the following useful.

I think this diff could have been less noisy:
NixOS/nixpkgs@eac5951

Small example input

Consider this initial expression:

{
  exampleString = ''
    hello
    beautiful
  '';
  exampleArray = [
    "hello"
    "beautiful"
  ];
}

Now, we want to add some optional things:

{
  exampleString = ''
    hello
    beautiful
  '' + lib.optionalString true ''
    wonderful
    world
  '';
  exampleArray = [
    "hello"
    "beautiful"
  ] ++ lib.optionals true [
    "wonderful"
    "world"
  ];
}

Expected output

No change - I think the above is close to the existing style.

Actual output

nixfmt-rfc-style reformats the entire thing, creating a big diff:

{
  exampleString =
    ''
      hello
      beautiful
    ''
    + lib.optionalString true ''
      wonderful
      world
    '';
  exampleArray =
    [
      "hello"
      "beautiful"
    ]
    ++ lib.optionals true [
      "wonderful"
      "world"
    ];
}
@MattSturgeon
Copy link

Is this an issue with the RFC or with the implementation?

It seems this thread is discussing a similar issue too: https://discourse.nixos.org/t/satisfaction-survey-from-the-new-rfc-166-formatting/49758

@nixos-discourse
Copy link

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/nix-formatting-team-treewide-nixpkgs-formatting/56666/2

@Atemu
Copy link
Member

Atemu commented Dec 1, 2024

  exampleArray = [
    "hello"
    "beautiful"
  ] ++ lib.optionals true [
    "wonderful"
    "world"
  ];

This is not legal under RFC166 because the closing bracket before ++ is not indented but not on the last line. The compact style shown here may only be used if it only causes the last line to not be indented, all other lines must be indented:

Operator chains in bindings may be compacted as long as all lines between the first and last one are indented.

https://github.com/NixOS/rfcs/blob/master/rfcs/0166-nix-formatting.md#chainable-operators

If you're used to the RFC style, this property is extremely important to parsing operator chains (aswell as bindings in general) quickly mentally and encountering a non-indented line that is not the last line of the binding is confusing.

I don't see a good way to solve this but, as mentioned in https://discourse.nixos.org/t/nix-formatting-team-treewide-nixpkgs-formatting/56666/3, you could pre-emptively force the non-compact form using a comment or a no-op expression if you anticipate that an operator will be used on your expression in the future.

@emilazy
Copy link
Member

emilazy commented Jan 4, 2025

The current format is really bad for reviewing diffs, for git blame, and for automated backports. The former can be solved by hiding whitespace changes but the latter two are much more difficult to deal with and very painful.

If the operator chain indentation rule is considered vital, then I would prefer formatting every multi‐line list like this:

  buildInputs =
    [
      foo
      bar
      baz
    ];

which, while certainly chunkier, is consistent with (and therefore no chunkier than) what you get when you start adding lib.optionals, and avoids all the problems that come with the diff noise.

Of course you’d either have to do the same for strings or establish a convention of using ${…} over + for conditional parts of build phases. At least there we have options; list syntax has no splicing, so I really think picking something that doesn’t cause huge churn here is vital.

See NixOS/nixpkgs#370526 (comment) for an example of where the current style demands a contributor reformat a huge portion of a critical file when making a change, breaking git blame forever. (I don’t think “people should commit unformatted code, format it in a separate commit, and then add it to .git-blame-ignore-revs, forever” is a viable solution.)

@sersorrel
Copy link

I'm likewise running into this in NixOS/nixpkgs#371214 (but this time I'm removing a trailing lib.optional, not adding one).

@infinisil
Copy link
Member

Discussed this in todays meeting

{
  currentStyle =
    [
      "hello"
      "beautiful"
    ]
    ++ lib.optionals true [
      (x ++ [ bash ])
      "wonderful"
      "world"
    ];

  proposal1 = [
    "hello"
    "beautiful"
  ] ++ lib.optionals true [
    (x ++ [ bash ])
    "wonderful"
    "world"
  ];

  proposal2 = [
    "hello"
    "beautiful"
  ]
  ++ lib.optionals true [
    (x ++ [ bash ])
    "wonderful"
    "world"
  ];
}
  • Both proposals fix the problem of causing large diffs on minor changes, but also weakly violate the standard by not basing the indentation on the code structure
  • proposal1 is technically more invalid according to the current formatting standard (since operator chains are supposed to have the operators at the start of the line), but it does have the advantage of one less line
  • proposal2 has the benefit of the operator being in the first indented column, therefore being valid in the current standard
  • Another proposal is where ] ++ gets left expanded/contracted based on the input, but that could lead to more inconsistencies (and we would still need to decide on 1 or 2 for strict mode).
  • Conclusions:
    • Resolving the issue is worth updating the standard for
    • proposal1 is the preferred solution (for the people in this meeting)
    • We don't need an RFC to update the standard in this way since the change is not controversial
    • We want to also get input from one other team member before having this (ping @piegamesde @Sereja313 @0x4A6F)
    • Once we have approval, we appreciate somebody going ahead with a PR

@emilazy
Copy link
Member

emilazy commented Jan 7, 2025

FWIW currentStyle is fine as long as multi‐line definitions without the operator chain are formatted the same way. This seems okay‐ish for lists but probably a lot more controversial for strings and attrsets.

Thanks for working on this issue. Between proposal1 and proposal2 I actually quite prefer proposal2, because a common awkward thing is

] ++ lib.optionals a [ b ]
++ lib.optional c d
++ lib.optionals e [
  f
  g
] ++ lib.optional h i

where the lines without the ]s look really awkward next to the ones with.

(Needless to say all of the considerations about this issue apply equally to '' … '' / + and { … } / // which are also ubiquitous throughout Nixpkgs.)

@Atemu
Copy link
Member

Atemu commented Jan 7, 2025

I don't find that effect to be weak at all. As mentioned previously, the fact that nothing is on the first level of indent until the expression is "closed" is an extremely valuable property of the RFC style IMHO.

I'd rather see it explored to limit or even disable absorption for multi-line constructs like @emilazy mentioned.

This

{
  exampleArray = [
    "hello"
    "beautiful"
  ];
}

would become this:

{
  exampleArray =
    [
      "hello"
      "beautiful"
    ];
}

which I don't like but it's much preferable to this:

{
  proposal2 = [
    "hello"
    "beautiful"
  ] # Nuh-oh, expression isn't actually over yet!
  ++ lib.optionals true [
    (x ++ [ bash ])
    "wonderful"
    "world"
  ]
  ++ lib.optionals true [
    (x ++ [ bash ])
    "wonderful"
    "world"
  ]
  ++ lib.optionals true [
    (x ++ [ bash ])
    "wonderful"
    "world"
  ]
  ... # more such expressions. It's no longer possible tell when the expression is "closed" from indent alone.
}

Btw., I did not know there was a high-bandwidth synchronous discussion for this. It'd be great if such meetings that meaningfully discuss an issue could be announced before-hand in those specific issues.

@piegamesde
Copy link
Member

The formatting team meetings are regular (not sure how public the schedule is, but in theory everyone can join), the agenda however is "whatever happens to be the topic" and is not really planned in advance. I recommend joining #nix-formatting:nixos.org if you are interested

@nixos-discourse
Copy link

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/formatting-team-meeting-2024-12-10-and-2025-01-07/58466/1

@infinisil
Copy link
Member

Btw., I did not know there was a high-bandwidth synchronous discussion for this. It'd be great if such meetings that meaningfully discuss an issue could be announced before-hand in those specific issues.

I don't like how hard it is to discover, but our team page currently mentions:

Meeting calendar: Search for "Nix formatting"

And I always announce the meetings in the Matrix room, they're open to join for anybody :)

@Atemu
Copy link
Member

Atemu commented Jan 7, 2025

I'm aware there are formatting team meetings and those who are part of the formatting team or just generally interested in the development will regularly be in such meetings of course but I, as someone not in the formatting team but interested in just this particular issue, wouldn't normally be.

This perhaps doesn't need to happen for every issue but it would be good for at least high-impact issues such as this one where getting a broader opinion spectrum can be critical.

@infinisil
Copy link
Member

Since the agenda is flexible, you can influence it by just joining the meeting and letting us know the issues you think are important so we can discuss them right then :D

@Atemu
Copy link
Member

Atemu commented Jan 7, 2025

I think you're missing the point that most people interested in this particular issue won't be in your regular meetings, no matter how open they are.

@emilazy
Copy link
Member

emilazy commented Jan 7, 2025

I don't find that effect to be weak at all. As mentioned previously, the fact that nothing is on the first level of indent until the expression is "closed" is an extremely valuable property of the RFC style IMHO.

I agree that for lists this is a bullet you can bite, but I don’t know if you can stomach

  prePatch =
    ''
      patchShebangs .
      rm generated.c
    '';

I doubt the Nixpkgs contributor base can. (You can use ${} for strings, but it gets ugly too.)

Personally I think the “run‐on” nature of the expressions is the easiest thing to stomach given all the trade‐offs. To me it’s kinda like

let x = if cond {
  foo
} else {
  bar
};

in Rust, which is the idiomatic formatting there (and of most expression‐oriented bracey languages, I think, few in number though those are.)

@Atemu
Copy link
Member

Atemu commented Jan 8, 2025

I agree that for lists this is a bullet you can bite, but I don’t know if you can stomach

  prePatch =
    ''
      patchShebangs .
      rm generated.c
    '';

I didn't realise this before but I actually quite like how the opening and closing quotes are on the same horizontal position here?

This is also consistent with (and thereby reduces diff noise for!) expressions that go in front such as let, with, assert or even function args. If you were to add a let here, no line would need to change:

  prePatch =
    let
      foo = "bar";
    in
    ''
      patchShebangs .
      rm generated.c
    '';

It's a +3 diff rather than -1+5.

I'm liking this more and more actually.

I think the hardest to stomach would be attrsets but even there I think I could get used to it and the opening brace being on the same horizontal position as the closing one could be a very nice property for easing mental parsing.

Personally I think the “run‐on” nature of the expressions is the easiest thing to stomach given all the trade‐offs. To me it’s kinda like

let x = if cond {
  foo
} else {
  bar
};

in Rust, which is the idiomatic formatting there (and of most expression‐oriented bracey languages, I think, few in number though those are.)

Honestly now that I'm used to our RFC-style formatting for such a ternary, I prefer it over this. An absorbed if cond { looks weird to me nowadays.

@emilazy
Copy link
Member

emilazy commented Jan 8, 2025

I admire your commitment to principle, and I would personally have no objection to formatting Nixpkgs this way. But I do think that both proposal1 and proposal2 would thoroughly beat it if you surveyed Nixpkgs contributors. Of course we have settled on a largely technocratic over democratic support to formatting without too much deference paid to the existing familiar style standards, so if the formatting team is willing to go with the radical option of keeping the current style but always indenting the entire expression regardless of operators then perhaps it’ll happen! Arguably the closing ] violates the indentation rule for simple lists currently; it’s a child of buildInputs = … but indented on the same level as it. I’d at least be interested in seeing a sample Nixpkgs reformat with the rule.

@Atemu
Copy link
Member

Atemu commented Jan 8, 2025

I do think that both proposal1 and proposal2 would thoroughly beat it if you surveyed Nixpkgs contributors.

It would have beaten it if you asked just me a year ago too, even if you brought objective arguments.

If you're extremely used to something, it's natural to not want to let it go; even in the face of overwhelming evidence against the thing. This happens everywhere and one of the greatest issues facing society today if you ask me. For instance: I'm well aware of how bad and unethical eating meat is and yet I still do it because of how used I am to the taste of the meals I've come to like.
If you forced me to eat only vegan meals, I'd be apprehensive of course but I do think I'd come to accept it and at some point perhaps even prefer it. I think it's much the same case here: I was forced to accept the new RFC style and can now appreciate its objective benefits.
To be clear: I still yearn for the oldschool compact Nixpkgs style (just like I'd likely yearn for meaty meals) but I'm now painfully aware of its fundamental flaws and always come to the conclusion that it's not worth it.

That's why I think this decision should be made primarily objectively with little regard for the popular opinion; "technocratically" as you put it.

Arguably the closing ] violates the indentation rule for simple lists currently; it’s a child of buildInputs = … but indented on the same level as it.

Right but it contains the closing ;, so it's the last expression in the binding which makes it fine under the special exception for this precise case. As I see it, it was added to make the new RFC style look more like the previous convention without really breaking that much for the new style. Absorption/compact multi-line declaration is generally in much the same place.

I’d at least be interested in seeing a sample Nixpkgs reformat with the rule.

Same. There could very well be flaws to this that aren't worth the trade-off but I really can't think of any other than that it's not what we're used to.

Perhaps I'll see piegames the next few days again and ask them to show me how to touch nixfmt in the right way to achieve this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Todo
Development

No branches or pull requests

8 participants