I've been considering how we represent indentation in Rustfmt. Turns out this is pretty complex and I'm not happy with the current approach. I've been doing some refactoring but it is turning into a massive task so I wanted to share my thoughts before I get too far down the rabbit hole.
Hand waving
There are two kinds of indentation from a user's perspective: block indent and visual indent. These can sometimes be switched by options, but I'll ignore options for now and just concentrate on a single environment.
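Block indent is where a line is indented by a set amount; the classic example is blocks.
Visual indent is where a line is aligned with some item on the previous line, for example in a function definition.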
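To make the two styles concrete, here is a small illustration. The function names are made up for this example, and the 4-space step is simply the default tab width. The body of the first function is block indented; the parameters of the second are visually indented, aligned with the first parameter after the opening parenthesis.

```rust
// Block indent: the body sits one fixed step (4 spaces) in from the item that contains it.
fn block_indented() -> u32 {
    let x = 1;
    x + 1
}

// Visual indent: the continuation line aligns with `first_argument` on the line above,
// so its indent depends on the length of the function name, not on a fixed step.
fn visually_indented(first_argument: u32,
                     second_argument: u32) -> u32 {
    first_argument + second_argument
}
```

Renaming `visually_indented` would change the indent of its second line, which is the property that makes visual indent harder to track than block indent.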
Due to options we also want to cope with hard vs soft tabs. I.e., using a single tab character vs using n (4 by default) space characters.
Mostly orthogonally, indentation can be due to the position in the block hierarchy vs the expression hierarchy. I believe the block hierarchy is always block indented, but the expression hierarchy can be either block or visually indented (or some combination).
E.g., block hierarchy:
impl ... {
    fn foo(...) {
        ...
        {
            ...
        }
    }
}
E.g., block-indented expression hierarchy:
Foo {
    a: sdfsdfs,
    b: Bar {
        c: dsfsdaf,
    },
}
E.g., visual-indented expression hierarchy:
foo_fn(arg1,
       bar_fn(arg2,
              arg3));
Now, stepping back from concrete Rust into some more hypothetical layout. With every sub-item we basically start a new layout context, which I think of as a box (aligned either visually or by block). The box has a location and a width, but its height depends on the contents. However, we sometimes want to escape that context and create a new one, e.g., where we run out of space.
If we can't lay out inside that context (due to lack of space or for aesthetics), we want to try again with a new one, for example going from visual to block indenting in order to expand the available space.
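A minimal sketch of that "try one context, then escape to a wider one" idea, for a function call only. Everything here is an assumption for illustration: `layout_call`, `TAB_SPACES`, and the order of attempts are hypothetical, not how Rustfmt actually does it, and widths are measured in bytes (ASCII only).

```rust
/// Hypothetical block indent step, in spaces.
const TAB_SPACES: usize = 4;

/// Lay out `name(args...)` starting at column `indent`, within `max_width` columns.
/// Tries a single line, then visual indent, then block indent.
/// Returns `None` if none of the three layouts fits.
fn layout_call(name: &str, args: &[&str], indent: usize, max_width: usize) -> Option<String> {
    // 1. Everything on one line.
    let one_line = format!("{}({})", name, args.join(", "));
    if indent + one_line.len() <= max_width {
        return Some(one_line);
    }

    // 2. Visual indent: each argument starts just after the opening parenthesis.
    //    The `+ 1` accounts for the trailing `,` or `)` on each line.
    let visual_col = indent + name.len() + 1;
    if args.iter().all(|a| visual_col + a.len() + 1 <= max_width) {
        let sep = format!(",\n{}", " ".repeat(visual_col));
        return Some(format!("{}({})", name, args.join(sep.as_str())));
    }

    // 3. Block indent: escape the visual box and indent by a fixed step instead.
    let block_col = indent + TAB_SPACES;
    let head_fits = indent + name.len() + 1 <= max_width;
    if head_fits && args.iter().all(|a| block_col + a.len() + 1 <= max_width) {
        let pad = " ".repeat(block_col);
        let body: String = args.iter().map(|a| format!("{}{},\n", pad, a)).collect();
        return Some(format!("{}(\n{}{})", name, body, " ".repeat(indent)));
    }

    None
}
```

For example, `layout_call("a_long_function_name", &["argument1", "argument2"], 0, 24)` cannot use visual indent (the arguments would start at column 21), so it falls back to block indent, where they start at column 4.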
Sometimes it isn't that simple though, e.g.,
let aaaaaaa = +-----------------------------------------------------------+
    +---------+                                                           |
    :                                                                     :
    +---------------------------------------------------------------------+;
And of course if you look at the rhs, we could really go wider if we accept some complexity there.
Current implementation
We have an Indent struct which has fields for block_indent and alignment. I believe the motivation is to abstract over hard vs soft tabs, but the actual use is a bit sloppy. We track both in spaces: block_indent is meant to be a whole number of tabs times the spaces per tab, and alignment is the number of spaces used on top of the tabs to align us properly. However, how the two fields are used for block vs visual indent is a bit ad hoc. Furthermore, I'm not really sure how they should be used, especially around visual indentation.
Question: should we use hard tabs at all for visual indenting?
Solutions:
treat hard tabs as just substitutes for spaces, so if a line should start at column n, then we use n / tab_width tabs + n % tab_width spaces.
only use hard tabs for blocks and always use spaces for visual indentation. This makes more logical sense to me, but is more complex and I'm not sure it is what hard tab people actually want.
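The two solutions can be sketched as two ways of turning an indent into leading whitespace. The function names are hypothetical, `tab_width` is assumed to be non-zero, and the second function assumes block_indent and alignment are tracked separately, both in spaces, as described above.

```rust
/// Solution 1: tabs are plain substitutes for spaces.
/// A line starting at `column` gets `column / tab_width` tabs, then the remainder in spaces.
fn indent_tabs_everywhere(column: usize, tab_width: usize) -> String {
    format!(
        "{}{}",
        "\t".repeat(column / tab_width),
        " ".repeat(column % tab_width)
    )
}

/// Solution 2: tabs only for the block part, spaces for all visual alignment.
/// `block_indent` is expected to be a multiple of `tab_width`; any remainder is
/// emitted as spaces so the line still starts at the right column.
fn indent_tabs_for_blocks_only(block_indent: usize, alignment: usize, tab_width: usize) -> String {
    format!(
        "{}{}",
        "\t".repeat(block_indent / tab_width),
        " ".repeat(block_indent % tab_width + alignment)
    )
}
```

With a tab width of 4, a line at column 11 made up of one block level plus 7 columns of alignment comes out as two tabs and three spaces under the first solution, and as one tab and seven spaces under the second. Both reach the same column at that tab width, but only the second keeps its alignment if the reader's editor displays tabs at a different width.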
If we go for the second solution, then a follow-up question is whether our tracking of block vs visual indenting should use the same or different mechanisms.
We actually keep two Indents around - the current indent which is passed down through rewrite functions and block_indent which is kept in the context and should just be the current block indent.
We occasionally pass around another ad hoc offset usize too, and frequently have many local variables which represent different kinds of indents.
For widths, we just pass around a usize.
A complication
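If there is a block indenting context nested inside a visually indented context, then the block indent may not be tab-aligned and there are effectively two block indents - the indent due to the block hierarchy and the indent due to the expression hierarchy.
One solution is to avoid doing this and instead indent to the nearest tab. However, we would still have two block indents to track, e.g., one at 4 spaces and one at 16.
To clarify, this is complicated because it makes hard tabbing to block indent require spaces (maybe) and because we must track more indents (I think).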
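One way to keep such a nested block tab-aligned is to snap the visually derived column to a tab stop. A small sketch, assuming we round up (rounding down would be an equally valid reading of "nearest tab"); the function name is hypothetical and `tab_width` is assumed to be non-zero.

```rust
/// Round `column` up to the next multiple of `tab_width`.
/// Columns that are already on a tab stop are returned unchanged.
fn round_up_to_tab_stop(column: usize, tab_width: usize) -> usize {
    (column + tab_width - 1) / tab_width * tab_width
}
```

For instance, a block that would start at visual column 13 moves to column 16 with a tab width of 4. This fixes the alignment, but the indent from the block hierarchy and the one from the expression hierarchy are still two separate values that both have to be tracked.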
Discussion
One possible invariant (which is currently not true): formatting an item never escapes its constraints. If it wants to, then it must 'error' out and request looser constraints from its parent item.
One possible refactoring - group all constraints together (i.e., the various indents and widths) into a single struct. I'm not really sure why we have one indent in the context and one as an argument.
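A rough sketch of how the invariant and the refactoring could fit together. All names here (`Constraints`, `NeedsMoreWidth`, `rewrite_atom`) are hypothetical and not part of Rustfmt; the point is only that indents and width travel together in one value, and that an item which does not fit reports back to its parent instead of overflowing.

```rust
/// All layout constraints for one item, grouped in a single value (all in spaces).
#[derive(Debug, Clone, Copy, PartialEq)]
struct Constraints {
    /// Indent due to the block hierarchy.
    block_indent: usize,
    /// Extra spaces on top of the block indent, used for visual alignment.
    alignment: usize,
    /// Columns available to the item from its start column.
    width: usize,
}

impl Constraints {
    /// Column at which the item starts.
    fn start_column(&self) -> usize {
        self.block_indent + self.alignment
    }
}

/// Returned when an item cannot be formatted within its constraints.
#[derive(Debug, PartialEq)]
struct NeedsMoreWidth {
    required: usize,
}

/// Format an unbreakable piece of text. It never escapes its constraints:
/// if the text does not fit, the caller is asked for a looser context.
fn rewrite_atom(text: &str, constraints: Constraints) -> Result<String, NeedsMoreWidth> {
    if text.len() <= constraints.width {
        Ok(text.to_string())
    } else {
        Err(NeedsMoreWidth { required: text.len() })
    }
}
```

A parent that receives `Err(NeedsMoreWidth { .. })` can then decide whether to retry with a different context (say, block instead of visual indent) or to pass the error further up.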
cc @marcusklaas @kamalmarhubi