Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mispositioned error-indicator at compile-time #38384

Closed
florianb opened this issue Dec 15, 2016 · 23 comments · Fixed by #44386
Closed

Mispositioned error-indicator at compile-time #38384

florianb opened this issue Dec 15, 2016 · 23 comments · Fixed by #44386
Labels
A-diagnostics Area: Messages for errors, warnings, and lints A-frontend Area: Compiler frontend (errors, parsing and HIR) C-enhancement Category: An issue proposing an enhancement or a PR with one.

Comments

@florianb
Copy link

The positional indicator of the moved value appears with the wrong indentation as you can see in the screenshot:

image

Apparently i don't know why this happens, i am using tabs (with 4 spaces width) to format my source. In the reality these lines look as follows:

	let arguments = application.get_matches();

	match arguments.subcommand() {
		("host", Some(sub_arguments)) => host::process_matches(sub_arguments),
		_ => {
			application.print_help();
		}
	}

Versions

rustc 1.13.0 (2c6933acc 2016-11-07)
cargo 0.13.0-nightly (eca9e15 2016-11-01)
ProductName:	Mac OS X
ProductVersion:	10.11.6
BuildVersion:	15G1108
@est31
Copy link
Member

est31 commented Dec 15, 2016

Hello fellow tabs user!

I have experienced this too, I was just too lazy to do a bug report. Thanks for filing it :)

@ghost
Copy link

ghost commented Dec 16, 2016

Before rendering compiler messages to-be-printed are analyzed and some spaces are replaced with tabs in order to respect identation in the code. All that is done heuristically on a best-effort basis.

Now we even multiline spans in nightly, which makes this problem even more difficult to solve. That means simply replacing a few spaces with tabs is not sufficient anymore.

My opinion is that it would be best to simply render all tabs from the source code as 4 spaces in stderr. Anything else will be even more buggy.

@jonathandturner Thoughts? If you agree, I'll make a PR.

@sophiajt
Copy link
Contributor

sophiajt commented Jan 9, 2017

Tabs should be handled by putting a tab in the matching output. Seems that is not the right answer for all situations? I'm confused as to why this isn't working, as the original example seems like it should work okay.

@sophiajt
Copy link
Contributor

sophiajt commented Jan 9, 2017

Do you have a smaller repro we could use to fix the issue?

@sophiajt
Copy link
Contributor

sophiajt commented Jan 9, 2017

(btw, sorry for the delay in response, I've been on an extended vacation)

@ghost
Copy link

ghost commented Jan 9, 2017

Tabs should be handled by putting a tab in the matching output. Seems that is not the right answer for all situations?

No. Take a look at this example:

warning: method is never used: `new`, #[warn(dead_code)] warning markers
  --> src/mod.rs:34:5
   |
34 |       fn new() -> Self {
   |  _____^ starting here...
35 | |         Item {
36 | |             value: 0
37 | |         }
38 | |     }
   | |_____^ ...ending here

We have these lines connecting "starting here" and "ending here". How many underscores should those lines have? Depends on how many spaces your terminal/text editor/IDE/browser uses to display tabs.

Ultimately, it's impossible to render error messages together with tabs properly accross all environments.

Another option would be to convert tabs into 4 spaces before rendering. Then error/warning markers will not be misaligned with code - that's the major advantage of converting tabs to spaces.

@Mark-Simulacrum Mark-Simulacrum added A-diagnostics Area: Messages for errors, warnings, and lints A-frontend Area: Compiler frontend (errors, parsing and HIR) labels May 19, 2017
@Mark-Simulacrum Mark-Simulacrum added the C-enhancement Category: An issue proposing an enhancement or a PR with one. label Jul 26, 2017
@eira-fransham
Copy link

There is no work-around for this that I can see, even setting tab-width to 4 doesn't work since the LHS number output is variable-width.

@kevincox
Copy link
Contributor

kevincox commented Sep 3, 2017

As @stjepang said I think the best soltution is to render the tabs as spaces for the error output. It might be theorically possible to get the tab size from the terminal but that sounds like more effort this it is worth. I am rarely (never) copying the output of errors back into my source so it shouldn't be an issue.

A bouns feature would be user-adjustable tab size but starting with 2 or 4 is probably sufficent.

@kevincox
Copy link
Contributor

kevincox commented Sep 3, 2017

Also is there any way to get this prioritized? It seems like having unaligned error messages is unacceptable as the text relies on the range indicators to make any sense.

@florianb
Copy link
Author

florianb commented Sep 3, 2017

Why not preserving the whitespace-chars of the specific line as given in the source, just replacing everything else from the occurrence of the first non-white-space on?

@florianb
Copy link
Author

florianb commented Sep 3, 2017

It should be safe to "repeat" the given line, and replace the non-whitespace-content with the debug output. Imagine the following:

····let arguments = application.get_matches();

····match arguments.subcommand() {
········("host", Some(sub_arguments)) => host::process_matches(sub_arguments),
····↦_·=>↦{
············application.print_help();
········}
····}

To handle that situation in line 4 right, the following steps should work:

  1. take the line (····↦_·=>↦{) and replace non-tabs by spaces: ····↦····↦·
  2. replace/insert the debug message from given index on (in this example 11 for the curly bracket): ····↦_·····↦^ foo bar

@est31
Copy link
Member

est31 commented Sep 3, 2017

@florianb I think the only way to resolve this is to replace each tab with a fixed amount of spaces. Your suggestion doesn't work because of cases like:

fn main() {
    """;
}

Giving you errors like:

  |
3 |       """;
  |  _______^
4 | | }
  | |_^

It needs to draw the entire line with _ characters instead of whitespace.

@florianb
Copy link
Author

florianb commented Sep 3, 2017

@est31 - thanks i didn't think of that case.

@est31
Copy link
Member

est31 commented Sep 3, 2017

Studying clang/gcc

From a suggestion by @eddyb on IRC on what gcc/clang do, I've looked at their behaviour.
The tested statement for clang was \t\t\t\treturn ""; "\tfsdf", so 4 tabs followed by return "";, followed by " fsdf" (tab between " and f).
clang output:

error: expected ';' after expression
                                return ""; "    fsdf"
                                                     ^
                                                     ;
warning: expression result unused [-Wunused-value]
                                return ""; "    fsdf"
                                           ^~~~~~~~~~

For gcc, I've omitted the last character to produce an invalid string literal. It gave the following error:

warning: missing terminating " character
     return ""; " fsdf
                ^
error: missing terminating " character
     return ""; " fsdf
                ^~~~~~

Both treated the " before the fsdf as "column 16" so they gave each tab one "column point".

Now let's add a space to test whether they implement a "tab stop" algorithm, or a "tab == 4 whitespace" one.

Clang output when I add one space before the return:

                                 return ""; "   fsdf"
                                            ^~~~~~~~~

One space more between the line start and return, but one space less between " and fsdf.

Gcc output of the same (with the last " removed):

      return ""; " fsdf
                 ^~~~~~

So clang basically implements the full "tab == go to the next tab stop" algorithm, while gcc replaces each tab by one whitespace, and continues to print whitespaces next to tabs.

Used versions: gcc (Ubuntu 6.3.0-12ubuntu2) 6.3.0 20170406, clang version 4.0.0-1ubuntu1 (tags/RELEASE_400/rc1)

Plan for rustc

I suggest to take a similar path to the one clang is going, by keeping the column number as it is right now (the same that clang/gcc are using), and by either printing out each tab as 4 spaces or alternatively by implementing the full tab stop algorithm that apparently clang is doing.

I'm generally neutral as to whether implement tab == 4 spaces or tab == next tabstop, as both seem to fix the original bug to me. I guess once the general infrastructure stands, the algorithms should be equally easy to implement.

Further points

  1. The column info needs to stay to be compatible with external tools like text editors which in turn probably want to be compatible with clang/gcc. E.g. when I do kate file.rs:42:33 it jumps to column 33 of line 42, treating tabs as one "column point". Or gnome-builder displays errors inline and does something similar:
    grafik
    This snippet uses tabs for indentation.
  2. There is the case I've noted above which requires us to walk the same path with _ characters that we walk with tabs, if we preserve them as we do right now. The only way out of this would be to detect the terminal's tab width (e.g. by printing a tab and then detecting how many columns the terminal has advanced), but this obviously doesn't work when you redirect the output to a file and then open it with a text editor or similar.

@kevincox
Copy link
Contributor

kevincox commented Sep 3, 2017

@est31 That sounds good.

Yes, it's critical that the column number in the error messages count charaters. IIUC this issue is just that the printed messages don't lign up correctly because it prints some tabs in the source and expects to be able to line up some individual characters next to it.

As far as tab-stops vs simple expansion for the error message I don't think it matters too much.

@eira-fransham
Copy link

eira-fransham commented Sep 3, 2017

Simple replacement will break formatting for (bad-style) code. Obviously no-one should be using tabs after normal characters but some people will be and they will complain. Better just fix the problem now before it gets filed as an issue.

@florianb
Copy link
Author

florianb commented Sep 4, 2017

@Vurich: unfortunately the concept of elastic tabstops is emerging.

@kevincox
Copy link
Contributor

kevincox commented Sep 4, 2017

@Vurich while you are correct, it isn't relly possible to do that without knowing what that author has configured their tab size to. I guess it would be nice to do the work in tabstops with some value which could be adjusted but it seems like this case will never just work. We could consider making it configurable or looking for an .editorconfig. file but this seems like more complexity then is required.

All that being said I am clearly biased because I don't do this myself. It would be interesting to see if and how clang and GCC deal with this. I would assume they don't but I haven't tested.

@eira-fransham
Copy link

That's fair, I'm convinced. Still, 4-space tabstops will never break anything that simple replacement won't break even worse and it's a pretty safe default to assume.

@est31
Copy link
Member

est31 commented Sep 4, 2017

I think providing configuration options or following an editorconfig or similar are overengineering, but even if we want to add something like that, providing a support for 4-space tabstops is an important first step.

I'll check tomorrow whether I want to provide a PR to fix at least this issue with the plan I've proposed above. If anyone of you is very eager to do the implementation themselves, please speak up!

@est31
Copy link
Member

est31 commented Sep 7, 2017

With my PR #44386 I've chosen to go with gcc's strategy for now (tab mapped to one space always). Its the easiest to implement, and fixes the actual issue of mispositioning. Obviously its not beautiful, but if anyone wants to improve on that, please do so!

est31 added a commit to est31/rust that referenced this issue Sep 7, 2017
Fixes rust-lang#38384

Most of the Rust community uses 4 spaces for indentation,
but there are also tab users of Rust (including myself!).

This patch fixes a bug in error printing which mispositions
error indicators when near code with tabs.

The code attempted to fix the issue by replacing spaces
with tabs, but it sadly wasn't enough, as sometimes
you may not print spaces but _ or ^ instead.

This patch employs the reverse strategy: it replaces each
tab with a space, so that the number of _ and ^ and spaces
in error indicators below the code snippet line up
perfectly.

In a study [1] preceeding this patch, we could see that
this strategy is also chosen by gcc version 6.3.0.

Its not perfect, as the output is not beautiful, but its
the easiest to implement. If anyone wants to improve on
this, be my guest! This patch is meant as improvement of
the status quo, not as perfect end status. It fixes the
actual issue of mispositioning error indicators.

[1]: rust-lang#38384 (comment)
bors added a commit that referenced this issue Sep 13, 2017
Fix mispositioned error indicators

Fixes #38384

Most of the Rust community uses 4 spaces for indentation,
but there are also tab users of Rust (including myself!).

This patch fixes a bug in error printing which mispositions
error indicators when near code with tabs.

The code attempted to fix the issue by replacing spaces
with tabs, but it sadly wasn't enough, as sometimes
you may not print spaces but _ or ^ instead.

This patch employs the reverse strategy: it replaces each
tab with a space, so that the number of _ and ^ and spaces
in error indicators below the code snippet line up
perfectly.

In a study [1] preceeding this patch, we could see that
this strategy is also chosen by gcc version 6.3.0.

Its not perfect, as the output is not beautiful, but its
the easiest to implement. If anyone wants to improve on
this, be my guest! This patch is meant as improvement of
the status quo, not as perfect end status. It fixes the
actual issue of mispositioning error indicators.

[1]: #38384 (comment)
@est31
Copy link
Member

est31 commented Sep 16, 2017

I've filed #44618 as a follow-up to not forget about the clang option

@est31
Copy link
Member

est31 commented Jan 12, 2018

There's been a regression on this with help spans. See #47377 . The tests src/test/ui/codemap_tests/tab_2.rs, src/test/ui/codemap_tests/tab_3.rs and src/test/ui/codemap_tests/tab.rs are still checked in an passing though so the regression is only with those specific help spans.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-diagnostics Area: Messages for errors, warnings, and lints A-frontend Area: Compiler frontend (errors, parsing and HIR) C-enhancement Category: An issue proposing an enhancement or a PR with one.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants