Style/IndentationWidth fails with BOM set by Vim #2703

fabioxgn · 2016-01-22T18:49:49Z

If you use :set bomb in Vim to add a BOM to a UTF-8 encoded file, RuboCop starts failing:

test.rb:2:2: C: Style/IndentationWidth: Use 2 (not 1) spaces for indentation. (https://github.com/bbatsov/ruby-style-guide#spaces-indentation)
  def method
 ^

Apparently it is detecting the BOM character as an extra space on the first line.

I've attached a file that reproduces this issue:

test.rb.zip
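For readers without the attachment, the arithmetic behind the message can be sketched in plain Ruby. The file contents below are an assumption (the attached test.rb is not reproduced here), and comparing character offsets this way is only an approximation of what the cop does, not RuboCop's actual code.

```ruby
# Hypothetical file contents; NOT the attached test.rb.
BOM = "\uFEFF" # U+FEFF, three bytes in UTF-8, one character
source = "#{BOM}class Test\n  def method\n  end\nend\n"

lines = source.lines

# Character offsets from the start of each line.
class_col = lines[0].index('class') # the BOM occupies offset 0
def_col   = lines[1].index('def')

puts "class starts at offset #{class_col}"
puts "def starts at offset #{def_col}"
puts "apparent indentation: #{def_col - class_col}"
```

With the BOM counted as a character, the body appears to be indented by one space relative to the class keyword, which matches the "Use 2 (not 1) spaces" message above.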


jonas054 · 2016-01-28T18:24:58Z

Attaching the file was a nice idea. Thanks.

This is probably my fault, so I'll try to fix it.

jonas054 · 2016-01-28T20:00:55Z

@whitequark Is there any chance Parser::Source::Range#column could return 0 for a range that's on the first line of the buffer and is preceded only by a byte order mark?

alexdowad · 2016-01-28T20:04:00Z

> Is there any chance Parser::Source::Range#column could return 0 for a range that's on the first line of the buffer and is preceded only by a byte order mark?

Easiest way to do this would be to eat the BOM in Buffer#source=, after the call to reencode_string. #raw_source= should remain unchanged.
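A minimal sketch of what "eating the BOM" could look like, written as a standalone function rather than as a patch to Parser. The name strip_leading_bom is hypothetical, and this is not the body of Buffer#source=; it only shows the effect on character offsets.

```ruby
BOM = "\uFEFF"

# Hypothetical helper: drop a single leading byte order mark, if present.
# The input string is left unmodified, so a "raw" copy stays intact.
def strip_leading_bom(source)
  source.sub(/\A\uFEFF/, '')
end

with_bom = "#{BOM}class Test\nend\n"
stripped = strip_leading_bom(with_bom)

puts with_bom.index('class') # offset with the BOM still in place
puts stripped.index('class') # offset once the BOM is removed
```

The trade-off is that offsets into the stripped string no longer line up with offsets into the original file contents.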

whitequark · 2016-01-28T21:58:00Z

The function of BOM is fundamentally the same as the encoding comment. We don't mangle the encoding comment, I don't see a reason to mangle BOM either.

alexdowad · 2016-01-29T03:43:24Z

> I don't see a reason to mangle BOM either.

If you don't want to mangle it, that's fine. However, one important difference is that the BOM is invisible when viewed in a text editor. So when people call range.column, they don't expect the BOM to "count". Intuitively, #column is not a byte offset, but an offset of distance from the left margin when text is displayed.

whitequark · 2016-01-29T05:47:20Z

#column returns the number of characters since the beginning of the line. (Yes, in retrospect the name is misleading, but this did not occur to me when I was writing it.) Due to the presence of tabs, combining characters, and zero-width characters in Unicode, this already has nothing to do with the column at which text is displayed, and there is no point in special-casing BOM. Moreover, special-casing BOM will confuse those tools which do treat the column in an error message as the character number, e.g. Sublime Text.

What you want to display is the number of grapheme clusters since the beginning of the line. Use one of the multitude of gems that implement UCD lookups to calculate the number of grapheme clusters. This is not the job of Parser.
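The gap between character count and what is displayed can be seen with plain Ruby strings. This sketch uses String#grapheme_clusters, which is built into Ruby 2.5 and later (newer than this thread) in place of a UCD lookup gem; the sample lines are made up for illustration.

```ruby
# 'e' followed by U+0301 (combining acute accent) renders as one glyph.
accented = "e\u0301 = 1"

char_count     = accented.length                   # counts code points
grapheme_count = accented.grapheme_clusters.length # counts user-perceived characters

puts "characters: #{char_count}, grapheme clusters: #{grapheme_count}"

# A tab is one character, but an editor may draw it several columns wide.
tabbed = "\tfoo"
puts "offset of foo after a tab: #{tabbed.index('foo')}"
```

Here the accented line has one more character than it has grapheme clusters, so a character-based column and a displayed column already disagree without any BOM involved.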

alexdowad · 2016-01-29T05:50:10Z

> What you want to display is grapheme clusters. Use one of the multitude of gems that implement UCD lookups to calculate the number of grapheme clusters. This is not the job of Parser.

Very well.

alexdowad · 2016-01-29T05:51:31Z

@jonas054 Looks like you will have to special-case this in IndentationWidth.

jonas054 · 2016-01-29T19:30:41Z

@alexdowad Fair enough. @whitequark Thanks for the thorough explanation.

[Fix #2703] Calculate column on first line when BOM is present
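One way the special case discussed above might look on the RuboCop side is sketched below. The helper name column_ignoring_bom and its signature are hypothetical; this is not the code from the referenced commit, only an illustration of adjusting the column for ranges on the first line when the source starts with a BOM.

```ruby
BOM = "\uFEFF"

# Hypothetical helper: given a 1-based line number and a character-based
# column, subtract the BOM when the range sits on the first line.
def column_ignoring_bom(source, line, column)
  bom_precedes = line == 1 && column > 0 && source.start_with?(BOM)
  bom_precedes ? column - 1 : column
end

source = "#{BOM}class Test\n  def method\n  end\nend\n"
lines = source.lines

class_col = column_ignoring_bom(source, 1, lines[0].index('class'))
def_col   = column_ignoring_bom(source, 2, lines[1].index('def'))

puts "indentation relative to class: #{def_col - class_col}"
```

Only the first line is adjusted, so columns on every other line, and in files without a BOM, are returned unchanged.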

jonas054 self-assigned this Jan 28, 2016

jonas054 added the bug label Jan 28, 2016

jonas054 mentioned this issue Jan 29, 2016

Fix handling of fullwidth characters in AlignArray #2710

Merged

bbatsov closed this as completed in 2d12870 Jan 30, 2016

bbatsov added a commit that referenced this issue Jan 30, 2016

Merge pull request #2746 from jonas054/2703_fix_columns_with_bom

a4f8c3f

[Fix #2703] Calculate column on first line when BOM is present
