Whitespace + allman checks by wilzbach · Pull Request #348 · dlang-community/D-Scanner

wilzbach · 2016-05-15T01:23:52Z

Adds the ability to check for whitespace between operators (#318).

I realized it makes sense to bundle my changes in one PR, so this includes #344, #345, #346 and the whitespace checks.

Fixes:

#345 (Detect trailing whitespace) fixes #328 (add check for trailing whitespace)
#346 (Check for two or more consecutive empty lines) fixes #324 (check for more than two consecutive empty lines)
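
For illustration, the two line-based checks named above could look roughly like the following when run on the raw source text. This is a minimal sketch, not the code from this PR: the function names `trailingWhitespaceLines` and `consecutiveEmptyLines` are made up for the example, and it assumes that "trailing whitespace" means a line ending in a space or tab.

```d
import std.string : lineSplitter;

/// Hypothetical helper: 1-based numbers of lines that end in a space or tab.
size_t[] trailingWhitespaceLines(string source)
{
    size_t[] result;
    size_t lineNo = 0;
    foreach (line; source.lineSplitter)
    {
        ++lineNo;
        if (line.length > 0 && (line[$ - 1] == ' ' || line[$ - 1] == '\t'))
            result ~= lineNo;
    }
    return result;
}

/// Hypothetical helper: 1-based numbers of empty lines that directly
/// follow another empty line.
size_t[] consecutiveEmptyLines(string source)
{
    size_t[] result;
    size_t lineNo = 0;
    bool previousEmpty = false;
    foreach (line; source.lineSplitter)
    {
        ++lineNo;
        immutable isEmpty = line.length == 0;
        if (isEmpty && previousEmpty)
            result ~= lineNo;
        previousEmpty = isEmpty;
    }
    return result;
}

unittest
{
    assert(trailingWhitespaceLines("int a; \nint b;\n\tint c;\t\n") == [size_t(1), 3]);
    assert(consecutiveEmptyLines("a\n\n\nb\n\nc") == [size_t(3)]);
}
```

Both helpers only need the source text, which is why checks of this kind are easy to write against the raw code rather than the AST.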

Hackerpilot · 2016-05-26T22:00:18Z

src/analysis/allman.d

+    ubyte[] code;
+
+    ///
+    this(string fileName, ubyte[] code, bool skipTests = false)


What was the reason for taking the raw source code as a parameter here instead of the token array? The tokens are available in the analysis.run.analyze function, so they could be passed to this constructor.

wilzbach · 2016-05-26

I didn't know how to access the tokens for a given statement and look ahead or behind. I only found startLocation.

wilzbach · 2016-12-10T18:58:38Z

I managed to find some time to rebase this :)

> What was the reason for taking the raw source code as a parameter here instead of the token array? The tokens are available in the analysis.run.analyze function, so they could be passed to this constructor.

The lexer drops all unnecessary whitespace, and these checks explicitly depend on lexWhitespace.

(I excluded the more complex check for space between operators.)

Hackerpilot · 2016-12-12T23:06:28Z

I don't recall if I asked this before, but wouldn't it be easier to check the tokens array for line numbers and column numbers? The tokens do have that information.

wilzbach · 2016-12-15T14:42:24Z

src/analysis/allman.d

+	override void visit(const DoStatement st)
+	{
+		// the DoStatement only knows about the line and column of the expression
+		checkForBrace(st.statementNoCaseNoDefault, 0, 0);


For

do { } while(true)

the AST is:

<doStatement>
  <expression>
    <unaryExpression>
      <primaryExpression>
        <true/>
      </primaryExpression>
    </unaryExpression>
  </expression>
  <statementNoCaseNoDefault>
    <blockStatement>
      <declarationsAndStatements>
        <declarationOrStatement>
          <statement>
            <statementNoCaseNoDefault>
              <expressionStatement>
                <expression>
                  <unaryExpression>
                    <primaryExpression>
                      <identifierOrTemplateInstance>
                        <identifier>i</identifier>
                      </identifierOrTemplateInstance>
                    </primaryExpression>
                    <suffix>++</suffix>
                  </unaryExpression>
                </expression>
              </expressionStatement>
            </statementNoCaseNoDefault>
          </statement>
        </declarationOrStatement>
      </declarationsAndStatements>
    </blockStatement>
  </statementNoCaseNoDefault>
</doStatement>

and none of the elements contains information about the line and column:

DoStatement: https://github.com/Hackerpilot/libdparse/blob/master/src/dparse/ast.d#L1467
BlockStatement: https://github.com/Hackerpilot/libdparse/blob/master/src/dparse/ast.d#L959
StatementNoCaseNoDefault: https://github.com/Hackerpilot/libdparse/blob/master/src/dparse/ast.d#L2249

wilzbach · 2016-12-15T23:47:41Z

> I don't recall if I asked this before, but wouldn't it be easier to check the tokens array for line numbers and column numbers? The tokens do have that information.

Yes you did, but afaict the tokens array doesn't contain information about whitespace / new lines as they are already ignored during the lexing phase. However, I am happy if you prove me wrong ;-)

ghost · 2016-12-16T01:00:46Z

Try this. You must change the lexer config so that whitespace tokens are included, and also (edit) don't use getTokensForParser().

void main()
{
    import dparse.lexer;
    import std.stdio, std.file : read;

    ubyte[] src = cast(ubyte[]) read(__FILE_FULL_PATH__);
    StringCache cache = StringCache(StringCache.defaultBucketCount);
    LexerConfig config = LexerConfig("", StringBehavior.source, WhitespaceBehavior.include, CommentBehavior.intern);

    auto toks = DLexer(src, config, &cache);

    foreach(tk; toks)
    {
        if (tk.type == tok!"identifier")
            writeln("identifier: '", tk.text, "'");
        else if (tk.type == tok!"whitespace")
            writeln("whites for ", tk.text.length, " chars");
        else // many other cases...
            writeln(str(tk.type));
    }
}

But actually you only need the whitespace tokens to detect trailing spaces. Otherwise you can use the position and length of token N together with the position of token N+1 to detect missing whitespace after if, foreach and similar tokens.
Also take care of comments, since if/*bla bla*/(true) is perfectly valid.
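
The position arithmetic described here can be sketched as follows. This is only an illustration under stated assumptions: `Tok` is a hypothetical, simplified stand-in for a lexer token that carries its source text and byte offset (it is not libdparse's token type), and `missingSpaceAfterKeyword` is a made-up name.

```d
import std.algorithm : among;

/// Hypothetical, simplified stand-in for a lexer token (not libdparse's Token).
struct Tok
{
    string text;  /// source text of the token
    size_t index; /// byte offset of the token's first character
}

/// Returns the positions in `toks` of `if`, `foreach` and `while` tokens
/// whose successor starts directly at their end, i.e. with no gap at all.
size_t[] missingSpaceAfterKeyword(const Tok[] toks)
{
    size_t[] result;
    foreach (i; 1 .. toks.length)
    {
        const prev = toks[i - 1];
        immutable isKeyword = prev.text.among("if", "foreach", "while") != 0;
        // end of token N == start of token N+1 means nothing separates them
        if (isKeyword && prev.index + prev.text.length == toks[i].index)
            result ~= i - 1;
    }
    return result;
}

unittest
{
    // Keywords and their successors from `if(a) foreach (b; c) while(d) {}`.
    const toks = [Tok("if", 0), Tok("(", 2), Tok("foreach", 6), Tok("(", 14),
                  Tok("while", 21), Tok("(", 26)];
    assert(missingSpaceAfterKeyword(toks) == [size_t(0), 4]);
}
```

Note that this sketch treats any gap as acceptable, so when comments are not part of the token stream, if/*bla bla*/(true) is not reported even though the gap holds a comment rather than whitespace. Telling the two apart needs extra handling.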

wilzbach · 2017-06-07T21:33:57Z

> but actually you only need them to detect the trailing spaces. Otherwise you can use the tokenN position, the tokenN length and the tokenN+1 position to detect missing whites after if, foreach and such tokens.
> Also take care of the comments, since if/*bla bla*/(true) is perfectly valid.

Hmm, I am still not sure how I could do this without access to the raw code. Allman is probably an easy example to look at; in particular, this method is the core of the Allman check:

/**
Checks whether a brace or newline comes first
*/
void findBraceOrNewLine(size_t start, size_t end, size_t line, size_t column)
{
	import std.algorithm : canFind;
	import std.utf : byCodeUnit;

	auto codeRange = (cast(char[]) code[start..end]).byCodeUnit;

	// inline statements are allowed -> search for newline
	if (codeRange.canFind('\n'))
	{
		foreach (s; codeRange)
		{
			// first brace
			if (s == '{')
			{
				// DoStatement hasn't a proper line and column attached
				// -> calculate ourselves
				if (line == 0 && column == 0)
				{
					// find line & column of brace
					auto t = findLineAndColumnForPos(start);
					line = t.line + 1; // Dscanner starts lines at 1
					column = t.column;
				}
				addErrorMessage(line, column, KEY, MESSAGE);
				break;
			}
			// newline - test passed
			else if (s == '\n')
			{
				break;
			}
		}
	}
}

With the code array, I simply go back and forth because I know the {start,end}Location of a statement. If I only have the tokens array, how could I access the next token without iterating through all of them?
And imho this check is rather easy to do with access to the code array, and it's in memory anyway, so I don't really get why it's so bad to use it.
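
One possible answer to the lookup question, sketched under assumptions: if each token records its byte offset and the array is ordered by that offset, the token at or after a known location can be found with a binary search instead of a full scan. `Tok` and `firstTokenAtOrAfter` are hypothetical names for this example, not part of D-Scanner or libdparse.

```d
/// Hypothetical, simplified stand-in for a lexer token (not libdparse's Token).
struct Tok
{
    string text;  /// source text of the token
    size_t index; /// byte offset of the token's first character
}

/// Returns the position in `toks` of the first token whose byte offset is
/// >= `pos`, or `toks.length` if there is none. Assumes `toks` is sorted
/// by `index`, as a token stream in source order would be.
size_t firstTokenAtOrAfter(const Tok[] toks, size_t pos)
{
    size_t lo = 0;
    size_t hi = toks.length;
    while (lo < hi)
    {
        immutable mid = lo + (hi - lo) / 2;
        if (toks[mid].index < pos)
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo;
}

unittest
{
    const toks = [Tok("if", 0), Tok("(", 2), Tok("foreach", 6), Tok("(", 14)];
    assert(firstTokenAtOrAfter(toks, 3) == 2);  // next token after offset 3 is `foreach`
    assert(firstTokenAtOrAfter(toks, 15) == 4); // nothing at or after offset 15
}
```

From the returned position, the neighbouring tokens are plain array accesses, which gives the same look-ahead and look-behind as slicing the code array.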

ghost · 2017-06-07T21:58:41Z

Allman enforcer:

void main()
{
    import dparse.lexer;
    import std.stdio, std.file : read;

    ubyte[] src = cast(ubyte[]) read(__FILE_FULL_PATH__);
    StringCache cache = StringCache(StringCache.defaultBucketCount);
    LexerConfig config = LexerConfig("", StringBehavior.source);

    auto toks = getTokensForParser(src, config, &cache);

    foreach(const i; 1..toks.length) {
        if (toks[i].type == tok!"{" && toks[i-1].line == toks[i].line)
            writeln(__FILE_FULL_PATH__, "(", toks[i].line, "): Allman not respected !");
    }
}

You see, it's shorter with the tokens.

wilzbach · 2017-06-12T10:42:04Z

This is superseded by #446, #447 and #448.

This was referenced May 15, 2016

Check for two or more consecutive empty lines #346

Closed

Detect trailing whitespace #345

Closed

add allman style check #344

Closed

wilzbach force-pushed the whitespace_checks branch from 42fbd89 to 333b664 · May 15, 2016 01:26

Hackerpilot self-assigned this May 15, 2016

Hackerpilot added the enhancement label May 15, 2016

wilzbach mentioned this pull request May 26, 2016

Have travis enforce Allman brace style dlang/phobos#4368

Closed

Hackerpilot reviewed May 26, 2016

wilzbach force-pushed the whitespace_checks branch 6 times, most recently from 4fdac36 to 1998b39 · December 10, 2016 18:54

Introduce source code based checks

1cfe2f8

- add allman style check
- detect trailing whitespace
- check for two or more consecutive empty lines

wilzbach force-pushed the whitespace_checks branch from 1998b39 to 1cfe2f8 · December 15, 2016 14:28

wilzbach mentioned this pull request Dec 26, 2016

If constraints: check for same indent level #394

Closed

wilzbach mentioned this pull request Jan 4, 2017

std.traits: hasFunctionAttributes dlang/phobos#4287

Merged

wilzbach mentioned this pull request Feb 22, 2017

[Style] Enforce a .. b opSlice style dlang/phobos#5179

Merged

wilzbach mentioned this pull request Jun 12, 2017

Add allman check #446

Merged

wilzbach closed this Jun 12, 2017

wilzbach mentioned this pull request Feb 11, 2018

posix.mak: Enforce whitespace before opening version parenthesis dlang/phobos#6160

Closed

wilzbach wants to merge 1 commit into dlang-community:master from wilzbach:whitespace_checks