Skip to content

add errorsink.d, an abstract interface to error messages#14906

Merged
WalterBright merged 1 commit intodlang:masterfrom
WalterBright:errorsink
Feb 26, 2023
Merged

add errorsink.d, an abstract interface to error messages#14906
WalterBright merged 1 commit intodlang:masterfrom
WalterBright:errorsink

Conversation

@WalterBright
Copy link
Member

I've made considerable progress into making the lexer standalone, and the lexer/parser standalone. The problem is its reliance on globals.d and errors.d, which pull in a bunch of other stuff not relevant to lexing/parsing. The code in errors.d is of rather stupefying complexity, which is completely irrelevant to the lexer/parser. This complexity also makes it difficult to create a standalone lexer/parser.

Enter the abstract interface for error messages, ErrorSink, in errorsink.d. It's only got the functions in it that the lexer/parser actually need. The lexer/parser should not need to know anything about color formatting, highlighting, gagging, or where the messages are sent to.

The constructors for the lexer/parser are passed an instance of ErrorSink.

This isn't perfect, there's still an ugly wart in it that relies on getenv. I'll tackle that problem after this one is incorporated.

Sorry about the high line count, although nearly all of it is just changing the call to error(). The substantive changes are pretty small.

The more we can reduce all these "everybody imports everybody else" the more tractable the code base will be.

@WalterBright WalterBright added the Severity:Refactoring No semantic changes to code label Feb 22, 2023
@dlang-bot
Copy link
Contributor

Thanks for your pull request, @WalterBright!

Bugzilla references

Your PR doesn't reference any Bugzilla issue.

If your PR contains non-trivial changes, please reference a Bugzilla issue or create a manual changelog.

Testing this PR locally

If you don't have a local development environment setup, you can use Digger to test this PR:

dub run digger -- build "master + dmd#14906"

@WalterBright WalterBright force-pushed the errorsink branch 4 times, most recently from 4f04dca to 509ee75 Compare February 22, 2023 08:54
@ibuclaw
Copy link
Member

ibuclaw commented Feb 22, 2023

Errrrrrrrrrrrr @jacob-carlborg ?

@WalterBright WalterBright force-pushed the errorsink branch 3 times, most recently from a3d5a8c to c1d394b Compare February 22, 2023 09:05
@ibuclaw
Copy link
Member

ibuclaw commented Feb 22, 2023

#9350
#9484
#9480
#9507
#10072
#10711

@WalterBright WalterBright force-pushed the errorsink branch 2 times, most recently from f467d27 to 06a8c6a Compare February 22, 2023 09:22
@jacob-carlborg
Copy link
Contributor

🤦

@WalterBright WalterBright force-pushed the errorsink branch 3 times, most recently from c4bcb74 to d36d348 Compare February 22, 2023 10:06
@thewilsonator
Copy link
Contributor

Yay, lets add even more files before packaging dmd properly.

@ibuclaw
Copy link
Member

ibuclaw commented Feb 22, 2023

Yay, lets add even more files before packaging dmd properly.

Worse than that, this PR literally reinvents the wheel... why?

/***************************************
* Where error/warning/deprecation messages go.
*/
abstract class ErrorSink
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reuse existing wheels or explain why they are insufficient.

Such deficiencies might include dmd.frontend imports std.

Note also that dmd.frontend is a stable externally available API and so at worst this PR should move the definition from there to here, and have a publicly imported alias to the new place.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reuse existing wheels or explain why they are insufficient.

This line illustrates the insufficiency:

Returns: false if the message should also be printed to stderr, true otherwise

This is the wrong abstraction. The user of Lexer should be able to send the errors anywhere, baking into the API "stderr" or not is insufficent. ErrorSink puts no constraint on where the messages should go. They can even go nowhere, as ErrorSinkNull demonstrates.

It's other insufficiency is it is centered around a pointer to a function which overrides the default behavior. This is a primitive form of OOP that fell out of favor in the 1980s. The abstract class, for which the user adds derived classes with specific behaviors, is the more modern approach. The abstract class designer does not have to anticipate what the implementation classes will be doing.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't want to touch dmd.frontend in this PR. As I mentioned to @ibuclaw it needs a significant overhaul.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then I consider this PR to require approval from @jacob-carlborg before merging.

don't want to touch dmd.frontend in this PR

I do not trust you will ever get around to this. And this adds yet more files without packaging things properly and I trust even less that you will ever do that.

They can even go nowhere, as ErrorSinkNull demonstrates.

So can DiagnositReporter, return true;.

As I mentioned to @ibuclaw it needs a significant overhaul.

Where? why? (Hint, no it doesn't).

ljmf00
ljmf00 previously requested changes Feb 22, 2023
Copy link
Member

@ljmf00 ljmf00 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please explain why we defer the creation of a proper error handling mechanism on DMD and also why not use the available interfaces exposed to be used on diagnostics?

@WalterBright WalterBright force-pushed the errorsink branch 4 times, most recently from 8eb54d1 to 030f344 Compare February 22, 2023 20:00
@WalterBright
Copy link
Member Author

I'm mystified by the unittest runner:

./build unittest
(DC) LEXER-unittest
(DC) DMD-unittest
(RUN) DMD-UNITTEST
Success

However, the unittests were not run, as I put a printf in them. So looking at the -v output, I see it is linking in:

../generated/linux/release/64/lexer-unittest.o ../generated/linux/release/64/backend-unittest.o ../generated/linux/release/64/common-unittest.o

What? The unittest feature in D is designed so you compile with -unittest, run the program, and the unittests are run. Where are all these extra .o files for? And why won't the unittests run when I run the generated dmd? What happened to our simple way of doing unittests?

@WalterBright
Copy link
Member Author

Why?

The best way to make understandable code is to have all the inputs and outputs at the entry point to the function. John Carmack explains this far better than I in http://www.sevangelatos.com/john-carmack-on/

What I've been doing over several previous PRs is just that - giving Lexer everything it needs via the call to its constructor, rather than "side loading" things through global variables, like global. This PR addresses a remaining problem - where do error messages go? The Lexer should not care what happens to the error messages, it just needs a place to send them. This is abstracted into ErrorSink, which is just what it says, it's a sink for error messages. Taking a look at the abstract class, it's utterly trivial and can be understood completely in 30 seconds. A couple handy implementations of it are there as well - ErrorSinkNull which just does nothing, and ErrorSinkStderr which just writes the messages to stderr. Both are trivial.

Contrast this with dmd.errors. It's a bewildering panoply of functions. There's a hook in it, diagnosticHandler, which snatches away control. But the user still has to understand all that stuff in errors.d, and it still all gets pulled into the executable, along with all its dependencies. It's full of global state, exactly what is undesirable in encapsulating functionality.

This PR does not break diagnosticHandler, or change the existing code that interfaces with it, other than adding a parameter to the constructor of Lexer.

Everything works but the unittests. So far, I have not been able to debug that because I have not managed to make heads or tails of how the unittests are run. Something about unittests in dmd have gone completely off the rails into a swamp.

/***************************************
* Where error/warning/deprecation messages go.
*/
abstract class ErrorSink
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then I consider this PR to require approval from @jacob-carlborg before merging.

don't want to touch dmd.frontend in this PR

I do not trust you will ever get around to this. And this adds yet more files without packaging things properly and I trust even less that you will ever do that.

They can even go nowhere, as ErrorSinkNull demonstrates.

So can DiagnositReporter, return true;.

As I mentioned to @ibuclaw it needs a significant overhaul.

Where? why? (Hint, no it doesn't).

@jacob-carlborg
Copy link
Contributor

Then I consider this PR to require approval from @jacob-carlborg before merging.

I'm not doing any code reviews. This PR is the perfect example of why I'm not contributing anymore.

Geod24
Geod24 previously requested changes Feb 23, 2023
Copy link
Member

@Geod24 Geod24 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Lots of useless imports;
  • The lazy instantiation of globals in the middle of semantic is a big no;
  • The two classes you provide, ErrorSinkNull and ErrorSinkStderr, are completely unused, something you have repeatedly complained about;

Instead of re-doing what other people have been wanting for years, perhaps we should just look into how/what @jacob-carlborg did, with DiagnosticReporter, which was also introduced gradually (starting with the lexer).

* Params:
* mod = the Module
* aclasses = array to fill in
* aclasses = array to fill in
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unrelated change, should be its own commit

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another case of tabs being converted to spaces by my checkin procedure. Not worth another PR.

if (!s.isStaticIfDeclaration())
{
//s.dsymbolSemantic(sc2);
//s.dsymbolSemantic(sc2);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unrelated, own commit

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That came about because of my checkin process converts all tabs to spaces, and apparently there were tabs on that line. Being a bit of trivia, there's no issue here.


extern (C++) void _init()
{
global.errorSink = new ErrorSinkCompiler;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be moved out of globals and into main to reduce circular dependencies

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It used to be in main, as you commented on later. The trouble was the main-less unit tester in the test suite. There's a weird problem with the test suite "unit tester" as it makes no sense as a unit tester. This needs to be redone as just tests (with a main entry point), but redoing that is beyond the scope of this PR.

import dmd.mtype : TypeBasic;

if (!global.errorSink)
global.errorSink = new ErrorSinkCompiler;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Likewise, this pattern is not good

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, that's left over from a previous incarnation of this PR. I had a fair amount of trouble figuring out where the unittests were being initialized.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, this code doesn't call globals._init, and so global.errorSink is not initialized. So I have to initialize it here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having everything in global is not ideal either.

enum code = "token";

scope lexer = new Lexer("test.d", code.ptr, 0, code.length, 0, 0);
scope lexer = new Lexer("test.d", code.ptr, 0, code.length, 0, 0, new ErrorSinkStderr);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should just be a function at this point, instead of copy-pasted code

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a class because it consists of several functions, not one.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant that at some point, we should realize that:

scope lexer = new TestLexer(code);
// Or:
scope lexer = makeTestLexer(code);

is the way to go.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's worth considering for a future PR. We already use both techniques in dmd.

@WalterBright
Copy link
Member Author

Lots of useless imports;

Removed. Explained in the comments.

The lazy instantiation of globals in the middle of semantic is a big no;

Removed. Explained in the comments.

The two classes you provide, ErrorSinkNull and ErrorSinkStderr, are completely unused, something you have repeatedly complained about;

ErrorSinkNull I plan to use in errors.d to replace the use of:

auto gaggedErrorsSave = global.startGagging();

in errors.d, which is the perfect use case for it.

I plan to use ErrorSinkError in the unit tests for Lexer.

Both are also an illustration for anyone needing their own custom error message sink.

Instead of re-doing what other people have been wanting for years, perhaps we should just look into how/what @jacob-carlborg did, with DiagnosticReporter, which was also introduced gradually (starting with the lexer).

ErrorSink is a better design. It's hard to think of anything simpler - if you can, I want to hear about it. Note also that the only import ErrorSink needs is dmd.location. I'd actually like to get it to zero imports. In general, using sinks is an excellent approach to eliminating arbitrary dependencies.

@WalterBright
Copy link
Member Author

This serves our goals of reducing technical debt and making the lexer/parser more amenable to being standalone by reducing its dependencies on other modules. It's ready to merge. @RazvanN7 @dkorpel

@WalterBright WalterBright added the Review:Ready To Merge Let's get this merged label Feb 24, 2023
Copy link
Contributor

@dkorpel dkorpel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Creating a copy of what already exists does not reduce technical debt. If there's deficiencies in DiagnosticReporter, why not fix that instead?

@WalterBright
Copy link
Member Author

Creating a copy of what already exists does not reduce technical debt.

To reiterate:

  1. it results in lexer.d having a dependency on dmd.console.
  2. if a diagnosticreporter is used, the code for the non-use case remains in the program, unused
  3. I quote:
 * The diagnostic handler.
 * If non-null it will be called for every diagnostic message issued by the compiler.
 * If it returns false, the message will be printed to stderr as usual.

it should not leave a builtin dependency on stderr. It should not have default behavior at all.
4. It uses an old-fashioned pointer-to-handler OOP design that was supplanted in 1990 (aka a "hook" interface)
5. abstract interfaces are a better design
6. It still leaves a dependency on the gagging global which is an execrable design practice
7. It is much more complicated than necessary
8. anyone can understand ErrorSink in 30 seconds
9. GUI programs don't have a console or stderr
10. it enables easy mock sinks, which are critical for unittesting
11. it "side loads" into Lexer, rather than going through the function parameter list

This PR does not delete diagnostichandler because that would break existing code. I wished to keep this PR as a refactoring. Migrating from diagnostichandler to the abstract interface can be done later.

@WalterBright WalterBright force-pushed the errorsink branch 3 times, most recently from ba5f8f1 to 245cc79 Compare February 26, 2023 01:44

void errorSupplemental(const ref Loc loc, const(char)* format, ...);

void warning(const ref Loc loc, const(char)* format, ...);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warningSupplemental?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I only added the ones used.

void deprecation(const ref Loc loc, const(char)* format, ...);

void deprecationSupplemental(const ref Loc loc, const(char)* format, ...);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

message should also be included.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I only added the ones used. Others can be added as necessary.

@maxhaton
Copy link
Member

Is this any less dependant on gagging?

@WalterBright WalterBright merged commit 49dec45 into dlang:master Feb 26, 2023
@WalterBright WalterBright deleted the errorsink branch February 26, 2023 23:10
@WalterBright
Copy link
Member Author

Is this any less dependant on gagging?

Currently, no. But as I remarked upthread, there's an obvious case in colorHightlightCode() for using ErrorSinkNull instead of gagging. Eventually, I'd like to see all use of gagging replaced by ErrorSinkNull.

else
{
p.error(s.loc, "expected identifier after `[`");
p.eSink.error(s.loc, "expected identifier after `[`");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This just makes iasmgcc ugly to look at for no obvious benefit.

const errors = global.errors;
scope gas = new GccAsmStatement(Loc.initial, tokens);
scope p = new Parser!ASTCodegen(null, ";", false);
scope p = new Parser!ASTCodegen(null, ";", false, global.errorSink);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At this point, might as well make it a default param instead, there are dozens of repetitions of this pattern of change.

Comment on lines +37 to +43
void error(const ref Loc loc, const(char)* format, ...)
{
va_list ap;
va_start(ap, format);
verror(loc, format, ap);
va_end(ap);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dangerously close to hijacking gcc error diagnostics here. Or am I expected to just fork out the implementation to the C++ side of the compiler?


void error(const ref Loc loc, const(char)* format, ...)
{
fputs("Error: ", stderr);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't loc come before "Error:" etc?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Review:Ready To Merge Let's get this merged Severity:Refactoring No semantic changes to code

Projects

None yet

Development

Successfully merging this pull request may close these issues.