Add customizable diagnostic reporter for the lexer and parser #9350
dlang-bot merged 1 commit into dlang:master
|
Thanks for your pull request and interest in making D better, @jacob-carlborg! We are looking forward to reviewing it, and you should be hearing from a maintainer soon.
Please see CONTRIBUTING.md for more information. If you have addressed all reviews or aren't sure how to proceed, don't hesitate to ping us with a simple comment.
Bugzilla references
Your PR doesn't reference any Bugzilla issue. If your PR contains non-trivial changes, please reference a Bugzilla issue or create a manual changelog.
Testing this PR locally
If you don't have a local development environment set up, you can use Digger to test this PR:
dub fetch digger
dub run digger -- build "master + dmd#9350" |
thewilsonator left a comment
Otherwise LGTM. There's something wrong with the docs somewhere, though.
|
Blocks #9359. |
This replaces the existing `ErrorHandler` with the new `DiagnosticReporter` for the lexer and parser. The diagnostic reporter is used not just for reporting errors but for all types of diagnostic reporting. Since the diagnostic reporter is passed to the lexer and parser, it is possible to customize what happens when a diagnostic is reported. This is useful for testing and when using the compiler as a library.
It allows configuring where the output goes, how it's formatted, and even whether an error/warning/deprecation should be reported at all. |
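To make the three knobs above concrete (destination, formatting, filtering), here is a minimal sketch. It is not the `DiagnosticReporter` from this PR: `Severity`, `Reporter`, `StderrReporter`, `BufferReporter`, `NoDeprecationsReporter` and `lexSomething` are all hypothetical names chosen for illustration, and the real lexer and parser use printf-style arguments and source locations rather than plain strings.

```d
import std.format : format;
import std.stdio : stderr;

/// Hypothetical severity levels; not DMD's actual types.
enum Severity { error, warning, deprecation }

/// Hypothetical reporter interface handed to whatever produces diagnostics.
interface Reporter
{
    void report(Severity severity, string message);
}

/// Destination 1: print everything to stderr, like a command-line driver.
final class StderrReporter : Reporter
{
    void report(Severity severity, string message)
    {
        stderr.writefln("%s: %s", severity, message);
    }
}

/// Destination 2: keep diagnostics in memory, e.g. for a text area or a test.
final class BufferReporter : Reporter
{
    string[] messages;

    void report(Severity severity, string message)
    {
        messages ~= format("%s: %s", severity, message);
    }
}

/// Filtering: drop deprecations and forward everything else.
final class NoDeprecationsReporter : Reporter
{
    private Reporter next;

    this(Reporter next) { this.next = next; }

    void report(Severity severity, string message)
    {
        if (severity != Severity.deprecation)
            next.report(severity, message);
    }
}

/// Stand-in for a lexer: it only knows the interface, not the destination.
void lexSomething(Reporter reporter)
{
    reporter.report(Severity.deprecation, "old syntax");
    reporter.report(Severity.error, "unterminated string");
}

void main()
{
    auto buffer = new BufferReporter;
    lexSomething(new NoDeprecationsReporter(buffer));
    // Only the error reaches the buffer; nothing is printed.
    assert(buffer.messages == ["error: unterminated string"]);
}
```

Because the producer receives the reporter as an argument, swapping `BufferReporter` for `StderrReporter` changes the destination without touching `lexSomething`.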
|
Why does it need to be configured? |
|
To be able to write the diagnostic messages somewhere other than stderr, such as a text area or a variable, or to not print them at all. It's used here: #9359. A passing test shouldn't have any output. |
|
So the internals of the compiler can be unit tested. For example, a unit test may parse a snippet of D code, assert that the resulting AST is the expected one, and also assert that the emitted diagnostic messages are as expected. You can't do this with the old end-to-end compiler testing. |
|
The old error reporting is suitable only for a very narrow set of use cases: those where the standard dmd compiler driver is used. When dmd is used as a library, the old error reporting mechanism is completely unusable, as custom applications need to drive the dmd library in a custom manner. |
|
|
That's not suitable for customizing.
Can you please show an example of how to do this with the Dub package?
That would require changing the error handling in hundreds of places. You don't like big pull requests.
The |
True. Which is why I have been modifying them so the messages are different.
Sure it is. Let me explain how libraries work. The linker is presented with an undefined symbol. It first looks in the object files for a resolution. If not found, then it looks in the library for a resolution. If found, that and only that object file is pulled out of the library. Repeat for any new undefined symbols. Hence, to "customize" a function that's in a library, simply provide your own function in your object file, with the same signature.
No it doesn't, because the default sink would be what it does now. Setting a sink should be only one function call upon startup, not on every call.
Adding "unit test infrastructure" is a failure right there. D has unit tests, just add them. Note that Phobos has a massive amount of unit tests, and no unit test infrastructure. |
I don't know where to start. Yes it works, but you can't be serious suggesting it as a long term solution.
Using the linker to override the symbol is neat when you have to get something to work quickly. It should be regarded as a hack, an exception, not something that a library requires of its users. Now, when it comes to the unittests, this is what @jacob-carlborg was talking about, I believe:
unittest
{
    // The code we wish to analyze
    immutable moduleString = q{
        pragma(msg, int, double, real);
        pragma(msg, int.stringof, ':', " ", ulong.max);
        pragma(msg, "Integer expr: ", 4242_424242);
        pragma(msg, "With hex: ", 0x2_A);
    };
    // DMD is an object that represents the compiler
    // `BufferDiagnosticReporter` is a diagnostic reporter that just buffers everything
    // in a "string[] messages" variable
    scope dmd = new DMD(new BufferDiagnosticReporter);
    dmd.initialize();
    dmd.runSemanticOnModule(moduleString);
    auto diag = dmd.getDiagnosticReporter();
    assert(diag.messages.length == 4);
    assert(diag.messages[0].text == "intdoublereal", diag.messages[0].text);
    assert(diag.messages[1].text == "int: 18446744073709551615", diag.messages[1].text);
    assert(diag.messages[2].text == "Integer expr: 4242_424242", diag.messages[2].text);
    assert(diag.messages[3].text == "With hex: 0x2_A", diag.messages[3].text);
}
If we want the compiler as a library to be more than a student project, and actually something people use to build tools for dlang, which we so desperately need, this should be trivial to do. |
That won't work for unit tests where usually all the files are compiled into a single binary without any library involved.
Yes, I'm already using the Line 18 in 7894b2d But without any further customizations, that requires you to run all of the unit tests even if you're interested in running only a single |
I do not see how
It's literally ONE function we're talking about -
You can override it with a function that has multiple dispatch.
There's nothing wrong with it, it's another tool in the toolbox. The only thing about it is it appears to be a forgotten technique. It has an advantage that not only is the call direct rather than indirect, it can be inlined.
As shown, it offers nothing over what Seb and others have already well developed in the autotester. Even so, encouraging people to use error message texts in their unit tests will prove disastrous. It will introduce a whole new category of things we cannot improve due to backwards compatibility problems. This goes back to what I wrote about the dependency of dmd on |
|
What all these changes do is address overriding one function - |
|
Suggestion:
Then, user does this: It's not pure, but it gets the job done without upending the code base. |
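A minimal sketch of what a settable sink along these lines could look like. This is an illustration under assumptions, not the code from the suggestion: `ErrorSink`, `defaultErrorSink`, `errorSink`, `reportError`, `captured` and `capturingSink` are hypothetical names, and dmd's real reporting functions take printf-style arguments rather than a `string`.

```d
import std.stdio : stderr;

/// Hypothetical sink type: receives one already formatted message.
alias ErrorSink = void function(string message);

/// Default sink: keeps the current behaviour and writes to stderr.
void defaultErrorSink(string message)
{
    stderr.writeln(message);
}

/// The single global hook. Module-level variables are thread-local in D
/// by default, so __gshared is used to get one process-wide instance.
__gshared ErrorSink errorSink = &defaultErrorSink;

/// Stand-in for the compiler's central reporting function.
void reportError(string message)
{
    errorSink(message);
}

// ---- user side ----

/// Messages collected by the user's replacement sink.
__gshared string[] captured;

void capturingSink(string message)
{
    captured ~= message;
}

void main()
{
    // One assignment at startup; call sites inside the compiler stay unchanged.
    errorSink = &capturingSink;

    reportError("undefined identifier x");
    assert(captured == ["undefined identifier x"]);
}
```

The trade-off is the one named above: the hook is mutable global state, so it is not pure, but existing callers of the reporting function need no changes.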
Indeed, I thought the symbol was
pragma(mangle, "_D3dmd6errors11verrorPrintFKxSQBc7globals3LocEQBs7console5ColorPxaQdPS4core4stdc6stdarg13__va_list_tagQBnQBqZv")
private void verrorPrint(const ref Loc loc, Color headerColor, const(char)* header,
    const(char)* format, va_list ap, const(char)* p1 = null, const(char)* p2 = null)
{
    // Some code
}
In order to override the default behavior?
It doesn't make the solution more appealing. It might be one function, but being able to capture messages is a very basic need. Is there a library out there that uses this approach?
Could you provide a code example? I have one in mind but don't want to assume too much.
There's nothing wrong with having many tools at hand, but most tools are simply better at achieving what we want to achieve, with fewer downsides. And the call is not direct if you use a function that does multiple dispatch. Also, to have inlining, since it's done at the linker level, you need LTO.
That was an example of a unittest you could put in DMD's test suite. An example of an actual application using DMD as a library and capturing messages would be a web server like
It works, and is already much, much better than the linker hack you were previously suggesting. It is not |
There's no reason to use
I don't understand the question?
If the overriding function is known to the caller, which it would be if it was included in the caller's source code, it can be inlined by the compiler. (I've written librarians, linkers, compilers, and inliners. I should know how this works.)
A thousand lines of code to intercept one function call is not desirable. Multiply that by the number of globals dmd has, and we've burned down the house to kill a few termites. Getting rid of globals is a goal, not an imperative overriding all other considerations. Replacing 5 lines of code with a thousand, plus a thousand more lines of disruption, requires enormous justification, and I'm not seeing that.
Please don't use unittesting error messages as a justification for this. It just legitimizes a place we don't want to go. Use justifications that are worthwhile. I don't want to see any unittesting of error messages, because that will prevent us from improving the error messages. The messages must be considered to be ephemeral. The only place they should be tested is in |
Which requires you to modify DMD's source code in order to use it as a library. Absolutely unacceptable.
void myErrorMsgSink(string s) {
    if (gohere) myErrorMsgSinkHere(s);
    else myErrorMsgSinkThere(s);
}
Where is "gohere" defined? What does it correspond to?
It is not included in the caller's source code.
Please read my message. The example was to show the expected API. No one should test the error messages but the compiler. There is no use case I can think of, and I am not trying to encourage it. I provided you with other use cases for using the messages; why did you choose to ignore them? |
|
In theory I'm fine with doing something like
The problem with |
|
@WalterBright do you prefer |
I think there's some terrible misunderstanding here. If I want to override strlen() in the C library and use my own, I just write my own strlen() in my code and the one in the C library will never be linked in. I don't have to modify the C library source code at all. There's nothing clever or hackish about it. Perhaps at DConf we can sit down and I can show you how libraries and linking work, as this back-and-forth is clearly not working. It should take about 5 minutes. I invite anyone else to join us. |
|
It belongs as part of |
I very well understand, thanks. The point was that |
That's because there are 1000 files. Those files can be concatenated into fewer files. |
You can:
|
|
For 1 & 2, I already mentioned the cons here, to which you replied you "do not see how
Now we're just going in circles. |
|
A point on the module thing: the compiler won't necessarily error out. If one module is passed on the command line and the other is found in the -I path, the compiler will prefer the command-line one and not even look at the import path, thereby bypassing the error condition and letting the linker do its thing. |
|
@adamdruppe that's right. |