Skip to content

Comments

build.d: Exit gracefully#10560

Merged
thewilsonator merged 1 commit intodlang:masterfrom
MoonlightSentinel:build-exit
Nov 13, 2019
Merged

build.d: Exit gracefully#10560
thewilsonator merged 1 commit intodlang:masterfrom
MoonlightSentinel:build-exit

Conversation

@MoonlightSentinel
Copy link
Contributor

This PR enables build.d to handle build failures instead of terminating immediatly (without caring about running processes, temporary files, ...).

This allows build.d e.g. to provide additional information about a failed target

@dlang-bot
Copy link
Contributor

Thanks for your pull request and interest in making D better, @MoonlightSentinel! We are looking forward to reviewing it, and you should be hearing from a maintainer soon.
Please verify that your PR follows this checklist:

  • My PR is fully covered with tests (you can see the coverage diff by visiting the details link of the codecov check)
  • My PR is as minimal as possible (smaller, focused PRs are easier to review than big ones)
  • I have provided a detailed rationale explaining my changes
  • New or modified functions have Ddoc comments (with Params: and Returns:)

Please see CONTRIBUTING.md for more information.


If you have addressed all reviews or aren't sure how to proceed, don't hesitate to ping us with a simple comment.

Bugzilla references

Your PR doesn't reference any Bugzilla issue.

If your PR contains non-trivial changes, please reference a Bugzilla issue or create a manual changelog.

Testing this PR locally

If you don't have a local development environment setup, you can use Digger to test this PR:

dub fetch digger
dub run digger -- build "master + dmd#10560"

src/build.d Outdated
/// Aborts the current build
void abortBuild()
{
throw new Exception("Build failed!");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You may want to throw a specific Exception type for each build failure cause and optionally include additional information.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added the possibility to specify a custom message but custom exceptions for every possible build failure seem overkill IMHO. The only intent of this change is to enable destructors and scope(...) statements which will be skipped by exit.

@marler8997
Copy link
Contributor

So originally we were throwing exceptions in build.d, but we changed it because we don't want to print a stacktrace to build.d when this occurs. Maybe you could catch the exception before it gets caught by the default exception handler and prints the stacktrace?

@MoonlightSentinel
Copy link
Contributor Author

So originally we were throwing exceptions in build.d, but we changed it because we don't want to print a stacktrace to build.d when this occurs. Maybe you could catch the exception before it gets caught by the default exception handler and prints the stacktrace?

@marler8997 Done

src/build.d Outdated
*/
BuildException abortBuild(string msg = "Build failed!")
{
return new BuildException(msg);
Copy link
Contributor

@marler8997 marler8997 Nov 12, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could also throw here. it looks weird to throw both at the callsite and inside the function, but doing both has benefits even if the caller's throw never actually gets executed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, double throw would be really weird. How would that be beneficial?

Copy link
Contributor

@marler8997 marler8997 Nov 12, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

someone might accidently call abortBuild without throw abortBuild, thinking that it would exit the program (since it's called abortBuild). Since error conditions like this are rarely tested, it's good to make these code paths as "full proof" as you can.

If you throw inside the function but still return an Exception type (that will never actually return), then caller can choose to use the explicit throw or not, depending on if they need that control flow. Take this example

int returnInt()
{
    if (foo)
    {
        return 0;
    }
    abortBuild();
}

In this case you would get a compile error because the end of the function doesn't return anything. However, if abortBuild returns an Exception type even though it never returns, then you can just do throw abortBuild() and the compiler will be happy.

Note that there's been some discussion about adding a trait or a special return type for functions that means "noreturn", but D doesn't have that yet. This is sort of a workaround for not having that feature.

Copy link
Contributor

@marler8997 marler8997 Nov 12, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be an easy mistake to make as well:

void foo()
{
    if (somethingWentWrong)
        abortBuild("oh no");
    okWereGood();
}

If you throw inside the function, then this works as one would expect from just looking at the code without going into the function definitions.

If you don't want to throw inside abortBuild, then it's better to call it something like makeBuildError, as it's not really aborting the build, just returning an error that you need to throw yourself. I know you named it abortBuild as you were initially throwing so the name is only confusing because of my suggestion :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clever. I'll revert to throwing inside of abortBuild but keep the fake return value to allow throw abortBuildat the callsite

@MoonlightSentinel
Copy link
Contributor Author

Changed cxx-unittest to use abortBuild instead of enforce

Copy link
Contributor

@wilzbach wilzbach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@thewilsonator
Copy link
Contributor

Auto-merge toggled on

@marler8997
Copy link
Contributor

marler8997 commented Nov 12, 2019

I've noticed that the new build.d script will sometimes deadlock if a failure occurs, hopefully this fixes it. It might not have been cleaning up something properly between threads or something.

@thewilsonator thewilsonator merged commit ad5f2c7 into dlang:master Nov 13, 2019
@MoonlightSentinel
Copy link
Contributor Author

I've noticed that the new build.d script will sometimes deadlock if a failure occurs, hopefully this fixes it. It might not have been cleaning up something properly between threads or something.

Do you have some clues under which consitions the deadlock appeared? (System, Target, ...)

@marler8997
Copy link
Contributor

marler8997 commented Nov 13, 2019

I tried reproducing the hang but couldn't. I can reproduce a similar issue today by injecting an error into the backend. Sometimes it will fail with an error, and sometimes we'll get an uncaught exception like this:

(DC) D_BACK_OBJS backend.d, bcomplex.d, evalu8.d, divcoeff.d, dvec.d, go.d, gsroa.d, glocal.d, gdag.d, gother.d, gflow.d, out.d, gloop.d, compress.d, cgelem.d, cgcs.d, ee.d, cod4.d, cod5.d, nteh.d, blockopt.d, mem.d, cg.d, cgreg.d, dtype.d, debugprint.d, fp.d, symbol.d, elem.d, dcode.d, cgsched.d, cg87.d, cgxmm.d, cgcod.d, cod1.d, cod2.d, cod3.d, cv8.d, dcgcv.d, pdata.d, util2.d, var.d, md5.d, backconfig.d, ph2.d, drtlsym.d, dwarfeh.d, ptrntab.d, dvarstats.d, dwarfdbginf.d, cgen.d, os.d, goh.d, barray.d, cgcse.d, elpicpie.d, machobj.d, elfobj.d, aarray.d
/home/marler8997/ddev1/dmd/src/dmd/backend/cg87.d(1): Error: declaration expected, not `!`

uncaught exception
core.thread.ThreadException@src/core/thread.d(766): Unable to join thread
----------------
??:? object.Throwable core.thread.Thread.join(bool) [0x5b13fa]
??:? int std.parallelism._sharedStaticDtor12().__foreachbody1(ref core.thread.Thread) [0x599554]
??:? int core.thread.Thread.opApply(scope int delegate(ref core.thread.Thread)) [0x5b1990]
??:? void std.parallelism._sharedStaticDtor12() [0x5994f4]
??:? void std.parallelism.__modshareddtor() [0x599564]
??:? void rt.minfo.runModuleFuncsRev!(rt.minfo.ModuleGroup.runDtors().__lambda1).runModuleFuncsRev(const(immutable(object.ModuleInfo)*)[]) [0x5c19fc]
??:? void rt.minfo.ModuleGroup.runDtors() [0x5c15d4]Illegal instruction (core dumped)

In any case, with this change I no longer get this uncaught exception, so I think the hang is probably fixed as I think it's closely related to this uncaught exception.

@rainers
Copy link
Member

rainers commented Nov 13, 2019

Is build.d using some kind of "fork", maybe also indirectly when spawning other processes? Then there might be a couple of issues that appeared with the parallel GC (some not directly related). These are fixed in 2.089 (dlang/druntime#2816, dlang/druntime#2817, dlang/druntime#2813), but it seems 2.088 is used by the builds here.

@marler8997
Copy link
Contributor

@rainers that seems likely as I was no longer getting the hang when trying to reproduce today. Now I only get the uncaught exception on join. I was pretty sure it was a druntime issue, I tried to debug it once but wasn't able to find the root cause.

@wilzbach
Copy link
Contributor

So maybe we should bump the version that AUTO_BOOTSTRAP uses to 2.089 then to avoid issues on the Buildkite CI?

@marler8997
Copy link
Contributor

It was mostly an annoyance during development. Most CI's should just timeout if a failure occurs and it hangs so not a huge deal for them.

@rainers
Copy link
Member

rainers commented Nov 14, 2019

So maybe we should bump the version that AUTO_BOOTSTRAP uses to 2.089 then to avoid issues on the Buildkite CI?

Might be worth a try. I've also seen a similar error in the DAutoTest logs recently.

It was mostly an annoyance during development. Most CI's should just timeout if a failure occurs and it hangs so not a huge deal for them.

It can block other PRs if it waits for the timeout instead of failing fast. I think it also obscures the actual error.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants