Skip to content

Conversation

@wilzbach
Copy link
Contributor

One really nice thing about the default Checked/checked was that it was issuing an explanatory message before terminating the program. That was a massive improvement over "Bus error", "Division by zero" and similar messages with little additional detail provided by lower-level facilities.

In release mode, assert(0, msg) does not print msg and provides no explanation. We could argue that this behavior should be changed (I think it should); we worry about things like the size of messages and code way too much for this era. Or we could look for a trusted means to write to stderr before aborting. Whatever we do, we should keep the nice behavior.

So this stops the world while printing the exception.
Imho lockingTextWriter shouldn't require such hacks :/

Alternative to #5928

@dlang-bot
Copy link
Contributor

Thanks for your pull request, @wilzbach!

Bugzilla references

Auto-close Bugzilla Severity Description
18024 enhancement checkedint.Abort should be @safe

Testing this PR locally

If you don't have a local development environment setup, you can use Digger to test this PR:

dub fetch digger
dub run digger -- build "master + phobos#6369"

@JackStouffer
Copy link
Contributor

Why not synchronized?

@wilzbach
Copy link
Contributor Author

wilzbach commented Mar 28, 2018

Why not synchronized?

The synchronized statement wraps a statement with a mutex to synchronize access among multiple threads which if I understood your point was that another thread thread writing from somewhere else to stderr is unsafe.

However, I just checked http://www.unix.org/whitepapers/reentrant.html
It says the following:

The POSIX.1 and C-language functions that operate on character streams (represented by pointers to objects of type FILE) are required by POSIX.1c to be implemented in such a way that reentrancy is achieved (see ISO/IEC 9945:1-1996, §8.2). This requirement has a drawback; it imposes substantial performance penalties because of the synchronization that must be built into the implementations of the functions for the sake of reentrancy.

Also LockingTextWriter locks its handle:

phobos/std/stdio.d

Lines 2844 to 2846 in 2d87546

FILE* fps = f._p.handle;
orientation_ = fwide(fps, 0);
FLOCK(fps);

but if you don't trust that one, even fprintf would do the job ...

@JackStouffer
Copy link
Contributor

JackStouffer commented Mar 28, 2018

I understood your point was that another thread thread writing from somewhere else to stderr is unsafe.

It's more subtle than that.

Having any reference at all to stderr in multiple threads is unsafe. stderr has been turned into a function which returns a __gshared singleton File; since it's a value type, the postblit is called when it's returned which increments the ref counter. The ref counter is also __gshared, and is modified with no use of atomic ops or locking. When the count gets to 0, the file is auto closed. Therefore ANY usage of stderr in multiple threads is a problem.

You already sidestep this issue by having the reference to stderr returned during the locked portion.

@wilzbach
Copy link
Contributor Author

wilzbach commented Apr 9, 2018

@JackStouffer as far as I understand we fixed the ref-counting of File in #6382 and now only an assignment of the global stderr variable is potentially dangerous, but as this PR goes I reason the following:

  1. the stderr message is followed immediately after by a segfault of the program (so we could literally do anything that's unsafe)
  2. _writefln can be at the most called once and thus it will only call stderr.lockingTextWriter.put once, which is now guaranteed to have a correct reference count. The only potential problem is that a user might have assigned stderr to something else, but just for dumping the error message, I don't think this case is a concern to us?

Imho we can even remove the thread_stopAll hack now...

@JackStouffer
Copy link
Contributor

the stderr message is followed immediately after by a segfault of the program (so we could literally do anything that's unsafe)

Is it? I don't see that happening anywhere but a scope(failure) call.

@wilzbach
Copy link
Contributor Author

wilzbach commented Apr 9, 2018

Is it? I don't see that happening anywhere but a scope(failure) call.

Yeah, it's how the Abort (default) hook works:

Warn.hookOpCmp(lhs, rhs);
assert(0);

(though I was confusing things and thought that this PR touches Abort directly whereas it only touched Warn).

Anyhow, printing to stderr should be @safe and AFAICT it's now @safe and if it still isn't we have a serious problem.

@JackStouffer
Copy link
Contributor

Well, your program is no longer going to segfault (I think) but there can still be bad data races. Since File takes up three registers assigning a different File instance to stdout will take multiple mov commands. What happens when you try to write to a file that's only been half copied?

This is true even if you were to assert(0); right after writing.

@wilzbach
Copy link
Contributor Author

wilzbach commented Jun 6, 2018

See also #6550 for yet another PR to make checkedint @safe

@wilzbach
Copy link
Contributor Author

wilzbach commented Jan 2, 2019

In the meantime the LockingWriter has been made thread-safe, but it's still not formally @safe.
However, I still think that this workaround is our best trick for now to finally make CheckedInt @safe.

@thewilsonator what do you think? Can we fix this long-outstanding issue with this trick?

thread_suspendAll;
scope(exit) thread_resumeAll;
scope(failure) assert(0, "Thread suspension while printing'" ~ msg ~ "' from CheckedInt failed.");
// this can be done @safe-ly, because it happens just before the program terminates
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Warn just logs right? Whats this about program termination?

Copy link
Contributor

@thewilsonator thewilsonator left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other than the slightly confused comment, LGTM.

@RazvanN7
Copy link
Collaborator

This has been fixed in: #7909

@RazvanN7 RazvanN7 closed this Jan 20, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants