Add pureFree to core.memory by nordlow · Pull Request #1836 · dlang/druntime

nordlow · 2017-06-11T11:03:51Z

To be used in contexts such as RefCounted-destructor instead of having to define local overloads such as here https://github.com/dlang/phobos/pull/4832/files#diff-4e008aedb3026d4a84f58323e53bf017R5087.

AFAICT, free doesn't set errno.

nordlow · 2017-06-11T14:17:31Z

What's wrong with the continuous-integration/jenkins/pr-merge?

wilzbach · 2017-06-11T15:50:32Z

What's wrong with the continuous-integration/jenkins/pr-merge?

It's unrelated - it's a regression introduced in dlang/phobos#5427

MoritzMaxeiner

pureFree needs to save&restore errno
modified unittests should be marked @system

MoritzMaxeiner · 2017-07-05T13:20:47Z

src/core/memory.d


 ///
-nothrow @nogc unittest
+pure nothrow @nogc unittest


If you modify this unittest, shouldn't it also be marked as @system ?

Why should it be marked as @system?

MoritzMaxeiner · 2017-07-05T13:21:44Z

src/core/memory.d

 }

-pure @nogc nothrow unittest
+pure nothrow @nogc unittest


Same here w.r.t adding @system

GH won't let me comment on unchanged lines, so I'll put add it here at the last changed line:
If you expose fakePureFree via pureFree, the needed by unittests comment at the fakePureFree declaration should be removed.

Removed the comment at fakePureFree.

MoritzMaxeiner · 2017-07-05T13:30:30Z

src/core/memory.d

+ * Pure variants of C's memory allocation functions `malloc`, `calloc`, and
+ * `realloc` and deallocation function `free`.
+ *
+ * Purity for allocation is achieved via resetting the `errno` to it's value


it's -> its

Since deallocation must also save&restore errno (see comment at pureFree):
Purity for allocation -> Purity for allocation and deallocation

s/Purity for allocation is achieved via resetting the errno to it's value/Purity for allocation is achieved by saving and restoring the value of errno, thus behaving as if it were never changed/

Changed to its.

MoritzMaxeiner · 2017-07-05T14:17:16Z

src/core/memory.d

    return ret;
 }
+/// ditto
+void pureFree(void* ptr) @system pure @nogc nothrow


free is not guaranteed not to set errno by current standards; libc implementations may choose to set errno for both valid and invalid pointers.
POSIX.1 Issue 8 (still WIP), is going to forbid free to set errno for valid pointers, but explicitly allows setting it for an invalid pointers.
AFAIK there's no way for us to know whether the used libc will be linked to statically or dynamically and in the latter case there's no way AFAIK to know if the libc does or does not set errno.
It follows that pureFree must save&restore the errno value the same way that the allocation functions do:

const errno = fakePureGetErrno(); fakePureFree(ptr); if (errno != 0) fakePureSetErrno(errno);

Yah, let's set and restore it for now regardless. We may change that later as an optimization. Thanks!

JackStouffer · 2017-07-05T14:34:06Z

I thought we all agreed that free can't be pure?

MoritzMaxeiner · 2017-07-05T14:36:01Z

AFAICT, free doesn't set errno

This may be true for your libc, but it's not guaranteed by ISO C or POSIX. The WIP POSIX.1 Issue 8 even explicitly mentions that implementations are free to set errno if they receive an invalid pointer as input.

wilzbach · 2017-07-05T14:39:39Z

AFAICT, free doesn't set errno.
free is not guaranteed not to set errno by current standards;

FWIW it doesn't matter for the pureness considerations. For example many functions in std.math can set errno as well, but they are considered as pure.

wilzbach · 2017-07-05T14:40:42Z

AFAICT, free doesn't set errno.

Linking to the previous threaD: #1746

JackStouffer · 2017-07-05T14:41:48Z

FWIW it doesn't matter for the pureness considerations. For example many functions in std.math can set errno as well, but they are considered as pure.

That's a bug then. Can you please make a bugzilla issue?

MoritzMaxeiner · 2017-07-05T14:43:11Z

FWIW it doesn't matter for the pureness considerations. For example many functions in std.math can set errno as well, but they are considered as pure.

errno is global state. Changing it is impure, so those are bugs.

MoritzMaxeiner · 2017-07-05T14:47:32Z

I thought we all agreed that free can't be pure?

I have read this discussion and the thread @wilzbach linked to, but I saw (so far) no convincing argument as to why free should not be pure (in D's definition of pure, which is weakly pure w.r.t. functional PL's pure)
IMHO if we allow heap allocations to be pure, so should heap deallocations; we are inconsistent otherwise.

JackStouffer · 2017-07-05T14:58:59Z

@Calrama Check out this discussion #1718

Ping @schveiguy

wilzbach · 2017-07-05T15:20:08Z

FWIW it doesn't matter for the pureness considerations. For example many functions in std.math can set errno as well, but they are considered as pure.
That's a bug then. Can you please make a bugzilla issue?

See e.g. ilogb, but for the most part I was confusing it as std.math uses assembler intrinsics via core.math and just ignores the error masks on the FPU. Isn't this the same as resetting the erno to its previous result after the allocation?

MoritzMaxeiner · 2017-07-05T15:21:18Z

@JackStouffer Thank you. AFAICT the described issue lies with people casting away immutable inside a pure function and then passing that to free, because the compiler is free to remove the call to free as it has no return value (and thus no effect by the rules of pure)?
What I don't understand there is this: pure functions are required to only take immutable arguments, so shouldn't casting away immutable from an argument inside pure be illegal, i.e. produce a compiler error?

jmdavis · 2017-07-05T17:02:45Z

What I don't understand there is this: pure functions are required to only take immutable arguments, so shouldn't casting away immutable from an argument inside pure be illegal, i.e. produce a compiler error?

pure functions have not been required to take immutable arguments for years now. I would suggest that you read this: http://klickverbot.at/blog/2012/05/purity-in-d/

MoritzMaxeiner · 2017-07-05T17:26:52Z

pure functions have not been required to take immutable arguments for years now.

Sorry, temporary insanity caused by looking at the FAQ, which still states that.

andralex · 2017-07-26T22:59:56Z

@Calrama could you please post an issue against https://dlang.org/const-faq.html#invariant? Better yet, a PR fixing it. Thanks!

andralex

Let's get this moving. Thanks!

andralex · 2017-07-26T23:02:11Z

src/core/memory.d

    return ret;
 }
+/// ditto
+void pureFree(void* ptr) @system pure @nogc nothrow


Yah, let's set and restore it for now regardless. We may change that later as an optimization. Thanks!

andralex · 2017-07-26T23:03:05Z

src/core/memory.d

+ * Pure variants of C's memory allocation functions `malloc`, `calloc`, and
+ * `realloc` and deallocation function `free`.
+ *
+ * Purity for allocation is achieved via resetting the `errno` to it's value


s/Purity for allocation is achieved via resetting the errno to it's value/Purity for allocation is achieved by saving and restoring the value of errno, thus behaving as if it were never changed/

andralex · 2017-07-26T23:03:38Z

cc @nordlow

MoritzMaxeiner · 2017-07-26T23:48:56Z

@andralex Sure, PR is open

schveiguy · 2017-07-27T14:15:00Z

The issue with pure free is that it could be elided by the compiler. pure malloc doesn't have the same problem because of the return type.

What is going to happen is that someone will need to free immutable or const data. They will cast away the immutable and call pureFree. Then that function will be elided - and memory will leak.

The only saving grace of this is that we are not actually marking the real free function, so when we have to rescind this in the future due to subtle nasty memory leaks, not much will be affected, or at least code that is affected will have opted in to the idea that free can be pure.

I'm not going to continually have this argument (please see the previous PR), it seems enough people want to try this doomed experiment, so go ahead.

nordlow · 2017-07-27T14:39:11Z

Why should the compiler elide successive calls to a (weakly) pure function taking mutable references?

Pardon my short memory, if this question has already been answered somewhere, @schveiguy @andralex

I'll update this PR in a couple of hours.

andralex · 2017-07-27T14:55:24Z

I'm not going to continually have this argument (please see the previous PR), it seems enough people want to try this doomed experiment, so go ahead.

Steve, I'm hoping for a different tack on this. Yes, it is clear there is a problem. The problem stands in the way of people doing things that are useful. So we need to find a solution to the problem. Stating that the problem is not solvable and "can't do that" has low informational entropy - a bunch of folks will confirm they can't solve it, starting with the mailman ringing the door right now. What is interesting and good and what we should aim for is the high information response - a creative solution that solves the problem. What are your thoughts on that?

schveiguy · 2017-07-27T17:35:04Z

I didn't say the problem wasn't solvable, I said that solving it this way will likely lead to subtle hard-to-detect leaks.

My recommendation (as before) is to let people tackle this using private declarations of pureFree in their own applications. For instance using it in the RefCounted struct. I think with that level of control, you can make sure it's not elided.

Note that this all really depends on the type, not the call. If the type is not fully immutable/const, then the compiler cannot elide pure calls which use it. In the case of RefCounted, the reference count is always mutable, so I'm assuming that the compiler sees this and won't allow elision.

We can't make such guarantees if we simply provide an easy way to do it. So we let the user make the guarantees on their own.

It's like lock-free programming. There's lots of wrong ways to do it, and in most cases the wrong ways still work most of the time. It's those times it doesn't work where you are in big trouble.

schveiguy · 2017-07-27T17:36:52Z

Why should the compiler elide successive calls to a function taking mutable references?

It's not the call to pureFree that would be elided, but the call to a function that casts away immutable and calls pureFree.

And it's not successive calls either, it would be all the calls. A pure function that takes no mutable references and returns nothing can simply be elided by the compiler 100% of the time.

MoritzMaxeiner · 2017-07-27T20:07:48Z

It's not the call to pureFree that would be elided, but the call to a function that casts away immutable and calls pureFree.

I see the problem, but there is one remark I have: The spec states that

Immutable data values, once constructed, remain the same for the duration of the program's execution.

From that I would assume that casting away immutable in order to free an object is illegal in D (and as such UB), i.e. an immutable object must remain alive for the entirety of the program's remaining execution time.

schveiguy · 2017-07-27T20:21:33Z

From that I would assume that casting away immutable in order to free an object is illegal in D

If there is no other reference, and you are about to forget the last reference (i.e. you are freeing it), then how is it illegal? For all intents and purposes, that piece of memory that you no longer have access to may very well still have the value assigned to it. In fact, free may not even change the value, but just store it away for safe keeping!

We are already accepting that malloc/free isn't modifying global data, even though it really is. We have to have some level of practicality that you can free immutable data as long as you no longer reference it.

Note that freeing immutable objects already cast away immutable (the destructor is not marked immutable).

It would be a good idea to update the spec for this.

MoritzMaxeiner · 2017-07-27T20:42:28Z

If there is no other reference, and you are about to forget the last reference (i.e. you are freeing it), then how is it illegal?

Because the spec - as it is currently worded - explicitly forbids the freeing of immutable objects. Once constructed they must remain alive for the remainder of the program's execution, irrespective of whether you are still using them.

It would be a good idea to update the spec for this.

If we want it to be legal for immutable objects to be freed, yes.

schveiguy · 2017-07-27T20:45:13Z

Because the spec - as it is currently worded - explicitly forbids the freeing of immutable objects.

Then the GC is in violation of the spec. Again, we have to put aside these kinds of requirements when talking about allocators. This is like the old question, if a tree falls in a forest but nobody is around, does it make a sound? Similarly, if nothing can access a piece of immutable memory, is it illegal to modify it?

MoritzMaxeiner · 2017-07-27T20:51:40Z

Then the GC is in violation of the spec.

Then either the GC or the spec needs to be fixed with whatever we end up deciding to be the way to go forward. In case of the GC being fixed, it would have to store an additional list of memory locations where immutable objects were constructed and only collect them when the D runtime shuts down.

Similarly, if nothing can access a piece of immutable memory, is it illegal to modify it?

In my opinion yes, because immutable objects need not even be stored in writable memory.

Immutable data can be placed in ROM (Read Only Memory) or in memory pages marked by the hardware as read only

schveiguy · 2017-07-27T21:04:26Z

In my opinion yes, because immutable data need not even be stored in writable memory

Sorry I thought in this context we were only talking about allocated RAM, not ROM.

MoritzMaxeiner · 2017-07-27T21:17:07Z

Sorry I thought in this context we were only talking about allocated RAM, not ROM.

For one thing, you can have write protected RAM, using e.g. mprotect on Linux; it would be perfectly valid (according to the current spec) to store immutable objects in memory pages that are (generally) write protected and are only temporarily writeable for construction of new immutable objects.
For another, when looking at an immutable object's address in a process' virtual address space, how would you be able to differentiate between that address being mapped into ROM and into RAM?

dlang-bot · 2017-07-27T23:24:54Z

Thanks for your pull request, @nordlow! We are looking forward to reviewing it, and you should be hearing from a maintainer soon.

Some tips to help speed things up:

smaller, focused PRs are easier to review than big ones
try not to mix up refactoring or style changes with bug fixes or feature enhancements
provide helpful commit messages explaining the rationale behind each change

Bear in mind that large or tricky changes may require multiple rounds of review and revision.

Please see CONTRIBUTING.md for more information.

Bugzilla references

Your PR doesn't reference any Bugzilla issue.

If your PR contains non-trivial changes, please reference a Bugzilla issue or create a manual changelog.

nordlow · 2017-07-27T23:25:18Z

Updates made according to comments.

jmdavis · 2017-07-28T01:58:35Z

It seems to me that given the issues with call elision that Steven brought up, pureFree needs a warning on it explaining the issue, because it's exactly the sort of thing that most folks won't have a clue about. They'll just blindly use it when they want to use free in pure code without understanding when it's safe to use it or not, and they'll end up leaking memory. The fact that Steven keeps having to explain it highlights the fact that it's not obvious.

andralex · 2017-07-28T02:03:20Z

@jmdavis a documentation PR would be very welcome - thanks!

FeepingCreature · 2018-06-05T14:34:09Z

General point: pure functions cannot be elided. Total functions can be elided; pure functions are not necessarily total, ie. they do not necessarily return. For instance, this is a pure function that will change the program's behavior if elided:

pure void foo() {
  while (true) {
  }
}

If a pure function is called more than once, all but the first call can be elided.

schveiguy · 2018-06-05T15:40:16Z

It all depends on how D defines it. A pure function that never returns isn't an interesting or desirable behavior. We could simply say that pure functions that don't ever return are undefined behavior, and let the compiler assume it must return at some point, allowing the obvious optimization. We could also dictate that it must be called the first time, but this means we will waste cycles calling functions that easily could be elided. Either way, it's not really a fundamental property of purity in general that the first call must be done.

After all, pure functions model mathematics, and I don't know if there's a math book that describes functions that never return ;)

aG0aep6G · 2018-06-05T16:16:43Z

We could simply say that pure functions that don't ever return are undefined behavior, and let the compiler assume it must return at some point, allowing the obvious optimization.

What about inferred pure?

void foo()() /* a template now */
{
  while (true) {}
}

The compiler will infer pure. Surely, we don't want foo() to have undefined behavior.

schveiguy · 2018-06-05T17:18:19Z

Surely, we don't want foo() to have undefined behavior.

I'd be fine with the compiler assuming that pure functions always return. I'm not convinced that it's detrimental for a function that never returns to be elided if it does nothing observable inside.

I'm trying to see the benefit of having an opaque function like void pure foo(); not be elided. "because it might have an infinite loop" doesn't seem worth giving up on that optimization benefit.

FeepingCreature · 2018-06-05T17:45:15Z

I want to emphasize that Phobos has had actual production bugs due to confusing pure and total functions.

schveiguy · 2018-06-05T18:46:08Z

Great, it should be easy to show then! Can you point me at the bug? The bug report I saw you posted on the forums doesn't have anything to do with strong pure functions.

FeepingCreature · 2018-06-05T20:46:05Z

Ah, you're right. What happened was that the compiler removed any call returning a struct with an enum property to be accessed, which would be wrong regardless, and I had to correct the pull request to avoid the issue with pure as I went. So at no point did the compiler actually have this bug.

In my defense, I'll note that the incorrect code was recommended to me multiple times, explicitly stating it was permissible due to pure, so it was a close thing.

So I'd restate things as this: that we'd almost had this bug in dmd master, and it would have led to the precise issue of a function that was effectively while (true), namely a find call on an infinite range, which would have in fact been inferred as pure, being treated as if it returned when it didn't, and as a result causing an innocuous D expression, true.repeat, which I've in fact seen used in production, to exhibit wrong behavior.

schveiguy · 2018-06-05T21:01:03Z

OK, so if I can be a bit pedantic, what I think you are saying is that something like true.repeat.find(false) would somehow be optimized into returning something instead of going into the infinite loop it should have? What would it optimize into, just an empty range?

I'm not sure if I got it right, but I'm still not sure where a strong-pure function makes sense to run just because it might contain an infinite loop (which IMO is clearly a programming error).

FeepingCreature · 2018-06-05T21:37:44Z

That's exactly what happens, since find returns the remaining range, which is statically known (enum empty = false;) to be infinite. So if your method elides the call to find, it will conclude that find returned a nonempty range, indicating it found the item.

schveiguy · 2018-06-06T15:14:34Z

OK, I see what you are saying (I think):

true.repeat is comprised solely of enum and no data. So if you have a function that returns that, the return value basically cannot vary based on the inputs.

Since all the inputs and outputs are immutable or enum, this is tagged as a strong pure function. The logic in the compiler could deduce that it must ALWAYS return that range, which indicates the value (false) was found. But the real answer is that it cannot find that item, and it should never actually return.

I see the flaw assuming that there is no infinite loop in any strong-pure function, and definitely this cannot be allowed to happen. Thanks for giving the clear example.

I still am concerned with applying the same restriction to void-returning functions. In this case, the control flow of the program (assuming it returns) is not affected based on anything the function does. One cannot assume incorrectly anything based on the return value. It truly is a "do nothing" call. It's a different thing to assume the return value based on the type than it is to assume it returns at all. In one case, the two are linked, due to the immutable property of the type, but in the other case, the return value doesn't exist, so it's not relevant. Literally the ONLY thing a strong-pure void-returning function can do is either use a finite amount of time to execute, or never return.

I'd prefer simply never calling a void strong-pure function to calling it only the first time. This allows generic programming that doesn't need to be hand-optimized based on some admittedly strange criteria.

FeepingCreature · 2018-06-06T15:55:30Z

It does seem like a void-returning strong pure function should not fulfill any purpose. This does feel like the setup for a really subtle bug somewhere though.

nordlow force-pushed the pureFree branch 3 times, most recently from 39d84f2 to 081c062 Compare June 11, 2017 11:13

MoritzMaxeiner suggested changes Jul 5, 2017

View reviewed changes

andralex approved these changes Jul 26, 2017

View reviewed changes

schveiguy mentioned this pull request Jul 27, 2017

Qualify free(void*) as pure #1718

Closed

nordlow force-pushed the pureFree branch from 081c062 to 64aa688 Compare July 27, 2017 23:24

Add pureFree to core.memory

46cc93e

nordlow force-pushed the pureFree branch from 64aa688 to 46cc93e Compare July 27, 2017 23:30

andralex approved these changes Jul 28, 2017

View reviewed changes

andralex merged commit 0329ee1 into dlang:master Jul 28, 2017

nordlow deleted the pureFree branch July 28, 2017 08:57

Uh oh!

Conversation

nordlow commented Jun 11, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

nordlow commented Jun 11, 2017

Uh oh!

wilzbach commented Jun 11, 2017

Uh oh!

MoritzMaxeiner left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

nordlow Jul 27, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

JackStouffer commented Jul 5, 2017

Uh oh!

MoritzMaxeiner commented Jul 5, 2017

Uh oh!

wilzbach commented Jul 5, 2017

Uh oh!

wilzbach commented Jul 5, 2017

Uh oh!

JackStouffer commented Jul 5, 2017

Uh oh!

MoritzMaxeiner commented Jul 5, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

MoritzMaxeiner commented Jul 5, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

JackStouffer commented Jul 5, 2017

Uh oh!

wilzbach commented Jul 5, 2017

Uh oh!

MoritzMaxeiner commented Jul 5, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jmdavis commented Jul 5, 2017

Uh oh!

MoritzMaxeiner commented Jul 5, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

andralex commented Jul 26, 2017

Uh oh!

andralex left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

andralex commented Jul 26, 2017

Uh oh!

MoritzMaxeiner commented Jul 26, 2017

Uh oh!

schveiguy commented Jul 27, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

nordlow commented Jun 11, 2017 •

edited

Loading

nordlow Jul 27, 2017 •

edited

Loading

MoritzMaxeiner commented Jul 5, 2017 •

edited

Loading

MoritzMaxeiner commented Jul 5, 2017 •

edited

Loading

MoritzMaxeiner commented Jul 5, 2017 •

edited

Loading

MoritzMaxeiner commented Jul 5, 2017 •

edited

Loading

schveiguy commented Jul 27, 2017 •

edited

Loading

nordlow commented Jul 27, 2017 •

edited

Loading

MoritzMaxeiner commented Jul 27, 2017 •

edited

Loading

schveiguy commented Jul 27, 2017 •

edited

Loading

MoritzMaxeiner commented Jul 27, 2017 •

edited

Loading

MoritzMaxeiner commented Jul 27, 2017 •

edited

Loading

MoritzMaxeiner commented Jul 27, 2017 •

edited

Loading

FeepingCreature commented Jun 5, 2018 •

edited

Loading