Skip to content
This repository was archived by the owner on Oct 12, 2022. It is now read-only.
/ druntime Public archive

Add pureFree to core.memory#1836

Merged
andralex merged 1 commit intodlang:masterfrom
nordlow:pureFree
Jul 28, 2017
Merged

Add pureFree to core.memory#1836
andralex merged 1 commit intodlang:masterfrom
nordlow:pureFree

Conversation

@nordlow
Copy link
Contributor

@nordlow nordlow commented Jun 11, 2017

To be used in contexts such as RefCounted-destructor instead of having to define local overloads such as here https://github.com/dlang/phobos/pull/4832/files#diff-4e008aedb3026d4a84f58323e53bf017R5087.

AFAICT, free doesn't set errno.

@nordlow nordlow force-pushed the pureFree branch 3 times, most recently from 39d84f2 to 081c062 Compare June 11, 2017 11:13
@nordlow
Copy link
Contributor Author

nordlow commented Jun 11, 2017

What's wrong with the continuous-integration/jenkins/pr-merge?

@wilzbach
Copy link
Contributor

What's wrong with the continuous-integration/jenkins/pr-merge?

It's unrelated - it's a regression introduced in dlang/phobos#5427

Copy link
Contributor

@MoritzMaxeiner MoritzMaxeiner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • pureFree needs to save&restore errno
  • modified unittests should be marked @system


///
nothrow @nogc unittest
pure nothrow @nogc unittest
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you modify this unittest, shouldn't it also be marked as @system ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why should it be marked as @system?

}

pure @nogc nothrow unittest
pure nothrow @nogc unittest
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here w.r.t adding @system

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GH won't let me comment on unchanged lines, so I'll put add it here at the last changed line:
If you expose fakePureFree via pureFree, the needed by unittests comment at the fakePureFree declaration should be removed.

Copy link
Contributor Author

@nordlow nordlow Jul 27, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed the comment at fakePureFree.

* Pure variants of C's memory allocation functions `malloc`, `calloc`, and
* `realloc` and deallocation function `free`.
*
* Purity for allocation is achieved via resetting the `errno` to it's value
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's -> its

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since deallocation must also save&restore errno (see comment at pureFree):
Purity for allocation -> Purity for allocation and deallocation

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/Purity for allocation is achieved via resetting the errno to it's value/Purity for allocation is achieved by saving and restoring the value of errno, thus behaving as if it were never changed/

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to its.

return ret;
}
/// ditto
void pureFree(void* ptr) @system pure @nogc nothrow
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

free is not guaranteed not to set errno by current standards; libc implementations may choose to set errno for both valid and invalid pointers.
POSIX.1 Issue 8 (still WIP), is going to forbid free to set errno for valid pointers, but explicitly allows setting it for an invalid pointers.
AFAIK there's no way for us to know whether the used libc will be linked to statically or dynamically and in the latter case there's no way AFAIK to know if the libc does or does not set errno.
It follows that pureFree must save&restore the errno value the same way that the allocation functions do:

const errno = fakePureGetErrno();
fakePureFree(ptr);
if (errno != 0) fakePureSetErrno(errno);

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yah, let's set and restore it for now regardless. We may change that later as an optimization. Thanks!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@JackStouffer
Copy link
Contributor

I thought we all agreed that free can't be pure?

@MoritzMaxeiner
Copy link
Contributor

AFAICT, free doesn't set errno

This may be true for your libc, but it's not guaranteed by ISO C or POSIX. The WIP POSIX.1 Issue 8 even explicitly mentions that implementations are free to set errno if they receive an invalid pointer as input.

@wilzbach
Copy link
Contributor

wilzbach commented Jul 5, 2017

AFAICT, free doesn't set errno.
free is not guaranteed not to set errno by current standards;

FWIW it doesn't matter for the pureness considerations. For example many functions in std.math can set errno as well, but they are considered as pure.

@wilzbach
Copy link
Contributor

wilzbach commented Jul 5, 2017

AFAICT, free doesn't set errno.

Linking to the previous threaD: #1746

@JackStouffer
Copy link
Contributor

FWIW it doesn't matter for the pureness considerations. For example many functions in std.math can set errno as well, but they are considered as pure.

That's a bug then. Can you please make a bugzilla issue?

@MoritzMaxeiner
Copy link
Contributor

MoritzMaxeiner commented Jul 5, 2017

FWIW it doesn't matter for the pureness considerations. For example many functions in std.math can set errno as well, but they are considered as pure.

errno is global state. Changing it is impure, so those are bugs.

@MoritzMaxeiner
Copy link
Contributor

MoritzMaxeiner commented Jul 5, 2017

I thought we all agreed that free can't be pure?

I have read this discussion and the thread @wilzbach linked to, but I saw (so far) no convincing argument as to why free should not be pure (in D's definition of pure, which is weakly pure w.r.t. functional PL's pure)
IMHO if we allow heap allocations to be pure, so should heap deallocations; we are inconsistent otherwise.

@JackStouffer
Copy link
Contributor

@Calrama Check out this discussion #1718

Ping @schveiguy

@wilzbach
Copy link
Contributor

wilzbach commented Jul 5, 2017

FWIW it doesn't matter for the pureness considerations. For example many functions in std.math can set errno as well, but they are considered as pure.
That's a bug then. Can you please make a bugzilla issue?

See e.g. ilogb, but for the most part I was confusing it as std.math uses assembler intrinsics via core.math and just ignores the error masks on the FPU. Isn't this the same as resetting the erno to its previous result after the allocation?

@MoritzMaxeiner
Copy link
Contributor

MoritzMaxeiner commented Jul 5, 2017

@JackStouffer Thank you. AFAICT the described issue lies with people casting away immutable inside a pure function and then passing that to free, because the compiler is free to remove the call to free as it has no return value (and thus no effect by the rules of pure)?
What I don't understand there is this: pure functions are required to only take immutable arguments, so shouldn't casting away immutable from an argument inside pure be illegal, i.e. produce a compiler error?

@jmdavis
Copy link
Member

jmdavis commented Jul 5, 2017

What I don't understand there is this: pure functions are required to only take immutable arguments, so shouldn't casting away immutable from an argument inside pure be illegal, i.e. produce a compiler error?

pure functions have not been required to take immutable arguments for years now. I would suggest that you read this: http://klickverbot.at/blog/2012/05/purity-in-d/

@MoritzMaxeiner
Copy link
Contributor

MoritzMaxeiner commented Jul 5, 2017

pure functions have not been required to take immutable arguments for years now.

Sorry, temporary insanity caused by looking at the FAQ, which still states that.

@andralex
Copy link
Member

@Calrama could you please post an issue against https://dlang.org/const-faq.html#invariant? Better yet, a PR fixing it. Thanks!

Copy link
Member

@andralex andralex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's get this moving. Thanks!

return ret;
}
/// ditto
void pureFree(void* ptr) @system pure @nogc nothrow
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yah, let's set and restore it for now regardless. We may change that later as an optimization. Thanks!

* Pure variants of C's memory allocation functions `malloc`, `calloc`, and
* `realloc` and deallocation function `free`.
*
* Purity for allocation is achieved via resetting the `errno` to it's value
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/Purity for allocation is achieved via resetting the errno to it's value/Purity for allocation is achieved by saving and restoring the value of errno, thus behaving as if it were never changed/

@andralex
Copy link
Member

cc @nordlow

@MoritzMaxeiner
Copy link
Contributor

@andralex Sure, PR is open

@schveiguy
Copy link
Member

schveiguy commented Jul 27, 2017

The issue with pure free is that it could be elided by the compiler. pure malloc doesn't have the same problem because of the return type.

What is going to happen is that someone will need to free immutable or const data. They will cast away the immutable and call pureFree. Then that function will be elided - and memory will leak.

The only saving grace of this is that we are not actually marking the real free function, so when we have to rescind this in the future due to subtle nasty memory leaks, not much will be affected, or at least code that is affected will have opted in to the idea that free can be pure.

I'm not going to continually have this argument (please see the previous PR), it seems enough people want to try this doomed experiment, so go ahead.

@nordlow
Copy link
Contributor Author

nordlow commented Jul 27, 2017

Why should the compiler elide successive calls to a (weakly) pure function taking mutable references?

Pardon my short memory, if this question has already been answered somewhere, @schveiguy @andralex

I'll update this PR in a couple of hours.

@andralex
Copy link
Member

I'm not going to continually have this argument (please see the previous PR), it seems enough people want to try this doomed experiment, so go ahead.

Steve, I'm hoping for a different tack on this. Yes, it is clear there is a problem. The problem stands in the way of people doing things that are useful. So we need to find a solution to the problem. Stating that the problem is not solvable and "can't do that" has low informational entropy - a bunch of folks will confirm they can't solve it, starting with the mailman ringing the door right now. What is interesting and good and what we should aim for is the high information response - a creative solution that solves the problem. What are your thoughts on that?

@schveiguy
Copy link
Member

I didn't say the problem wasn't solvable, I said that solving it this way will likely lead to subtle hard-to-detect leaks.

My recommendation (as before) is to let people tackle this using private declarations of pureFree in their own applications. For instance using it in the RefCounted struct. I think with that level of control, you can make sure it's not elided.

Note that this all really depends on the type, not the call. If the type is not fully immutable/const, then the compiler cannot elide pure calls which use it. In the case of RefCounted, the reference count is always mutable, so I'm assuming that the compiler sees this and won't allow elision.

We can't make such guarantees if we simply provide an easy way to do it. So we let the user make the guarantees on their own.

It's like lock-free programming. There's lots of wrong ways to do it, and in most cases the wrong ways still work most of the time. It's those times it doesn't work where you are in big trouble.

@schveiguy
Copy link
Member

Why should the compiler elide successive calls to a function taking mutable references?

It's not the call to pureFree that would be elided, but the call to a function that casts away immutable and calls pureFree.

And it's not successive calls either, it would be all the calls. A pure function that takes no mutable references and returns nothing can simply be elided by the compiler 100% of the time.

@MoritzMaxeiner
Copy link
Contributor

MoritzMaxeiner commented Jul 27, 2017

It's not the call to pureFree that would be elided, but the call to a function that casts away immutable and calls pureFree.

I see the problem, but there is one remark I have: The spec states that

Immutable data values, once constructed, remain the same for the duration of the program's execution.

From that I would assume that casting away immutable in order to free an object is illegal in D (and as such UB), i.e. an immutable object must remain alive for the entirety of the program's remaining execution time.

@schveiguy
Copy link
Member

schveiguy commented Jul 27, 2017

From that I would assume that casting away immutable in order to free an object is illegal in D

If there is no other reference, and you are about to forget the last reference (i.e. you are freeing it), then how is it illegal? For all intents and purposes, that piece of memory that you no longer have access to may very well still have the value assigned to it. In fact, free may not even change the value, but just store it away for safe keeping!

We are already accepting that malloc/free isn't modifying global data, even though it really is. We have to have some level of practicality that you can free immutable data as long as you no longer reference it.

Note that freeing immutable objects already cast away immutable (the destructor is not marked immutable).

It would be a good idea to update the spec for this.

@MoritzMaxeiner
Copy link
Contributor

MoritzMaxeiner commented Jul 27, 2017

If there is no other reference, and you are about to forget the last reference (i.e. you are freeing it), then how is it illegal?

Because the spec - as it is currently worded - explicitly forbids the freeing of immutable objects. Once constructed they must remain alive for the remainder of the program's execution, irrespective of whether you are still using them.

It would be a good idea to update the spec for this.

If we want it to be legal for immutable objects to be freed, yes.

@schveiguy
Copy link
Member

Because the spec - as it is currently worded - explicitly forbids the freeing of immutable objects.

Then the GC is in violation of the spec. Again, we have to put aside these kinds of requirements when talking about allocators. This is like the old question, if a tree falls in a forest but nobody is around, does it make a sound? Similarly, if nothing can access a piece of immutable memory, is it illegal to modify it?

@MoritzMaxeiner
Copy link
Contributor

MoritzMaxeiner commented Jul 27, 2017

Then the GC is in violation of the spec.

Then either the GC or the spec needs to be fixed with whatever we end up deciding to be the way to go forward. In case of the GC being fixed, it would have to store an additional list of memory locations where immutable objects were constructed and only collect them when the D runtime shuts down.

Similarly, if nothing can access a piece of immutable memory, is it illegal to modify it?

In my opinion yes, because immutable objects need not even be stored in writable memory.

Immutable data can be placed in ROM (Read Only Memory) or in memory pages marked by the hardware as read only

@schveiguy
Copy link
Member

In my opinion yes, because immutable data need not even be stored in writable memory

Sorry I thought in this context we were only talking about allocated RAM, not ROM.

@MoritzMaxeiner
Copy link
Contributor

MoritzMaxeiner commented Jul 27, 2017

Sorry I thought in this context we were only talking about allocated RAM, not ROM.

For one thing, you can have write protected RAM, using e.g. mprotect on Linux; it would be perfectly valid (according to the current spec) to store immutable objects in memory pages that are (generally) write protected and are only temporarily writeable for construction of new immutable objects.
For another, when looking at an immutable object's address in a process' virtual address space, how would you be able to differentiate between that address being mapped into ROM and into RAM?

@dlang-bot
Copy link
Contributor

Thanks for your pull request, @nordlow! We are looking forward to reviewing it, and you should be hearing from a maintainer soon.

Some tips to help speed things up:

  • smaller, focused PRs are easier to review than big ones

  • try not to mix up refactoring or style changes with bug fixes or feature enhancements

  • provide helpful commit messages explaining the rationale behind each change

Bear in mind that large or tricky changes may require multiple rounds of review and revision.

Please see CONTRIBUTING.md for more information.

Bugzilla references

Your PR doesn't reference any Bugzilla issue.

If your PR contains non-trivial changes, please reference a Bugzilla issue or create a manual changelog.

@nordlow
Copy link
Contributor Author

nordlow commented Jul 27, 2017

Updates made according to comments.

@andralex andralex merged commit 0329ee1 into dlang:master Jul 28, 2017
@jmdavis
Copy link
Member

jmdavis commented Jul 28, 2017

It seems to me that given the issues with call elision that Steven brought up, pureFree needs a warning on it explaining the issue, because it's exactly the sort of thing that most folks won't have a clue about. They'll just blindly use it when they want to use free in pure code without understanding when it's safe to use it or not, and they'll end up leaking memory. The fact that Steven keeps having to explain it highlights the fact that it's not obvious.

@andralex
Copy link
Member

@jmdavis a documentation PR would be very welcome - thanks!

@nordlow nordlow deleted the pureFree branch July 28, 2017 08:57
@FeepingCreature
Copy link
Contributor

FeepingCreature commented Jun 5, 2018

General point: pure functions cannot be elided. Total functions can be elided; pure functions are not necessarily total, ie. they do not necessarily return. For instance, this is a pure function that will change the program's behavior if elided:

pure void foo() {
  while (true) {
  }
}

If a pure function is called more than once, all but the first call can be elided.

@schveiguy
Copy link
Member

It all depends on how D defines it. A pure function that never returns isn't an interesting or desirable behavior. We could simply say that pure functions that don't ever return are undefined behavior, and let the compiler assume it must return at some point, allowing the obvious optimization. We could also dictate that it must be called the first time, but this means we will waste cycles calling functions that easily could be elided. Either way, it's not really a fundamental property of purity in general that the first call must be done.

After all, pure functions model mathematics, and I don't know if there's a math book that describes functions that never return ;)

@aG0aep6G
Copy link
Contributor

aG0aep6G commented Jun 5, 2018

We could simply say that pure functions that don't ever return are undefined behavior, and let the compiler assume it must return at some point, allowing the obvious optimization.

What about inferred pure?

void foo()() /* a template now */
{
  while (true) {}
}

The compiler will infer pure. Surely, we don't want foo() to have undefined behavior.

@schveiguy
Copy link
Member

Surely, we don't want foo() to have undefined behavior.

I'd be fine with the compiler assuming that pure functions always return. I'm not convinced that it's detrimental for a function that never returns to be elided if it does nothing observable inside.

I'm trying to see the benefit of having an opaque function like void pure foo(); not be elided. "because it might have an infinite loop" doesn't seem worth giving up on that optimization benefit.

@FeepingCreature
Copy link
Contributor

I want to emphasize that Phobos has had actual production bugs due to confusing pure and total functions.

@schveiguy
Copy link
Member

schveiguy commented Jun 5, 2018

Great, it should be easy to show then! Can you point me at the bug? The bug report I saw you posted on the forums doesn't have anything to do with strong pure functions.

@FeepingCreature
Copy link
Contributor

FeepingCreature commented Jun 5, 2018

Ah, you're right. What happened was that the compiler removed any call returning a struct with an enum property to be accessed, which would be wrong regardless, and I had to correct the pull request to avoid the issue with pure as I went. So at no point did the compiler actually have this bug.

In my defense, I'll note that the incorrect code was recommended to me multiple times, explicitly stating it was permissible due to pure, so it was a close thing.

So I'd restate things as this: that we'd almost had this bug in dmd master, and it would have led to the precise issue of a function that was effectively while (true), namely a find call on an infinite range, which would have in fact been inferred as pure, being treated as if it returned when it didn't, and as a result causing an innocuous D expression, true.repeat, which I've in fact seen used in production, to exhibit wrong behavior.

@schveiguy
Copy link
Member

OK, so if I can be a bit pedantic, what I think you are saying is that something like true.repeat.find(false) would somehow be optimized into returning something instead of going into the infinite loop it should have? What would it optimize into, just an empty range?

I'm not sure if I got it right, but I'm still not sure where a strong-pure function makes sense to run just because it might contain an infinite loop (which IMO is clearly a programming error).

@FeepingCreature
Copy link
Contributor

FeepingCreature commented Jun 5, 2018

That's exactly what happens, since find returns the remaining range, which is statically known (enum empty = false;) to be infinite. So if your method elides the call to find, it will conclude that find returned a nonempty range, indicating it found the item.

@schveiguy
Copy link
Member

OK, I see what you are saying (I think):

true.repeat is comprised solely of enum and no data. So if you have a function that returns that, the return value basically cannot vary based on the inputs.

Since all the inputs and outputs are immutable or enum, this is tagged as a strong pure function. The logic in the compiler could deduce that it must ALWAYS return that range, which indicates the value (false) was found. But the real answer is that it cannot find that item, and it should never actually return.

I see the flaw assuming that there is no infinite loop in any strong-pure function, and definitely this cannot be allowed to happen. Thanks for giving the clear example.

I still am concerned with applying the same restriction to void-returning functions. In this case, the control flow of the program (assuming it returns) is not affected based on anything the function does. One cannot assume incorrectly anything based on the return value. It truly is a "do nothing" call. It's a different thing to assume the return value based on the type than it is to assume it returns at all. In one case, the two are linked, due to the immutable property of the type, but in the other case, the return value doesn't exist, so it's not relevant. Literally the ONLY thing a strong-pure void-returning function can do is either use a finite amount of time to execute, or never return.

I'd prefer simply never calling a void strong-pure function to calling it only the first time. This allows generic programming that doesn't need to be hand-optimized based on some admittedly strange criteria.

@FeepingCreature
Copy link
Contributor

It does seem like a void-returning strong pure function should not fulfill any purpose. This does feel like the setup for a really subtle bug somewhere though.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

10 participants