proposal: resource type #494
Comments
Here's an actual real example of code that could benefit from the semantics in this proposal: https://github.com/thejoshwolfe/consoline/blob/master/consoline.h (search for "free").
One more aspect to consider is an example from the link above, where a function returns a disposable array of disposable strings. I'm not sure if/how this proposal relates to that use case, but below is what I would write in Zig, and then I added some
// the # in the result means this function is an allocator.
fn freeThese() -> %#[]#[]u8 {
    var result = %return allocate([3]#[]u8);
    // result starts with 0's, which means freeSlice(result[i]) starts out as a no-op.
    %defer freeSliceOfSlice(u8, result);
    {var i: usize = 0; while (i < result.len) : (i += 1) {
        result[i] = %return allocate(["free me".len]u8);
        mem.copy(u8, result[i], "free me");
    }}
    return result;
}
// the # in the parameter means this function is a deallocator.
fn freeSlice(comptime T: type, slice: #[]T) {
    if (slice.len == 0) return;
    free(slice.ptr);
    // at this point, the @typeOf(slice) is []T
}
// the # in the parameter means this function is a deallocator.
fn freeSliceOfSlice(comptime T: type, slice: #[]#[]T) {
    {var i: usize = 0; while (i < result.len) : (i += 1) {
        freeSlice(T, result[i]);
    }}
    // at this point, the @typeOf(slice) is #[][]T
    freeSlice([]T, result);
    // at this point, the @typeOf(slice) is [][]T
}
fn caller() -> %void {
    var array: #[]#[]u8 = %return freeThese();
    defer freeSliceOfSlice(u8, result);
    // at this point, the @typeOf(array) is [][]u8
    handleArray(array);
}
fn handleArray(array: []const[]const u8) {
    // now the #'s are gone, because they are handled outside this function.
}
The above is a simplification of the use case from the link above where a function provides a list of autocomplete suggestions as a list of strings. In that use case, it made sense to me that the strings and the list itself would all be deleted by the receiver of the suggestions. The
All this analysis might be possible, but this is also starting to look a lot like Rust's borrow checking. Is that where this is headed? |
I think this is quite interesting. The purpose of this is to help enforce handling of the "resource," right? One could extend this a bit by having the resource allocation/creation function specify the default cleanup handler:
Now when you use it, you get a form of RAII:
In this case, resource_cleanup is automatically called. If you use your ## notation, then that is skipped. I think most of the use cases will be using a "standard" cleanup function anyway, so you are going to be removing boilerplate most of the time. It does have the disadvantage that you are now starting to have some "magic" in the code, but in this case the idiom would be so common that I think it might be OK. If the compiler can determine that you did not clean up something, then it should be able to correct that when given a default cleanup function. I am not thrilled about the syntax of where the defer is done, but it does keep it all in one place. I would keep the use of the sigil in both the allocator and the cleanup function for symmetry. The allocator returns a type of '#foo', so the cleanup function should take that as its argument. |
@thejoshwolfe, I am not sure what #[]#[]u8 would mean? How would you allocate that? If you were filling up an array of strings, it seems like you would need to strip off the "#" for the individual strings. I think that this does not really nest all that clearly. Would you have the string destructor inside the outer array destructor? If you did that, then you would have to use '##' or something... Note that I am waving my hands vigorously as to how this might be implemented. A simple function pointer prepended or appended to the underlying type would work, though that starts to become a lot of wrapping when combined with ? and %. |
@kyle-github I updated my comment above with much more detail (again). |
Sorry for my slowness here. It is a little more clear after the second edit. However, I am not tracking where "result" comes from in freeSliceOfSlice()... Copy/paste error? In the allocator, you have this:
There are a couple things here I am missing :-(
|
@kyle-github responding to your "default cleanup handler", it's too contrary to the zen of zig to have hidden control flow, like RAII destructors or default function calls at block exit. The way an idea like that would fit into Zig is that instead of the compiler implicitly calling a function, the compiler would just give a compile error until you write the function call explicitly in your code.
kyle-github brings up a new point, which is that it might not be enough to
So this leads to putting a comptime function pointer in the type of the object itself, which is going to be an awful lot of verbosity in the type.
pub fn ArrayList(comptime T: type) -> type{
    struct {
        const Self = this;
        ...
        pub fn init(allocator: &Allocator) -> @resource(Self, Self.deinit) {
            ...
        }
        ...
    }
}
@debugPrint(ArrayList(u8).init.return_type);
// @resource(ArrayList(u8), ArrayList(u8).deinit)
Again, this has a lot of associated hand waving. |
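The idea of carrying the cleanup function in the resource's type can be modeled at run time. The Python sketch below is only an analogy: Python has no compile-time types of this kind, so the check that the proposal would perform at compile time becomes an exception here. `Resource`, `release_with`, `CleanupMismatch`, and `wrong_cleanup` are hypothetical names invented for illustration, and this `ArrayList` is a stand-in, not the Zig standard library type.

```python
class CleanupMismatch(Exception):
    """Raised where the proposal would presumably report a compile error."""


class Resource:
    """Hypothetical run-time stand-in for @resource(T, cleanup)."""

    def __init__(self, value, cleanup):
        self.value = value
        self.cleanup = cleanup      # the one function allowed to release this value
        self.released = False

    def release_with(self, fn):
        # Only the designated cleanup function may consume the resource.
        if fn is not self.cleanup:
            raise CleanupMismatch("resource released with the wrong function")
        if self.released:
            raise CleanupMismatch("resource released twice")
        self.released = True
        fn(self.value)


class ArrayList:
    """Stand-in container with an init/deinit pair."""

    def __init__(self):
        self.items = []
        self.deinit_called = False

    @classmethod
    def init(cls):
        # Model of `init(...) -> @resource(Self, Self.deinit)`.
        return Resource(cls(), cls.deinit)

    def deinit(self):
        self.items.clear()
        self.deinit_called = True


def wrong_cleanup(_list):
    pass


res = ArrayList.init()
try:
    res.release_with(wrong_cleanup)       # rejected: not the designated cleanup
    mismatch_rejected = False
except CleanupMismatch:
    mismatch_rejected = True
res.release_with(ArrayList.deinit)        # accepted: matches the recorded function
```

The cost raised in the comment above is visible even in this model: every resource has to name its cleanup function, which is the verbosity the type would carry.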
@thejoshwolfe there is a slippery slope to constructors/destructors there :-) The point about allowing a default defer was that Zig already does a number of things to make certain idioms really clean (like all the
I guess I assumed that the resource concept would extend past memory to include pretty much anything that has a setup and teardown requirement. Perhaps a managed resource is different enough to make it a first class entity in Zig? To take @thejoshwolfe's example, perhaps there is a type, similar to
Or maybe something along the lines of Python's |
While throwing a compile error for not handling the resource seems a good way to handle this within the scope of Zig Zen, what about copying and moving said resources? Should there be a @copyResource(src, dst, with_function) and a similar move operation available? Would ownership be tracked, or would the resources be special immutable references/ptrs until you unwrap them? |
Since this is getting a bit far afield, what are the use cases that @andrewrk is trying to cover? Here was what I assumed:
How about something a little different:
Very, very handwaved, but I hope you can see the flow. As far as I can tell, this works with @thejoshwolfe's example of init/de-init for buffers that are passed by value. |
Er, looking at this, probably better to reverse the meaning of |
This is a bit speculative since I do not know how Zig is implemented in LLVM, but it looks like a few simple additions to compiler checking make the check for whether a managed resource is handled or not simple:
I think that this means that all managed/not managed checking is local to a function. |
@Ilariel, if we take the proposal that I am making and you implement your structs using the pimpl idiom from C++, then the direct copy should be sufficient in most cases (this is what it was designed to get around in C++). The shared parts are all in the implementation part and thus are automatically copied. Of course, that can lead down the rat hole of thinking about things like copy constructors if you are not careful. |
I'm responding to #494 (comment) and I haven't read anything below that yet.
Kotlin has ideas about types of things changing based on assertions and if statements and things. I'm not sure we want to go down that road. But I think you hit on something important with the resource deallocation function taking the
I also think
Here are my proposed edits:
fn freeThese() -> %#[]#[]u8 {
    const result = ##(%return allocate([]u8, 3));
    // set all slices to .len = 0 so freeSlice(result[i]) is a no-op.
    mem.set(result, []u8{});
    %defer freeSliceOfSlice(u8, result); // implicitly casting [][]u8 to #[]#[]u8 is ok
    var i: usize = 0;
    while (i < result.len) : (i += 1) {
        result[i] = ##(%return allocate(u8, "free me".len));
        mem.copy(u8, result[i], "free me");
    }
    return result; // implicitly casting [][]u8 to %#[]#[]u8 is ok
}
// the # in the parameter means this function takes ownership of the resource
fn freeSlice(comptime T: type, slice_resource: #[]T) {
    const slice = ##slice_resource;
    if (slice.len == 0) return;
    free(slice.ptr);
}
// the # in the parameter means this function takes ownership of the resource
fn freeSliceOfSlice(comptime T: type, slice_resource: #[]#[]T) {
    const slice = ##slice_resource;
    {var i: usize = 0; while (i < slice.len) : (i += 1) {
        freeSlice(T, slice[i]);
    }}
    freeSlice([]T, slice);
}
fn caller() -> %void {
    const result = %return freeThese(); // @typeOf(result) == #[]#[]u8
    const array = result defer freeSliceOfSlice(u8, result); // @typeOf(array) == [][]u8
    handleArray(array);
}
fn handleArray(array: []const[]const u8) {
    // now the #'s are gone, because they are handled outside this function.
}
The other idea I have is to have an error similar to compile-time detection of
For example:
fn allocate() -> #i32 {
    // ...
}
fn bar(x: #i32) {
    // ...
}
fn foo() {
    const resource = allocate();
    const n = ##resource;
    const n2 = ##resource; // error: `resource` not valid after first ##resource
    // ...
    bar(n);
    bar(n); // error: `n` not valid after implicitly casting it to #i32 on previous line
}
For |
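The "not valid after first ##resource" rule in the example above can be illustrated with a small run-time model. This is a Python sketch under stated assumptions, not the proposed Zig semantics: `Resource`, `unwrap`, and `ResourceError` are hypothetical names, and the exception stands in for what would be a compile error in the proposal. The second rule in the example (invalidating `n` after passing it to `bar`) is not modeled, because a plain Python integer cannot be invalidated.

```python
class ResourceError(Exception):
    """Raised where the proposal would report a compile error."""


class Resource:
    """Hypothetical stand-in for a value of type #T."""

    def __init__(self, value):
        self._value = value
        self._valid = True

    def unwrap(self):
        """Model of ##resource: hand the value to the caller exactly once."""
        if not self._valid:
            raise ResourceError("resource not valid after first unwrap")
        self._valid = False
        return self._value


def allocate():
    # Model of `fn allocate() -> #i32`.
    return Resource(42)


resource = allocate()
n = resource.unwrap()              # first unwrap succeeds
try:
    resource.unwrap()              # second unwrap is rejected
    second_unwrap_rejected = False
except ResourceError:
    second_unwrap_rejected = True
```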
@kyle-github, well the problem is mostly that there are resources/types that you want to manage, but they shouldn't be copied at all, or the copying requires special care (reference counting). After all, even copying a pointer address to another pointer variable and then freeing both is an error; it is a similar issue. |
@Ilariel can you give a code example? I'm not understanding this use case |
@andrewrk I think you hit on what I was thinking, but I was not going all the way to ownership and borrowing a la Rust. I think that with my proposal you actually get a lot of Rust's safety without having it always imposed. @Ilariel it is not a solution in the fully general sense. Hence my point about "most cases." If you need to really enforce and manage things to that level, then you would need C++-like control. At that point, why not use C++? |
@andrewrk, Taking your shared_ptr code example from #453
|
@kyle-github, because I've been looking for a "better C" where there is no weird preprocessor macro system, but a typesafe sane compile time way to do things. Zig seems promising to me, but it lacks a sane way to handle resources like C does. C++ or Rust RAII for handling resources is nice, but they aren't exactly what I am looking for. |
how do you handle resources in C? |
@andrewrk, I meant that both C and Zig (at least at the moment) lack a way to manage resources in a sane way. When you use a reference counting mechanism, you must always remember to increment and decrement at the proper places and so on. Assignment becomes hazardous because copying is not what you want. You have almost all the control you want but not really any builtin assistance in the language itself, only in external programs. |
@Ilariel As far as I can tell, the proposal(s) above only cover borrowing and direct ownership, not sharing. With Zig you are still on your own to remember to increment/decrement reference counts. I guess I do not see a difference for that case. The only ways I can see to handle things like ref counting transparently/automatically are to either 1) add it to the compiler (as in Swift or in some cases in Objective-C), or 2) allow some sort of action trigger/overloading of assignment (C++). At that point you are starting to introduce "magic" to the code. At least the C++ method does not force a particular model of management. For the case of mutexes, you can use the mechanism @andrewrk or I were proposing since the mutex itself is rarely shared among multiple users. Actually |
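To make the "you are on your own to increment/decrement" point concrete, here is a minimal manual reference count sketched in Python. It is an illustration only: `RefCounted`, `retain`, and `release` are hypothetical names, and nothing in it reflects an actual Zig API. Every additional owner must call `retain`, and every owner must call `release` exactly once; nothing enforces that pairing.

```python
class RefCounted:
    """Hypothetical manually counted wrapper; starts with one owner."""

    def __init__(self, value, cleanup):
        self.value = value
        self._cleanup = cleanup
        self.count = 1

    def retain(self):
        # A new owner must remember to call this.
        if self.count == 0:
            raise RuntimeError("retain after the value was freed")
        self.count += 1
        return self

    def release(self):
        # Every owner must remember to call this exactly once.
        if self.count == 0:
            raise RuntimeError("release after the value was freed")
        self.count -= 1
        if self.count == 0:
            self._cleanup(self.value)


freed = []
shared = RefCounted("buffer", freed.append)
alias = shared.retain()            # second owner, count is now 2
shared.release()                   # count drops to 1, nothing is freed yet
still_alive = (freed == [])
alias.release()                    # last owner gone, cleanup runs
```

If the `retain` call is forgotten, the first `release` frees the value while `alias` still refers to it; if a `release` is forgotten, the value leaks. Those are the two mistakes the comments above are concerned with.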
@kyle-github, I think that rather than being able to trigger something at assignment we should be able to disallow copying and make it an explicit action if it is desired. Like having the explicit @copyResource and @moveResource so that they can be handled properly and the intent is communicated to the reader. As for mutexes, yes, the method you proposed allows handling locks in a sane way, because you can create an interface where you return a resource representing the section where the lock was opened, and that resource has to be handled; if it is not, you get a compile-time error. |
@Ilariel I think I see what you're proposing now. Disallow normal copy to prevent accidental misuse of a resource. But then we still need a way to copy, and this is where your |
@andrewrk, yes, and with that we could most likely implement most use cases of RAII style resource management where it is appropriate in a way that doesn't conflict with Zig Zen or at least the compiler would tell you if you made a mistake. The copy disallowing part might have to be an annotation for a type because it should apply even if the struct used as a resource is unwrapped because the type itself inherently shouldn't be misused by copying it. |
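The "disallow normal copy, require an explicit copy or move" idea can be sketched as a run-time model in Python. All names here (`NoCopy`, `copy_resource`, `move_resource`) are hypothetical, and the signatures are an assumption: the thread writes `@copyResource(src, dst, with_function)`, whereas this sketch returns the destination instead of taking it as a parameter. Python also cannot forbid plain assignment, which only creates an alias, so the model covers the `copy` module only.

```python
import copy


class NoCopy:
    """Hypothetical type annotated as non-copyable."""

    def __init__(self, handle):
        self.handle = handle

    def __copy__(self):
        raise TypeError("implicit copy of a resource is not allowed")

    def __deepcopy__(self, memo):
        raise TypeError("implicit copy of a resource is not allowed")


def copy_resource(src, with_function):
    """Explicit copy: the caller states how the duplicate is produced."""
    return with_function(src)


def move_resource(src):
    """Explicit move: the destination takes the handle, the source gives it up."""
    dst = NoCopy(src.handle)
    src.handle = None
    return dst


a = NoCopy(handle=7)
try:
    copy.copy(a)                   # implicit copy is rejected
    implicit_copy_rejected = False
except TypeError:
    implicit_copy_rejected = True

b = copy_resource(a, lambda r: NoCopy(r.handle + 1))   # e.g. duplicating a handle
c = move_resource(a)                                   # `a` no longer owns anything
```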
@Ilariel thanks for the example. This helps a lot to understand what you are doing... One question, in the constructor, a static instance of I agree with your point about the annotation. Conceptually, unwrapping a |
@kyle-github, so this part?
Ideally we should be able to initialize the memory without a copy from temporary at all. First writing it to a temp variable and then moving it becomes verbose for no reason at this point. So either making a create function for TaggedData or some kind of struct access syntax like would solve that. e.g
|
@Ilariel yes, that is the part. I was thinking about this a little bit this morning (it is morning here). Are the Here's the reference counting example (heavily snipped for brevity):
Please excuse the handwaved parts. I think if you do this right you do not need the special functions. |
I do not get this proposal, but it seems to be some variant of C++'s smart pointers. A smart pointer is just one possible implementation of ownership management, rather low level and (in C++) quite confusing. Another option is to have containers which manage (or not) the lifetime of their items stored as pointers. This is arguably higher level, and IMHO more intuitive to think about. One project employing this technique is Ultimate++ (https://www.ultimatepp.org/), a C++ IDE with its own stdlib and its own unique approach to everything else. |
This is a simplistic reference counting wrapper (not at all complete). It is being used here as a way to show the need (or lack thereof) for special copy/move functions and control over whether a type is assignable or not. It is not a "real" implementation, just a driver for the discussion. There are several things discussed in this issue and perhaps they should be broken into their own issues:
|
@kyle-github, I think you forgot the "#" sigils, but I think everyone will get it. And yes, as you said, there might not be any reason for them to be builtins other than the names; after all, they wouldn't really do anything other than communicate the intent, I think.
@PavelVozenilek The smart pointers happened to just be the easiest "more complex" example. Containers and allocators could be used as an example too. |
To recap about the resource type and the # sigil. Is this correct? Resource type
|
Yet another resource management technique (for memory): an allocator which keeps track of unfreed blocks and shows them as errors. My implementation is described in #480. To extend it to other resources is (relatively) easy. No special syntax was needed, no rules to learn, and problematic allocations were identified precisely by file/line or even with a stack trace. A similar approach is used by the language Lobster (http://www.strlen.com/lobster/). Unfreed memory is reported at the end of the app run. The language Pony (https://www.ponylang.org/) has the opposite approach. It uses a very complicated system for ownership, invented to make multithreaded data races impossible. |
Having an allocator that tracks allocations for debugging and profiling is good, and it sure helps you debug your memory leaks. However, the resource type could turn some bugs and runtime crashes into compile errors, which would be closer to ideal, at least according to my interpretation of Zig Zen. Perhaps ideally we would use an allocator like yours as a resource which gets freed, and it would then report the improper usage. |
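A tracking allocator of the kind discussed in the two comments above can be sketched in a few lines. This Python model is an assumption-laden illustration, not the implementation from #480: `TrackingAllocator`, `alloc`, `free`, and `leaks` are hypothetical names, and "blocks" are just integer ids. It records the file and line of each allocation and reports whatever is still live.

```python
import traceback


class TrackingAllocator:
    """Hypothetical debug allocator that remembers where each block came from."""

    def __init__(self):
        self._live = {}            # block id -> (size, "file:line" of the alloc call)
        self._next_id = 0

    def alloc(self, size):
        # The last two stack entries are [caller, alloc]; keep the caller.
        caller = traceback.extract_stack(limit=2)[0]
        block_id = self._next_id
        self._next_id += 1
        self._live[block_id] = (size, f"{caller.filename}:{caller.lineno}")
        return block_id

    def free(self, block_id):
        if block_id not in self._live:
            raise ValueError(f"double free or unknown block {block_id}")
        del self._live[block_id]

    def leaks(self):
        """Blocks that were allocated but never freed."""
        return list(self._live.values())


allocator = TrackingAllocator()
a = allocator.alloc(16)
b = allocator.alloc(64)            # never freed
allocator.free(a)
leaks = allocator.leaks()          # one entry: the 64-byte block and its call site
```

The trade-off debated in the thread shows up here: the leak is only found when `leaks()` is called at run time, whereas the resource type aims to reject the program before it runs.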
I think the combination of the As far as I understand the Zen of Zig, this seems like a good balance. And, neither of these need to add any runtime overhead if I guess correctly. |
I don't get it, too many words. Type code + result, or code + error messages. Even if they are fake. I don't understand in what order I should read |
too clumsy |
This proposal is a counter-proposal to #158 and #473.
Introducing a new sigil: #. It works like &, %, [], and ? in that you can prefix a type and get a new type:
How it works is that it recognizes that a value transfers ownership. So you need syntax in order to deal with the resource appropriately:
You can use the ## prefix operator to "unwrap" the resource. This acknowledges that you will take ownership of the data and deal with cleaning it up properly.
The ## prefix operator is always safe and codegens to a no-op. It simply is a visual representation that the code manually deals with ownership of resources.
More often, however, you will use defer or %defer to handle resource cleanup. These keywords are extended to support unwrapping this type:
When you use these syntaxes, you are communicating that you have handled the cleanup of the resource appropriately.
When %defer is used, there is still a manual component to the resource management, when the function does not return from the block the %defer is in with an error. The resource type is only meant to aid in visibility of resource management. When you see ## in code, you should be wondering where the resource is cleaned up and trying to understand how the resource ownership is managed.
The ## operator can also inform Zig what might be the best thing to do when undefined behavior is encountered in ReleaseSafe mode (see #426).
Note: $ instead of # is OK. The # sigil was chosen based on likelihood of being present on international keyboards, but I have no actual data. Research needed.
Note: _ = foo() where foo returns a #T should not be allowed, and likewise if foo returns a %T.
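The difference between defer and %defer described in the proposal can be approximated in Python with `contextlib.ExitStack`. This is a sketch of the control flow only, not of the # type: `free`, `with_defer`, and `with_errdefer` are hypothetical names, and `pop_all()` is used as the usual idiom for cancelling registered cleanups on success.

```python
from contextlib import ExitStack

log = []


def free(name):
    log.append(f"free {name}")


def with_defer():
    # Model of `defer`: cleanup runs when the scope exits, on success or on error.
    with ExitStack() as scope:
        scope.callback(free, "a")
        return "ok"


def with_errdefer(fail):
    # Model of `%defer`: cleanup runs only if the scope is left with an error.
    with ExitStack() as scope:
        scope.callback(free, "b")
        if fail:
            raise RuntimeError("allocation failed")
        scope.pop_all()            # success: cancel the cleanup, the caller now owns "b"
        return "b"


with_defer()                       # frees "a" on the way out
owned = with_errdefer(False)       # nothing freed, the caller must clean up "b" later
try:
    with_errdefer(True)
except RuntimeError:
    pass                           # "b" was freed on the error path
```

The successful `with_errdefer(False)` call is the "manual component" the proposal mentions: when the function does not return with an error, ownership passes to the caller and nothing is cleaned up automatically.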