proposal: resource type #494

Closed
andrewrk opened this issue Sep 22, 2017 · 36 comments
Labels: breaking, proposal

@andrewrk
Member

This proposal is a counter-proposal to #158 and #473.

Introducing a new sigil: #. It works like &, %, [], and ? in that you can prefix a type and get a new type:

fn alloc(count: size_t) -> #[]u8 {
    // ...
}
fn free(x: []u8) {
    // ...
}

How it works is that it recognizes that a value transfers ownership. So you need syntax in order to deal with the resource appropriately:

fn foo() {
    const slice_resource = alloc(10);
    slice_resource[1] = 2; // error: resource deallocation not specified
}

You can use the ## prefix operator to "unwrap" the resource. This acknowledges that you will take
ownership of the data and deal with cleaning it up properly.

fn foo() {
    const slice = ##alloc(10);
    slice[1] = 2; // now it works
    free(slice);
}

The ## prefix operator is always safe and codegens to a no-op. It simply is a visual representation that the code manually deals with ownership of resources.

More often, however, you will use defer or %defer to handle resource cleanup. These keywords are extended to support unwrapping this type:

fn foo() {
    const slice = alloc(10) defer |s| free(s);
    slice[1] = 2; // now it works
}
fn foo() {
    const slice = alloc(10) %defer |s| free(s);
    slice[1] = 2; // now it works
}

When you use these syntaxes, you are communicating that you have handled the cleanup of the resource appropriately.
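For comparison, here is a minimal sketch of the same allocate-then-defer pattern in the Zig syntax that shipped later (where %return became try). The # type does not exist there, so nothing forces the defer line to be written. makeBuffer is a hypothetical name, and std.testing.allocator is used only because it reports leaks at the end of a test.

```zig
const std = @import("std");

// Hypothetical allocator function: the caller owns the returned slice,
// but the return type alone does not say so.
fn makeBuffer(allocator: std.mem.Allocator, count: usize) ![]u8 {
    const buf = try allocator.alloc(u8, count);
    @memset(buf, 0);
    return buf;
}

test "caller pairs the allocation with defer" {
    const allocator = std.testing.allocator;
    const slice = try makeBuffer(allocator, 10);
    // Omitting this line compiles today; the proposal would reject it.
    defer allocator.free(slice);
    slice[1] = 2;
    try std.testing.expectEqual(@as(u8, 2), slice[1]);
}
```

Under the proposal, makeBuffer would return #[]u8 and the defer would be the step that unwraps it.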

When %defer is used, part of the resource management is still manual: if the function leaves the block containing the %defer without an error, the cleanup does not run. The resource type is only meant to make resource management more visible. When you see ## in code, you should wonder where the resource is cleaned up and try to understand how its ownership is managed. ## can also inform Zig about the best thing to do when undefined behavior is encountered in ReleaseSafe mode (see #426).

Note: $ instead of # is OK. The # sigil was chosen based on likelihood of being present on international keyboards, but I have no actual data. Research needed.

Note: _ = foo() where foo returns a #T should not be allowed, and likewise if foo returns a %T.

@andrewrk added the breaking, enhancement, and proposal labels on Sep 22, 2017
@andrewrk added this to the 0.2.0 milestone on Sep 22, 2017
@thejoshwolfe
Contributor

thejoshwolfe commented Sep 23, 2017

Here's an actual real example of code that could benefit from the semantics in this proposal: https://github.com/thejoshwolfe/consoline/blob/master/consoline.h (search for "free").

One more aspect to consider is an example from the link above, where a function returns a disposable array of disposable strings. I'm not sure if/how this proposal relates to that usecase, but below is what I would write in Zig, and then I added some # sigils where I thought they made sense. Then I added comments about the type of things changing, which requires a lot of hand waving.

// the # in the result means this function is an allocator.
fn freeThese() -> %#[]#[]u8 {
    var result = %return allocate([3]#[]u8);
    // result starts with 0's, which means freeSlice(result[i]) starts out as a no-op.
    %defer freeSliceOfSlice(u8, result);
    {var i: usize = 0; while (i < result.len) : (i += 1) {
        result[i] = %return allocate(["free me".len]u8);
        mem.copy(u8, result[i], "free me");
    }}
    return result;
}

// the # in the parameter means this function is a deallocator.
fn freeSlice(comptime T: type, slice: #[]T) {
    if (slice.len == 0) return;
    free(slice.ptr);
    // at this point, the @typeOf(slice) is []T
}

// the # in the parameter means this function is a deallocator.
fn freeSliceOfSlice(comptime T: type, slice: #[]#[]T) {
    {var i: usize = 0; while (i < result.len) : (i += 1) {
        freeSlice(T, result[i]);
    }}
    // at this point, the @typeOf(slice) is #[][]T
    freeSlice([]T, result);
    // at this point, the @typeOf(slice) is [][]T
}

fn caller() -> %void {
    var array: #[]#[]u8 = %return freeThese();
    defer freeSliceOfSlice(u8, result);
    // at this point, the @typeOf(array) is [][]u8
    handleArray(array);
}

fn handleArray(array: []const[]const u8) {
    // now the #'s are gone, because they are handled outside this function.
}

The above is a simplification of the usecase from the link above where a function provides a list of autocomplete suggestions as a list of strings. In that usecase, it made sense to me that the strings and the list itself would all be deleted by the receiver of the suggestions.

The # sigil is starting to look inadequate compared to what I really wanted to express in the comments above. I wanted to be able to say "the type of this variable is something that has been deallocated, and so the pointer value is now undefined and a safety error to use." and "the type of this variable is now a newly allocated thing that will be cleaned up on error.".

All this analysis might be possible maybe, but also this is starting to look a lot like Rust's borrow checking. Is that where this is headed?
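As a point of reference, the "disposable array of disposable strings" case can be sketched in later Zig syntax without any # annotations. This is only an illustration under assumptions: it takes an explicit allocator parameter, tracks how many strings were filled instead of zero-initializing the slices, and uses errdefer (the later spelling of %defer). The ownership hand-off to the caller is visible only in the comments, which is the gap the proposal is aimed at.

```zig
const std = @import("std");

// Frees every inner string, then the outer slice.
fn freeSliceOfSlice(allocator: std.mem.Allocator, outer: [][]u8) void {
    for (outer) |inner| allocator.free(inner);
    allocator.free(outer);
}

// The caller owns the result and every string inside it.
fn freeThese(allocator: std.mem.Allocator) ![][]u8 {
    const result = try allocator.alloc([]u8, 3);
    errdefer allocator.free(result);

    var filled: usize = 0;
    // On error, release only the strings that were actually allocated.
    // errdefers run in reverse order, so this runs before the free above.
    errdefer {
        for (result[0..filled]) |inner| allocator.free(inner);
    }

    while (filled < result.len) : (filled += 1) {
        result[filled] = try allocator.dupe(u8, "free me");
    }
    return result;
}

test "receiver frees the list and every string" {
    const allocator = std.testing.allocator;
    const array = try freeThese(allocator);
    defer freeSliceOfSlice(allocator, array);
    try std.testing.expectEqual(@as(usize, 3), array.len);
    try std.testing.expectEqualStrings("free me", array[0]);
}
```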

@kyle-github

kyle-github commented Sep 23, 2017

I think this is quite interesting. The purpose of this is to help enforce handling of the "resource," right?

One could extend this a bit by having the resource allocation/creation function specify the default cleanup handler:

fn resource_alloc(count: size_t) -> #u8[] defer resource_cleanup {
 .... do stuff ...
}

fn resource_cleanup(data: #u8[])  {
   ....
}

Now when you use it, you get a form of RAII:

fn foo(bar:int) -> baz {
    const rsrc = resource_alloc(10); /* should handle errors */
    ...
    rsrc[3] = 9;
/* resource_cleanup() called on function exit just like an explicit defer */
}

In this case, resource_cleanup is automatically called. If you use your ## notation, then that is skipped.

I think most of the use cases will be using a "standard" cleanup function anyway, so you are going to be removing boilerplate most of the time. It does have the disadvantage that you now are starting to have some "magic" in the code, but in this case the idiom would be so common that I think it might be OK.

If the compiler can determine that you did not clean up something then it should be able to correct that when given a default cleanup function.

I am not thrilled about the syntax of where the defer is done, but it does keep it all in one place. I would keep the use of the sigil in both the allocator and the cleanup function for symmetry. The allocator returns a type of '#foo' so the cleanup function should take that as its argument.

@kyle-github

@thejoshwolfe, I am not sure what #[]#[]u8 would mean? How would you allocate that? If you were filling up an array of strings, it seems like you would need to strip off the "#" for the individual strings. I think that this does not really nest all that clearly. Would you have the string destructor inside the outer array destructor? If you did that, then you had to use '##' or something...

Note that I am waving my hands vigorously as to how this might be implemented. A simple function pointer prepended or appended to the underlying type would work though that starts to become a lot of wrapping when combined with ? and %.

@thejoshwolfe
Contributor

@kyle-github I updated my comment above with much more detail (again).

@kyle-github

kyle-github commented Sep 23, 2017

Sorry for my slowness here. It is a little more clear after the second edit. However, I am not tracking where "result" comes from in freeSliceOfSlice()... Copy/paste error?

In the allocator, you have this:

fn freeThese() -> %#[]#[]u8 {
    var result = %return allocate([3]#[]u8);
    // result starts with 0's, which means freeSlice(result[i]) starts out as a no-op.
    %defer freeSliceOfSlice(u8, result);
    {var i: usize = 0; while (i < result.len) : (i += 1) {
        result[i] = %return allocate(["free me".len]u8);
        mem.copy(u8, result[i], "free me");
    }}
    return result;
}

There are a couple things here I am missing :-(

  1. I guess I do not know what a slice looks like in memory. It seems like the line result[i] = %return allocate(["free me".len]u8); is going to overwrite the empty slice that allocate made. In freeSlice() something is freed. Does this work if slice.ptr is zero?
  2. In the allocator, the line I referenced above does not allocate a '#' type. How is that transformed?
  3. Assuming that allocate does set up the '#' type or the assignment to 'result[i]' sets that up, isn't it a violation to leave these hanging due to the %return in the inner allocation loop?

@thejoshwolfe
Contributor

@kyle-github responding to your "default cleanup handler", it's too contrary to the zen of zig to have hidden control flow, like RAII destructors or default function calls at block exit. The way an idea like that would fit into Zig is that instead of the compiler implicitly calling a function, the compiler would just give a compile error until you write the function call explicitly in your code.

kyle-github brings up a new point, which is that it might not be enough to free() a resource. You might need a specific deinit function based on the init function you used to get the resource. I noticed that std.ArrayList and std.Buffer use the init and deinit pattern. You don't just want to free an ArrayList; you need to deinit it, or else you'll leak the underlying buffer. This also means that the resource management concept extends beyond pointers and slices: byval objects like ArrayList can need resource management too (because they keep a pointer/slice behind the scenes).

So this leads to putting a comptime function pointer in the type of the object itself, which is going to be an awful lot of verbosity in the type.

pub fn ArrayList(comptime T: type) -> type{
    struct {
        const Self = this;
        ...
        pub fn init(allocator: &Allocator) -> @resource(Self, Self.deinit) {
            ...
        }
        ...
    }
}

@debugPrint(ArrayList(u8).init.return_type);
// @resource(ArrayList(u8), ArrayList(u8).deinit)

Again, this has a lot of associated hand waving.
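The init/deinit pairing for a byval object can be sketched in later Zig syntax as follows. ByteBuffer is a hypothetical type made up for illustration, not a standard library type; it stands in for anything that keeps a slice behind the scenes. Nothing in the type ties init to deinit, which is what the @resource(Self, Self.deinit) idea above would record.

```zig
const std = @import("std");

// Hypothetical by-value type that owns a heap slice behind the scenes.
const ByteBuffer = struct {
    allocator: std.mem.Allocator,
    bytes: []u8,

    fn init(allocator: std.mem.Allocator, len: usize) !ByteBuffer {
        return ByteBuffer{
            .allocator = allocator,
            .bytes = try allocator.alloc(u8, len),
        };
    }

    // The matching cleanup: only this function knows what init acquired.
    fn deinit(self: *ByteBuffer) void {
        self.allocator.free(self.bytes);
        self.* = undefined; // poison the value so later use is easier to spot
    }
};

test "init is paired with deinit, not with a bare free" {
    var buf = try ByteBuffer.init(std.testing.allocator, 16);
    defer buf.deinit();
    buf.bytes[0] = 42;
    try std.testing.expectEqual(@as(u8, 42), buf.bytes[0]);
}
```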

@kyle-github

@thejoshwolfe there is a slippery slope to constructors/destructors there :-)

The point about allowing a default defer was that Zig already does a number of things to make certain idioms really clean (like all the % handling, which I love!). By allowing the "constructor" function to be any arbitrary function and the destructor to be an arbitrary function, you still get the ability to simulate constructors/destructors if you want or to do other things such as simulating RAII or something akin to Python's with (with some gyrations).

I guess I assumed that the resource concept would extend past memory to include pretty much anything that has a setup and teardown requirement.

Perhaps a managed resource is different enough to make it a first class entity in Zig? To take @thejoshwolfe 's example, perhaps there is a type, similar to struct, called resource that has an init and de-init function? Very handwavy....

Or maybe something along the lines of Python's with as a new construct. Very explicit and clear what you mean to do.

@Ilariel

Ilariel commented Sep 23, 2017

While throwing a compile error for not handling the resource seems a good way to handle this within scope of Zig Zen, what about copying and moving said resources?

Should there be a @copyResource(src,dst,with_function) and similar move operation available?

Would ownership be tracked or would the resources be special immutable references/ptrs until you unwrap them?

@kyle-github

kyle-github commented Sep 23, 2017

Since this is getting a bit far afield, what are the use cases that @andrewrk is trying to cover?

Here was what I assumed:

  1. provide a mechanism for the compiler to enforce that some types of data must be explicitly handled.

How about something a little different:

  • Use the sigil # as @andrewrk mentioned to indicate that something needs explicit management.
  • A function that returns a managed type, has its return value type prefaced with a # sigil.
  • A function that consumes a managed type (for clean up) has the type of the parameter prefixed with a # sigil.
  • A function that does not return a managed type but takes one as an argument can explicitly state to the compiler that it does not handle the managed type by prefixing the type of the parameter with ##.
  • The compiler will make sure that all code paths from the point where the value is first declared or set end in a call to a consuming function.
  • Taking a managed value and assigning it to a non-managed value (or field within an array or struct) is an error.

fn initFoo(someArgs: aType) -> #foo  <---- indicates that the response must be explicitly managed.
{
...
}


fn cleanupFoo(aFoo: #foo) { <--- one sigil means it is consumed here
.... do whatever is needed to clean up a foo instance ...
}

... later in some client code....

fn myFunc(anotherArg: anotherType) -> #foo {
 ... something something something...
    const aFoo = initFoo(...);
    .... something ...
    myPassThroughFunc(aFoo);   <-- not a clean up function, so still must handle aFoo somewhere
    .... return aFoo; <-- still legal, now the caller has the responsibility to handle aFoo.
}

fn myPassThroughFunc(aFoo: ##foo) {   <---- two sigils means we do not clean it up.
    ... something ...
}


fn myMain(...) {
    const aFoo = myFunc(...);   <-- aFoo has a managed type attribute.
    const bFoo = myFunc(...);
    defer cleanupFoo(aFoo);   <-- all code paths after this for aFoo are good
    ... no defer for bFoo ...

    ....

   cleanupFoo(bFoo);  <-- direct call instead of via defer
}

Very, very handwaved, but I hope you can see the flow. As far as I can tell, this works with @thejoshwolfe 's example of init/de-init for buffers that are passed by value.

@kyle-github

Er, looking at this, probably better to reverse the meaning of ## and # because there will be a lot fewer consumer/clean up functions than users of a managed value.

@kyle-github

This is a bit speculative since I do not know how Zig is implemented in LLVM, but it looks like a few simple additions to compiler checking make the check for whether a managed resource is handled or not simple:

  • Any function that takes a managed parameter and calls a consumer function must annotate its incoming parameter to show that it consumes the param. I.e. if function A calls B calls C and C consumes a parameter that is passed through all the functions, A and B must show that they are consumers too. A non-consuming function that calls a consuming function is an error.
  • Because of the first point, all management checking needs to only check the following things:
    -- If the value is returned, then the function's caller takes responsibility.
    -- if the value is explicitly passed to a consumer function, then the value is handled and any use of the value after that point is an error.
    -- if the value is passed to a consumer function via defer, then the value is handled.
  • The compiler must check to see that any non-handled parameters are only passed to functions that are not consumers.
  • Once you pass to a consumer the whole call chain from that point must be consuming for that value.

I think that this means that all managed/not managed checking is local to a function.

@kyle-github

@Ilariel, if we take the proposal that I am making and you implement your structs using the pimpl idiom from C++, then the direct copy should be sufficient in most cases (this is what it was designed to get around in C++). The shared parts are all in the implementation part and thus are automatically copied.

Of course, that can lead down the rat hole of thinking about things like copy constructors if you are not careful.

@andrewrk
Member Author

andrewrk commented Sep 23, 2017

I'm responding to #494 (comment) and I haven't read anything below that yet.

Kotlin has ideas about types of things changing based on assertions and if statements and things. I'm not sure we want to go down that road. But I think you hit on something important with the resource deallocation function taking the # type.

I also think ## and the special defers should recursively remove all # from the type, and you can always implicitly cast non-# types to #-types.

Here are my proposed edits:

fn freeThese() -> %#[]#[]u8 {
    const result = ##(%return allocate([]u8, 3));
    // set all slices to .len = 0 so freeSlice(result[i]) is a no-op.
    mem.set(result, []u8{});
    %defer freeSliceOfSlice(u8, result); // implicitly casting [][]u8 to #[]#[]u8 is ok

    var i: usize = 0;
    while (i < result.len) : (i += 1) {
        result[i] = ##(%return allocate(u8, "free me".len));
        mem.copy(u8, result[i], "free me");
    }
    return result; // implicitly casting [][]u8 to %#[]#[]u8 is ok
}

// the # in the parameter means this function takes ownership of the resource
fn freeSlice(comptime T: type, slice_resource: #[]T) {
    const slice = ##slice_resource;
    if (slice.len == 0) return;
    free(slice.ptr);
}

// the # in the parameter means this function takes ownership of the resource
fn freeSliceOfSlice(comptime T: type, slice_resource: #[]#[]T) {
    const slice = ##slice_resource;
    {var i: usize = 0; while (i < slice.len) : (i += 1) {
        freeSlice(T, slice[i]);
    }}
    freeSlice([]T, slice);
}

fn caller() -> %void {
    const result = %return freeThese(); // @typeOf(result) == #[]#[]u8
    const array = result defer freeSliceOfSlice(u8, result); // @typeOf(array) == [][]u8

    handleArray(array);
}

fn handleArray(array: []const[]const u8) {
    // now the #'s are gone, because they are handled outside this function.
}

The other idea I have is to have an error similar to compile-time detection of
use of undefined value once a value is used in any of the following ways:

  • prefixed with ##
  • cast to #T
  • used with the special defer syntax

For example:

fn allocate() -> #i32 {
    // ...
}

fn bar(x: #i32) {
    // ...
}

fn foo() {
    const resource = allocate();
    const n = ##resource;

    const n2 = ##resource; // error: `resource` not valid after first ##resource

    // ...

    bar(n);

    bar(n); // error: `n` not valid after implicitly casting it to #i32 on previous line
}

For const variables this is pretty easy to detect. For var variables it requires
value tracking similar to detecting use of undefined at compile-time.

@Ilariel

Ilariel commented Sep 23, 2017

@kyle-github, well the problem is mostly that there are resources/types that you want to manage, but they shouldn't be copied at all or the copying requires special care (reference counting). After all, even copying a pointer address to another pointer variable and then freeing both is an error; it is a similar issue.

@andrewrk
Member Author

Should there be a @copyResource(src,dst,with_function) and similar move operation available?

@Ilariel can you give a code example? I'm not understanding this use case

@kyle-github

@andrewrk I think you hit on what I was thinking, but I was not going all the way to ownership and borrowing a la Rust. I think that with my proposal you actually get a lot of Rust's safety without having it always imposed.

@Ilariel it is not a solution in the fully general sense. Hence my point about "most cases." If you need to really enforce and manage things to that level, then you would need C++-like control. At that point, why not use C++?

@Ilariel

Ilariel commented Sep 23, 2017

@andrewrk, Taking your shared_ptr code example from #453

const mem = @import("std").mem;

fn RefCounted(comptime T: type) -> type {
    struct {
        const Self = this;

        const TaggedData = struct {
            data: T,
            ref_count_ptr: &usize,
            allocator: &mem.Allocator,
        };

        tagged_data_ptr: &TaggedData,

        pub fn create(allocator: &mem.Allocator) -> %Self {
            const ref_count_ptr = %return allocator.create(usize);
            %defer allocator.free(ref_count_ptr);

            const tagged_data_ptr = %return allocator.create(TaggedData);
            %defer allocator.free(tagged_data_ptr);

            *ref_count_ptr = 1;

            *tagged_data_ptr = TaggedData {
                .data = undefined,
                .ref_count_ptr = ref_count_ptr,
                .allocator = allocator,
            };

            Self {
                .tagged_data_ptr = tagged_data_ptr,
            }
        }

        pub fn deinit(self: &Self) {
            *self.tagged_data_ptr.ref_count_ptr -= 1;
            if (*self.tagged_data_ptr.ref_count_ptr == 0) {
                const allocator = self.tagged_data_ptr.allocator;
                const ref_count_ptr = self.tagged_data_ptr.ref_count_ptr;
                allocator.free(self.tagged_data_ptr);
                allocator.free(ref_count_ptr);
            }
        }

        /// A call to strongRef should be followed by a defer which
        /// calls deinit() on the result.
        pub fn strongRef(self: &Self) -> Self {
            *self.tagged_data_ptr.ref_count_ptr += 1;
            Self {
                .tagged_data_ptr = self.tagged_data_ptr,
            }
        }

        pub fn weakRef(self: &Self) -> &T {
            &self.tagged_data_ptr.data
        }
    }
}

const debug = @import("std").debug;
fn hello(accidentalCopy : #RefCounted(i32)) {
    //Now we have two ptrs that have reference count of 1 pointing to the int
}

fn copy_refcounted(comptime T: type, source : &RefCounted(T) , destination : &RefCounted(T))  {
 *destination = (*source).strongRef();
}


fn fixed(proper : &#RefCounted(i32)) { // allow pointer
    var ref_count : #RefCounted(i32) = undefined;
    @copyResource(&proper,&ref_count,copy_refcounted(i32));
    //Now we have two ptrs that have reference count of 2 pointing to the int
}

test "ref counted" {
    var ref_counted_int = %%RefCounted(i32).create(&debug.global_allocator);
    defer ref_counted_int.deinit();
    //hello(ref_counted_int);
    fixed(ref_counted_int);
    (*ref_counted_int.weakRef()) = 1234;
}

@Ilariel

Ilariel commented Sep 23, 2017

@kyle-github, because I've been looking for a "better C" where there is no weird preprocessor macro system, but a typesafe sane compile time way to do things. Zig seems promising to me, but it lacks a sane way to handle resources like C does. C++ or Rust RAII for handling resources is nice, but they aren't exactly what I am looking for.

@andrewrk
Member Author

it lacks a sane way to handle resources like C does

how do you handle resources in C?

@Ilariel

Ilariel commented Sep 23, 2017

@andrewrk, I meant that both C and Zig (at least at the moment) lack a way to manage resources in a sane way.
At the moment it is this in both:
allocate/acquire, use, free/release or so. You must remember to unlock all the mutexes and other locks, close the files and whatever else you opened, and release all the memory with the correct function depending on where the memory came from, e.g. allocator, malloc, mmap, etc.

When you are using a reference counting mechanism you must always remember to increment and decrement at the proper places and so on. Assignment becomes hazardous because copying is not what you want. You have almost all the control you want, but no built-in assistance in the language itself, only in external programs.
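The copy hazard described here can be made concrete with a stripped-down reference-counted box in later Zig syntax. This is a hypothetical sketch, not the shared_ptr example from #453: it keeps the count next to the value, is not thread-safe, and exists only to show that a plain assignment bypasses the count while an explicit strongRef does not.

```zig
const std = @import("std");

// Hypothetical minimal reference-counted box; not thread-safe.
fn RefCounted(comptime T: type) type {
    return struct {
        const Self = @This();
        const Inner = struct { value: T, count: usize };

        allocator: std.mem.Allocator,
        inner: *Inner,

        fn create(allocator: std.mem.Allocator, value: T) !Self {
            const inner = try allocator.create(Inner);
            inner.* = .{ .value = value, .count = 1 };
            return Self{ .allocator = allocator, .inner = inner };
        }

        // The explicit "copy": bumps the count and hands back a new owner.
        fn strongRef(self: Self) Self {
            self.inner.count += 1;
            return self;
        }

        // Drops one owner; the last owner frees the shared storage.
        fn deinit(self: Self) void {
            self.inner.count -= 1;
            if (self.inner.count == 0) self.allocator.destroy(self.inner);
        }
    };
}

test "explicit strongRef keeps the count honest" {
    const a = try RefCounted(i32).create(std.testing.allocator, 1234);
    defer a.deinit();
    const b = a.strongRef();
    defer b.deinit();
    try std.testing.expectEqual(@as(usize, 2), a.inner.count);
    // `const c = a;` would also compile, silently creating a third owner
    // without touching the count. That is the accidental copy at issue.
}
```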

@kyle-github

kyle-github commented Sep 23, 2017

@Ilariel As far as I can tell, the proposal(s) above only cover borrowing and direct ownership, not sharing. With Zig you are still on your own to remember to increment/decrement reference counts. I guess I do not see a difference for that case.

The only ways I can see to handle things like ref counting transparently/automatically are to either 1) add it to the compiler (as in Swift or in some cases in Objective-C), or 2) allow some sort of action trigger/overloading of assignment (C++). At that point you are starting to introduce "magic" to the code. At least the C++ method does not force a particular model of management.

For the case of mutexes, you can use the mechanism @andrewrk or I were proposing since the mutex itself is rarely shared among multiple users. Actually defer covers that case pretty well already.

@Ilariel

Ilariel commented Sep 23, 2017

@kyle-github, I think that rather than being able to trigger something at assignment we should be able to disallow copying and make it an explicit action if it is desired. Like having the explicit @copyResource and @moveResource so that they can be handled properly and the intent is communicated to the reader.

As for mutexes, yes, the method you proposed allows handling locks in a sane way, because you can create an interface where you return a resource representing the section where the lock was opened. That resource has to be handled, and if it is not you get a compile-time error.

@andrewrk
Member Author

@Ilariel I think I see what you're proposing now. Disallow normal copy to prevent accidental misuse of a resource. But then we still need a way to copy, and this is where your @copyResource and @moveResource come into play. It's a lot like a copy constructor and a move constructor, but it is explicit.

@Ilariel

Ilariel commented Sep 23, 2017

@andrewrk, yes, and with that we could most likely implement most use cases of RAII style resource management where it is appropriate in a way that doesn't conflict with Zig Zen or at least the compiler would tell you if you made a mistake. The copy disallowing part might have to be an annotation for a type because it should apply even if the struct used as a resource is unwrapped because the type itself inherently shouldn't be misused by copying it.

@kyle-github

@Ilariel thanks for the example. This helps a lot to understand what you are doing...

One question, in the constructor, a static instance of TaggedData is created and then auto-copied over the memory that was allocated. I assume that this is a case where @moveResource would be used instead?

I agree with your point about the annotation. Conceptually, unwrapping a % type is narrowing the possible values. So is unwrapping a ? type. However, for a non-copyable type, there is no unwrapping to do.

@Ilariel

Ilariel commented Sep 24, 2017

@kyle-github, so this part?

*tagged_data_ptr = TaggedData {
     .data = undefined,
     .ref_count_ptr = ref_count_ptr,
     .allocator = allocator,
};

Ideally we should be able to initialize the memory without a copy from a temporary at all. First writing it to a temp variable and then moving it becomes verbose for no reason at this point. So either making a create function for TaggedData or some kind of struct access syntax would solve that, e.g.

(*tagged_data_ptr) { //Deref pointer and access struct fields
    .data = undefined,
    .ref_count_ptr = ref_count_ptr,
    .allocator = allocator,
};

@kyle-github

@Ilariel yes, that is the part.

I was thinking about this a little bit this morning (it is morning here). Are the @copyResource and @moveResource actually needed as special built-in functions?

Here's the reference counting example (heavily snipped for brevity):

const mem = @import("std").mem;

fn RefCounted(comptime T: type) -> noassign type {  <--- note the # to denote a managed resource
    struct noassign {   <----- like packed etc.   
        const Self = this;

        const TaggedData = struct {
            const TD = this;
            data: T,
            ref_count_ptr: &usize,
            allocator: &mem.Allocator,

            fn inc(self: &TD) -> &TD {
                 *self.ref_count_ptr += 1;
            }

            fn dec(self: &TD) -> &TD {
                .... dec and delete ...
            } 
        };

        tagged_data_ptr: &TaggedData,

        pub fn create(allocator: &mem.Allocator) -> %Self {
            const ref_count_ptr = %return allocator.create(usize);
            %defer allocator.free(ref_count_ptr);

            const tagged_data_ptr = %return allocator.create(TaggedData);
            %defer allocator.free(tagged_data_ptr);

            *ref_count_ptr = 1;

            // *tagged_data_ptr = TaggedData {
            //    .data = undefined,
            //    .ref_count_ptr = ref_count_ptr,
            //    .allocator = allocator,
            // };

            Self.move(tagged_data_ptr, &TaggedData { 
                   .data = undefined,
                   .ref_count_ptr = ref_count_ptr,
                   .allocator = allocator,
            });

            Self {
                .tagged_data_ptr = tagged_data_ptr,
            }
        }

        pub fn deinit(self: &Self) {
            .... snip ....        
       }

        pub fn strongRef(self: &Self) -> Self {
             .... snip ....
        }

        pub fn weakRef(self: &Self) -> &T {
            &self.tagged_data_ptr.data
        }

        // function to move/swap 
        pub fn move(self: &Self, other: &Self) {
               tmp: Self = undefined;

               tmp.tagged_data_ptr = self.tagged_data_ptr;
               *other.tagged_data_ptr = tmp.tagged_data_ptr;
        }
        pub fn copy(self: &Self, other: &Self) {
               // do increment of ref count
               *self.tagged_data_ptr = *other.tagged_data_ptr.inc();
        }
    }
}

const debug = @import("std").debug;
fn hello(accidentalCopy : #RefCounted(i32)) {
    //Now we have two ptrs that have reference count of 1 pointing to the int
}


fn fixed(proper : &#RefCounted(i32)) { // allow pointer
    var ref_count : #RefCounted(i32) = undefined;
    ref_count.copy(proper); <--- increments the count while copying.
    //Now we have two ptrs that have reference count of 2 pointing to the int
}

test "ref counted" {
    var ref_counted_int = %%RefCounted(i32).create(&debug.global_allocator);
    defer ref_counted_int.deinit();
    //hello(ref_counted_int);
    fixed(ref_counted_int);
    (*ref_counted_int.weakRef()) = 1234;
}

Please excuse the handwaved parts. I think if you do this right you do not need the special functions.

@PavelVozenilek

PavelVozenilek commented Sep 24, 2017

I do not get this proposal, but it seems to be some variant of C++'s smart pointers. A smart pointer is just one of the possible implementations of ownership management, rather low level and (in C++) quite confusing.

Another option is to have containers which manage (or not) lifetime of their items stored as pointers. This is arguably higher level, and IMHO more intuitive to think about. One project employing this technique is Ultimate++ ( https://www.ultimatepp.org/ ), C++ IDE with their own stdlib and their own unique approach to everything else.

@kyle-github

This is a simplistic reference counting wrapper (not at all complete). It is being used here as a way to show the need (or lack thereof) for special copy/move functions and control over whether a type is assignable or not. It is not a "real" implementation, just a driver for the discussion.

There are several things discussed in this issue and perhaps they should be broken into their own issues:

  1. Use of the # sigil to mark types requiring specific management. This started with @andrewrk's proposal and has since diverged in other directions.
  2. The need to control whether a type (or value?) can be assigned or not.

@Ilariel

Ilariel commented Sep 24, 2017

@kyle-github, I think you forgot the "#" sigils, but I think everyone will get it. And yes, as you said, there might be no reason for them to be builtins other than the names; after all, they wouldn't really do anything other than communicate intent, I think.

@PavelVozenilek The smart pointers happened to just be the easiest "more complex" example. Containers and allocators could be used as an example too.

@Ilariel

Ilariel commented Sep 24, 2017

To recap the resource type and the # sigil: is this correct?

Resource type

  • #T is a managed resource of type T.
  • Constants and variables of type T can always be cast to #T to create a managed resource.
  • A function that takes a #T as a parameter takes ownership of the resource and consumes it.
  • A #T has to be consumed by a function call or by a defer statement using a function that consumes a #T, or it can be unwrapped with the ## prefix to produce a T, which you are then responsible for managing.
  • A defer statement removes the resource type from the type.
fn freeThese() -> %#[]#[]u8 {
    const result = ##(%return allocate([]u8, 3));
    // set all slices to .len = 0 so freeSlice(result[i]) is a no-op.
    mem.set(result, []u8{});
    %defer freeSliceOfSlice(u8, result); // implicitly casting [][]u8 to #[]#[]u8 is ok

    var i: usize = 0;
    while (i < result.len) : (i += 1) {
        result[i] = ##(%return allocate(u8, "free me".len));
        mem.copy(u8, result[i], "free me");
    }
    return result; // implicitly casting [][]u8 to %#[]#[]u8 is ok
}

// the # in the parameter means this function takes ownership of the resource
fn freeSlice(comptime T: type, slice_resource: #[]T) {
    const slice = ##slice_resource;
    if (slice.len == 0) return;
    free(slice.ptr);
}

// the # in the parameter means this function takes ownership of the resource
fn freeSliceOfSlice(comptime T: type, slice_resource: #[]#[]T) {
    const slice = ##slice_resource;
    {var i: usize = 0; while (i < slice.len) : (i += 1) {
        freeSlice(T, slice[i]);
    }}
    freeSlice([]T, slice);
}

fn caller() -> %void {
    const result = %return freeThese(); // @typeOf(result) == #[]#[]u8
    const array = result defer freeSliceOfSlice(u8, result); // @typeOf(array) == [][]u8

    handleArray(array);
}

fn handleArray(array: []const[]const u8) {
    // now the #'s are gone, because they are handled outside this function.
}

fn allocate() -> #i32 {
    // ...
}

fn bar(x: #i32) {
    // ...
}

fn foo() {
    const resource = allocate();
    const n = ##resource; // You have to manage this yourself now

    const n2 = ##resource; // error: `resource` not valid after first ##resource
}
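
The "not valid after first ##resource" rule is close to how move semantics work in Rust, so a rough analogue can be sketched there. This is only an illustration of the idea, not the proposed Zig feature: `Resource`, `new`, and `unwrap` are hypothetical names, and the value 42 is arbitrary.

```rust
/// Hypothetical stand-in for the proposed `#T`: a value that must be
/// explicitly unwrapped before it can be used.
struct Resource<T>(T);

impl<T> Resource<T> {
    /// Wraps a plain value, like casting `T` to `#T`.
    fn new(value: T) -> Self {
        Resource(value)
    }

    /// Plays the role of the `##` prefix: takes `self` by value, so the
    /// wrapper cannot be used again after this call.
    fn unwrap(self) -> T {
        self.0
    }
}

/// Like `allocate() -> #i32` above; the value 42 is arbitrary.
fn allocate() -> Resource<i32> {
    Resource::new(42)
}

/// Mirrors `foo` above.
fn foo() -> i32 {
    let resource = allocate();
    let n = resource.unwrap(); // the caller manages `n` from here on
    // let n2 = resource.unwrap(); // rejected by the compiler: `resource` was moved
    n
}
```

One difference from the recap above: Rust would not complain if `resource` were never unwrapped at all, whereas the proposal requires every #T to be consumed.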

@PavelVozenilek

PavelVozenilek commented Sep 24, 2017

Yet another resource management technique (for memory) is an allocator which keeps track of unfreed blocks and reports them as errors. My implementation is described in #480. Extending it to other resources is (relatively) easy.

No special syntax was needed, there were no rules to learn, and problematic allocations were identified precisely by file/line or even with a stack trace.
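
The bookkeeping behind such a tracking allocator can be sketched roughly as follows. This is not the implementation from #480, whose details are not shown here; it is a hypothetical model written in Rust that records only the allocation site of each live block and does not allocate any real memory. `Tracker` and its methods are made-up names.

```rust
use std::collections::HashMap;

/// Hypothetical tracker: records where each live block was allocated.
struct Tracker {
    next_id: usize,
    live: HashMap<usize, (&'static str, u32)>, // id -> (file, line)
}

impl Tracker {
    fn new() -> Self {
        Tracker { next_id: 0, live: HashMap::new() }
    }

    /// Registers an allocation made at `file:line` and returns its id.
    fn alloc(&mut self, file: &'static str, line: u32) -> usize {
        let id = self.next_id;
        self.next_id += 1;
        self.live.insert(id, (file, line));
        id
    }

    /// Marks a block as freed; returns false for an unknown or double free.
    fn free(&mut self, id: usize) -> bool {
        self.live.remove(&id).is_some()
    }

    /// Returns the sites of blocks never freed, sorted for stable output.
    fn leaks(&self) -> Vec<(&'static str, u32)> {
        let mut sites: Vec<_> = self.live.values().copied().collect();
        sites.sort();
        sites
    }
}
```

In real use the call sites would pass `file!()` and `line!()`, and `leaks()` would be checked at the end of the run. Everything here is detected at runtime, which is the trade-off against a compile-time check.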

A similar approach is used by the language Lobster ( http://www.strlen.com/lobster/ ). Unfreed memory is reported at the end of the app run.


The language Pony ( https://www.ponylang.org/ ) takes the opposite approach. It uses a very complicated ownership system, invented to make multithreaded data races impossible.

@Ilariel

Ilariel commented Sep 25, 2017

Having an allocator that tracks allocations for debugging and profiling is good, and it certainly helps you debug memory leaks. However, the resource type could turn some bugs and runtime crashes into compile errors, which would be preferable, at least according to my interpretation of the Zig Zen.

Perhaps ideally we would use an allocator like yours as a resource which gets freed, and it would then report any improper usage.

@kyle-github

I think the combination of the # checking and preventing assignment (that seems a lot more handwaved right now) would be a good base. It does not force anything, but will allow the compiler to catch 90% of the most common error: errors of omission.

As far as I understand the Zen of Zig, this seems like a good balance. And, neither of these need to add any runtime overhead if I guess correctly.

@andrewrk andrewrk modified the milestones: 0.2.0, 0.3.0 Oct 19, 2017
@pluto439

I don't get it, too many words.

Type code + result, or code + error messages. Even if they are fake.

I don't understand in what order I should read const slice = alloc(10) defer |s| free(s);, my eyes just zigzag.

@andrewrk
Member Author

too clumsy
