use case: shared_ptr and unique_ptr from C++ #453

Closed
andrewrk opened this issue Sep 9, 2017 · 10 comments
@andrewrk
Member

andrewrk commented Sep 9, 2017

Let's talk about the use case of shared_ptr and unique_ptr and what zig code would look like for the use cases where you would use these C++ features.

Inspired by this mailing list post: https://groups.google.com/forum/#!topic/ziglang/avaiyTOPcxM

@andrewrk andrewrk added this to the 0.2.0 milestone Sep 9, 2017
@andrewrk
Member Author

andrewrk commented Sep 9, 2017

related to #287

@kyle-github

The way that I tried to reduce the amount of "verbiage" in my C code was to keep the ref count in the object itself. I do this for generic memory blobs by prepending a refcount struct before the memory I allocate. Then, I have a special allocator that allocates memory to be refcounted and takes a size and a destructor.

C code here: refcount.c

(Slightly older version than what I am working on now with strong and weak refs, but nice and simple and hopefully easy to understand.)

I have to explicitly increment or decrement the ref count when I pass around these pointers.
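As a rough illustration of the scheme described above (a header holding the count and a destructor, placed just before the user-visible memory), here is a minimal sketch. It is not the linked refcount.c: the names `RcHeader`, `rc_alloc`, `rc_inc`, `rc_dec`, and `rc_demo` are hypothetical, and the count is a plain integer, so the sketch assumes single-threaded use.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical header stored immediately before the user-visible memory.
// Max alignment keeps the bytes after the header suitably aligned.
struct alignas(std::max_align_t) RcHeader {
    std::size_t count;
    void (*destructor)(void *);
};

// Allocate `size` bytes of ref-counted memory; the count starts at 1.
void *rc_alloc(std::size_t size, void (*destructor)(void *)) {
    void *raw = std::malloc(sizeof(RcHeader) + size);
    if (raw == nullptr) return nullptr;
    RcHeader *header = static_cast<RcHeader *>(raw);
    header->count = 1;
    header->destructor = destructor;
    return header + 1;  // user memory begins right after the header
}

// Recover the header from a pointer returned by rc_alloc.
static RcHeader *rc_header(void *data) {
    return static_cast<RcHeader *>(data) - 1;
}

// Explicit increment: call whenever another holder keeps the pointer.
void *rc_inc(void *data) {
    rc_header(data)->count += 1;
    return data;
}

// Explicit decrement: runs the destructor and frees on the last release.
void rc_dec(void *data) {
    RcHeader *header = rc_header(data);
    header->count -= 1;
    if (header->count == 0) {
        if (header->destructor != nullptr) header->destructor(data);
        std::free(header);
    }
}

static int g_destroyed = 0;  // demo-only counter of destructor calls
static void count_destroy(void *) { g_destroyed += 1; }

// Demo: two holders release in turn; returns the number of destructor calls.
int rc_demo() {
    g_destroyed = 0;
    int *value = static_cast<int *>(rc_alloc(sizeof(int), count_destroy));
    assert(value != nullptr);
    *value = 1234;
    int *alias = static_cast<int *>(rc_inc(value));
    rc_dec(value);  // count 2 -> 1, memory stays alive
    assert(*alias == 1234);
    assert(g_destroyed == 0);
    rc_dec(alias);  // count 1 -> 0, destructor runs, memory is freed
    return g_destroyed;
}
```

Every `rc_inc` has to be matched by hand with an `rc_dec`, which is the bookkeeping the comment above refers to.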

Now that I am thinking about it, I can see how to leverage defer() to do some of this if I implemented it more like C++...

C++'s implementation is vastly more complicated under the hood because it does not change the object being refcounted. So, it relies heavily on overloading assignment and doing a fair amount of invisible bookkeeping behind the scenes. RAII gives the last little push that makes using shared_ptr (this is the one I really care about) really simple for the end user.
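For comparison, a small sketch of the C++ behavior being described: the count is kept by the library rather than in a field of the object, copying a `std::shared_ptr` increments it, and leaving a scope decrements it. `Resource` and `shared_ptr_counts` are illustrative names, not part of any code in this thread.

```cpp
#include <cassert>
#include <memory>

// Plain type with no ref-count field; the count is managed by shared_ptr.
struct Resource {
    int value = 0;
};

// Returns the use count left after an inner copy has gone out of scope.
long shared_ptr_counts() {
    std::shared_ptr<Resource> first = std::make_shared<Resource>();
    assert(first.use_count() == 1);
    {
        std::shared_ptr<Resource> second = first;  // copy increments the count
        second->value = 1234;
        assert(first.use_count() == 2);
    }  // `second` is destroyed here, which decrements the count
    assert(first->value == 1234);
    return first.use_count();
}
```

None of the increments or decrements appear in the source, which is both the convenience and the "magic" discussed here.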

For my own uses, C++'s implementation is slick as an end user, but there is so much extra stuff going on under the hood that it is a bit concerning. I do not like magic happening unless it is really clear what is going on. My own C implementation is probably pushing the limits of how "magic" I want to go.

I do not mind modifying the allocator as I do in my code. That is fine. It just "feels" like there is a way to make this even cleaner to use in Zig.

@andrewrk
Member Author

andrewrk commented Sep 9, 2017

Here's an idea that works with status quo zig:

const mem = @import("std").mem;

fn RefCounted(comptime T: type) -> type {
    struct {
        const Self = this;

        const TaggedData = struct {
            data: T,
            ref_count_ptr: &usize,
            allocator: &mem.Allocator,
        };

        tagged_data_ptr: &TaggedData,

        pub fn create(allocator: &mem.Allocator) -> %Self {
            const ref_count_ptr = %return allocator.create(usize);
            %defer allocator.free(ref_count_ptr);

            const tagged_data_ptr = %return allocator.create(TaggedData);
            %defer allocator.free(tagged_data_ptr);

            *ref_count_ptr = 1;

            *tagged_data_ptr = TaggedData {
                .data = undefined,
                .ref_count_ptr = ref_count_ptr,
                .allocator = allocator,
            };

            Self {
                .tagged_data_ptr = tagged_data_ptr,
            }
        }

        pub fn deinit(self: &Self) {
            *self.tagged_data_ptr.ref_count_ptr -= 1;
            if (*self.tagged_data_ptr.ref_count_ptr == 0) {
                const allocator = self.tagged_data_ptr.allocator;
                const ref_count_ptr = self.tagged_data_ptr.ref_count_ptr;
                allocator.free(self.tagged_data_ptr);
                allocator.free(ref_count_ptr);
            }
        }

        /// A call to strongRef should be followed by a defer which
        /// calls deinit() on the result.
        pub fn strongRef(self: &Self) -> Self {
            *self.tagged_data_ptr.ref_count_ptr += 1;
            Self {
                .tagged_data_ptr = self.tagged_data_ptr,
            }
        }

        pub fn weakRef(self: &Self) -> &T {
            &self.tagged_data_ptr.data
        }
    }
}

const debug = @import("std").debug;

test "ref counted" {
    var ref_counted_int = %%RefCounted(i32).create(&debug.global_allocator);
    defer ref_counted_int.deinit();

    (*ref_counted_int.weakRef()) = 1234;
}

@kyle-github

Interesting. I am not up to speed enough on Zig to get all of this, but I think I understand most of it.

The one part that is throwing me a bit is the line:
const Self = this;
This looks like a value assignment, but later it looks like Self is a type.

I am also not sure why you have an anonymous struct wrapping everything in the definition. I will need to keep reading the examples :-)

@andrewrk
Member Author

this refers to the thing in the most immediate scope. In this case it refers to the anonymous struct, which is why Self can be used as a type afterwards. TODO put docs here http://ziglang.org/documentation/#this

> I am also not sure why you have an anonymous struct wrapping everything in the definition. I will need to keep reading the examples :-)

This is how generics are done - we execute the RefCounted function, which returns an anonymous struct using the comptime parameters. One could have also put return struct { ... }; in the function definition.

@kyle-github

Nice!

@Ilariel

Ilariel commented Sep 21, 2017

Rather than discussing use cases I'll use @andrewrk's example to address some concerns of mine about shared_ptrs and resource management in zig.

Current situation:

Currently, based on what I have understood, the only way to use these shared_ptrs would be paired "create and deinit" calls:

test "ref counted" {
    var ref_counted_int = %%RefCounted(i32).create(&debug.global_allocator);
    defer ref_counted_int.deinit();

    (*ref_counted_int.weakRef()) = 1234;
}

This forces us to always defer, and if we miss it we get bugs.

Incrementing can be done rather easily by wrapping the increment into calls such as create, but decrementing and freeing seem harder.

    /// A call to strongRef should be followed by a defer which
    /// calls deinit() on the result.
    pub fn strongRef(self: &Self) -> Self {
        *self.tagged_data_ptr.ref_count_ptr += 1;
        Self {
            .tagged_data_ptr = self.tagged_data_ptr,
        }
    }

There is also the issue of somebody passing the struct to a function, which creates an incompatible copy that points to the same resource without touching the count. It is a ticking time bomb. It also forces the user to ALWAYS use strongRef and weakRef or pointers just to pass them to other functions. What if the reference should live longer than the function using it, and you put a defer to deinit after strongRef? What about pointers? The point of shared_ptrs and refcounting is to more or less abstract the raw pointer away.

Referring to Zig Zen:
Reduce the amount one must remember. / Minimize energy spent on coding style.
Failure --> Always remember to defer. Remember the semantics of the type. Is it "ok" to copy the resource or is copying a user error? Remember to defer when acquiring resources (memory, files, locks, etc).

With RAII

C++-style RAII, or rather the notion of destructors/dropping à la Rust, could be useful when managing resources on the stack and graphs on the heap.

fn deferring() {
    var shared_ptr_res = %%SharedPtr(ResourceType).create(&debug.global_allocator);
    // No need for defer, things just "work"
    do_something(shared_ptr_res);
} // Deref and potentially deinit on scope exit

This approach would, however, force us to define semantics for moving and/or at least for copying.
There is also the problem of "magic": something happens under the hood while the name of the type or of the comptime type constructor doesn't "Communicate intent precisely." That forces us to think about how we name things, unless we use a keyword somewhere to create special types.

Communicate intent precisely.
Potential failure -> Hides details, forces naming convention on users. Could be solved by adding a keyword for interacting with "resources"

Reduce the amount one must remember.
Partial success -> No need to remember to use defer. RAII as a concept is simple, but it shifts the burden to knowing how a type works. E.g. shared_ptrs can create cycles, which have to be broken by weak_ptrs so that the objects can be destroyed.

Only one obvious way to do things.
Potential failure -> Defer can already do things like this and it is explicit and it also communicates intent quite precisely.
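The cycle problem mentioned under "Reduce the amount one must remember" can be sketched in C++ as follows. `Node`, `g_nodes_alive`, and the two functions are hypothetical names for illustration. The first function builds a cycle of owning pointers and shows that both nodes outlive their handles (it then breaks the cycle by hand so the demo itself does not leak). The second uses a `std::weak_ptr` for the back edge.

```cpp
#include <cassert>
#include <memory>

static int g_nodes_alive = 0;  // demo-only count of live Node objects

struct Node {
    std::shared_ptr<Node> strong_peer;  // owning edge
    std::weak_ptr<Node> weak_peer;      // non-owning edge
    Node() { g_nodes_alive += 1; }
    ~Node() { g_nodes_alive -= 1; }
};

// Two nodes own each other; returns how many are still alive once every
// external handle is gone.
int alive_after_strong_cycle() {
    std::weak_ptr<Node> observer;
    {
        auto a = std::make_shared<Node>();
        auto b = std::make_shared<Node>();
        a->strong_peer = b;
        b->strong_peer = a;  // cycle: neither count can reach zero
        observer = a;
    }
    int alive = g_nodes_alive;
    // Break the cycle by hand so this demo does not leak.
    if (auto a = observer.lock()) a->strong_peer.reset();
    return alive;
}

// Same shape, but the back edge is weak, so both nodes are destroyed.
int alive_after_weak_back_edge() {
    {
        auto a = std::make_shared<Node>();
        auto b = std::make_shared<Node>();
        a->strong_peer = b;
        b->weak_peer = a;  // does not keep `a` alive
    }
    return g_nodes_alive;
}
```

Knowing which edge must be weak is exactly the kind of type-specific knowledge that RAII alone does not remove.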

"Ideal situation"

Well-defined copy and move semantics, for more control over whether a type can be copied at all or moved after it is created.
Potentially, types can have constructor-like functions, destructors, and copy and move operators. Export C wrappers for creating, destructing, copying, and moving from initialized memory to other initialized memory with correct semantics, so that the same features can be used, albeit manually, from the C side too.
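As a point of reference for "whether a type can be copied at all", C++'s `std::unique_ptr` is a type that can be moved but not copied. The sketch below checks that at compile time and shows a move transferring ownership. `Resource`, `consume`, and `move_transfers_ownership` are illustrative names only.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>

struct Resource {
    int value = 0;
};

// The type itself states its semantics: movable, not copyable.
static_assert(!std::is_copy_constructible<std::unique_ptr<Resource>>::value,
              "unique_ptr must not be copyable");
static_assert(std::is_move_constructible<std::unique_ptr<Resource>>::value,
              "unique_ptr must be movable");

// Taking the pointer by value means the caller has to give up ownership.
int consume(std::unique_ptr<Resource> owned) {
    return owned->value;
}  // `owned` is destroyed here and frees the Resource

// Returns true if each move left its source empty and the value arrived.
bool move_transfers_ownership() {
    auto original = std::make_unique<Resource>();
    original->value = 1234;
    std::unique_ptr<Resource> moved = std::move(original);
    bool source_emptied = (original == nullptr);
    int seen = consume(std::move(moved));
    return source_emptied && moved == nullptr && seen == 1234;
}
```

An accidental copy such as `auto copy = moved;` is rejected by the compiler, which is the kind of guarantee the plain struct copy in the RefCounted example cannot give.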

Use cases for both defer and RAII-based resource management, either in line with Zig Zen or perhaps breaking "Only one obvious way to do things."

@kyle-github

@Ilariel mostly agreed. I think the discussion in #494 is closer to Zig Zen than this. Magic at a distance is still magic.

Where I am not sure I agree is with defer. That is only within the current scope so if you want to save something into a struct pointer passed into a function you need something like # from #494.

Good discussion of RAII.

Note that there is another approach which is that used by Python in the with statement. Some variation on that could be used, but really only covers the defer case as far as I can see.

@andrewrk andrewrk modified the milestones: 0.2.0, 0.3.0 Oct 19, 2017
@andrewrk andrewrk modified the milestones: 0.3.0, 0.4.0 Feb 28, 2018
@andrewrk andrewrk modified the milestones: 0.4.0, 0.5.0 Feb 5, 2019
@andrewrk andrewrk modified the milestones: 0.5.0, 0.6.0 Sep 11, 2019
@andrewrk andrewrk modified the milestones: 0.6.0, 0.7.0 Oct 23, 2019
@kayomn

kayomn commented Apr 11, 2020

I think RAII is sound in theory but lackluster in most implementations. defer is a huge step in the right direction but what I think this situation might call for is a "transitive defer".

@bind(self, @onDestruct, self.release);
@bind(self, @onCopy, self.strongRef);

This would add a new built-in function that would allow you to specify at a call site what intrinsic behaviors an instance of a structure has. These behaviors would transfer to every copy of the original data.

The semantics for these built-in events would map to a more conventional C++ approach like so:

  • onDestruct -> automatic destructor invocation upon leaving scope.
  • onCopy -> automatic copy constructor invocation upon passing by value.
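For readers less familiar with the C++ side of this mapping, the sketch below shows the two hooks on a hand-written handle type. It only counts live handles and frees nothing. `Counter`, `Handle`, `refs_seen_by`, and `hooks_demo` are hypothetical names. In C++ these calls are chosen statically from the handle's type, so no vtable is involved.

```cpp
#include <cassert>

// Shared bookkeeping that every Handle points at (demo only).
struct Counter {
    int refs = 0;
};

class Handle {
public:
    explicit Handle(Counter &counter) : counter_(&counter) {
        counter_->refs += 1;
    }
    // Counterpart of "onCopy": runs whenever a Handle is copied by value.
    Handle(const Handle &other) : counter_(other.counter_) {
        counter_->refs += 1;
    }
    // Copy assignment keeps the counts balanced when a handle is re-pointed.
    Handle &operator=(const Handle &other) {
        other.counter_->refs += 1;
        counter_->refs -= 1;
        counter_ = other.counter_;
        return *this;
    }
    // Counterpart of "onDestruct": runs when a Handle leaves scope.
    ~Handle() { counter_->refs -= 1; }

private:
    Counter *counter_;
};

// Passing by value copies the handle, so the count is higher inside the call.
int refs_seen_by(Handle /*copy*/, const Counter &counter) {
    return counter.refs;
}

// Returns the count observed inside the call; checks that it drops afterwards.
int hooks_demo() {
    Counter counter;
    Handle original(counter);  // refs == 1
    int during_call = refs_seen_by(original, counter);
    assert(counter.refs == 1);  // the by-value copy has been destroyed
    return during_call;
}
```

The difference from the proposal above is that C++ attaches these behaviors to the type definition, whereas `@bind` would attach them to an instance at a call site.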

I'm trying to think of how you would implement such a thing without involving vtables or having a very competent static analyzer but am currently drawing blanks.

@andrewrk
Member Author

andrewrk commented Oct 9, 2020

I don't think there is anything actionable here.

@andrewrk andrewrk closed this as completed Oct 9, 2020