Constness inconsistencies in the standard library #9814
Comments
Related to #1629 |
I made a proposal that deinit (which would also apply to close) never invalidate here: #6322, but it was rejected. So the current answer to your question is that deinit/close should always invalidate, so However I'm not convinced this is the right course of action yet.
My argument for this is that allowing types to be const makes code simpler. It avoids the problems with const propagation, reduces the cognitive load of mutable state, makes it clearer what the object is doing without having to read its source, etc. I argue these make the code more "readable". The counter-argument is that by requiring all instances to be mutable, the code can always invalidate during cleanup and will catch more bugs at runtime. So what's more important, code readability or always having these runtime checks?
I also argue that in 90% of cases cleanup is done inside a defer block, which makes the invalidation much less useful since you can't access the data anyway. In the cases where invalidation is useful, developers still have the option to make a mutable version that does invalidate. The argument against this is that it requires two ways to clean up an object, for example you would need both a
Andrew has decided that losing the ability to use const is less important than gaining the ability to invalidate all instances of an object. However, this would also imply that some of the std library needs to change, for example the Allocator interface would need to modify the
const slice = try allocator.alloc(u8, 100);
defer allocator.free(slice);
You would now have to do this:
var slice = try allocator.alloc(u8, 100);
defer allocator.free(&slice);
I've asked Andrew to clarify if I understand his reasoning and whether this implication is consistent with it: #10213 (comment) |
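The two call-site shapes in the comment above can be sketched roughly as follows, assuming a recent Zig in which std.mem.Allocator is passed by value. freeAndInvalidate is a hypothetical helper written for this illustration, not part of the standard library; it only shows what an invalidating free might look like and why it forces the caller to hold a var.

```zig
const std = @import("std");

// Hypothetical helper (an assumption, not a std API): frees the memory and
// then overwrites the caller's slice so later use is easier to catch.
fn freeAndInvalidate(allocator: std.mem.Allocator, slice: *[]u8) void {
    allocator.free(slice.*);
    slice.* = undefined;
}

test "non-invalidating free works with a const slice" {
    const allocator = std.testing.allocator;
    const slice = try allocator.alloc(u8, 100);
    defer allocator.free(slice);
    try std.testing.expect(slice.len == 100);
}

test "invalidating free requires a var slice" {
    const allocator = std.testing.allocator;
    // Must be var: the helper needs a mutable pointer to the slice itself.
    var slice = try allocator.alloc(u8, 100);
    defer freeAndInvalidate(allocator, &slice);
    try std.testing.expect(slice.len == 100);
}
```

The only difference at the call site is const versus var plus the address-of operator, which is the cost being debated in this thread.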
Quick follow up. The |
@marler8997 I've been following those issues as I've seen them pop up, and I am in agreement with Andrew on
I agree with you in theory, but Zig currently does not make strong guarantees that
const T = struct {
    counter: u32,
    ptr: *u32,
    fn create(self: *T) void {
        self.counter = 0;
        self.ptr = &self.counter;
    }
    fn bump(self: T) u32 {
        self.ptr.* += 1;
        return self.ptr.*;
    }
};
var x: T = undefined;
T.create(&x);
const y = x;
assert(y.counter == 0);
_ = y.bump(); // uh oh this was supposed to be const
assert(y.ptr.* == 1); // y.ptr still aliases x.counter, so the change is visible through the const y
While I don't think this is a particularly motivating example yet, it will be very easy if #2765 is implemented since you'll be able to skip Even without such a contrived example, I would still make the argument that types like
Pointer aliasing does allow you to observe the invalidation elsewhere, something I've run into while writing multithreaded code. But this is probably a good point otherwise, since programs generally shouldn't be designed in a way that allows you to access a destroyed resource.
I strongly agree. This is something that should be avoided as much as possible IMO. There are already types that make this sort of distinction, e.g.
This is true for most
True, but my issue is that methods taking const self allow observable state to change. I would make the argument that reading from a file should be done through
Maybe. But I'm more generally interested in what (if any) semantic meaning |
The semantic meaning of |
I disagree with the idea that a handle should be mutable because the state that the handle represents is mutable. We don't do this with pointers so why do we do this with handles? This would make sense if
var i: usize = 10;
array[i] = 100;
Can you imagine writing a function that takes an index to an array and taking a mutable pointer to it, not because you are writing to the index but because the array is mutable?
fn modifySlice(slice: []u8, index: *usize) void {
    slice[index.*] = 100;
}
Furthermore, regardless of whether a handle represents a "mutable object", the mutability of the handle itself has a semantic difference. Having a single handle value that can change is very different from one that can't, and that mutability objectively adds to the cognitive burden when working with that value. In Zig const is not transitive; using a convention to pretend it is only makes code more confusing in my experience. |
Does this meaning also extend to C APIs that use opaque handles? I've seen it done both ways, e.g.
The reason I opened this issue is because there are cases where it appears zig does this with pointers, but perhaps I read too much into it and there's not anything more to the inconsistency than "sometimes we invalidate objects when destroyed, sometimes we don't."
I draw the line at handles that manage a resource. An index is just an offset into a memory resource managed by a pointer, so it doesn't need to be mutable itself.
Like I mentioned above, I'm more confused by the fact that I can have an apparently "const" |
I'm not sure how you can draw a line between array indices into arrays of objects and handles in this regard. It would be like drawing a line between an "array index" and an "integer": array indices are integers. Array indices are one of the most common forms of handle implementation; it's what Linux uses for file descriptors and what Andrew has suggested people use when replacing pointers with DOD-style array indices. In fact I argue that handles, pointers and array indices are all equivalent in that they are all values that represent objects that "live elsewhere". The argument being made here is that if the object these values represent is mutable, then the value should also be mutable. The problems this causes are somewhat mitigated when you consider a handle that lives inside a struct, because you can modify the
This being said, array indices, pointers and handles are obviously not identical in all respects. It's true that they are all values representing objects that live elsewhere, but they also have differences. So what I'm looking for is: what reasons that are unique to handles (and don't apply to pointers and array indices) justify requiring that handles be mutable if the underlying object has mutable state (like a file handle)? |
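As a rough illustration of the "index as handle" point above, here is a minimal sketch in which the handle value is const while the state it refers to is mutable. Handle and Pool are hypothetical names invented for this example; they do not correspond to any std type.

```zig
const std = @import("std");

// Hypothetical handle type: just an index into storage owned by a Pool.
const Handle = struct { index: usize };

// Hypothetical owner of the mutable state that handles refer to.
const Pool = struct {
    values: [4]u32 = [_]u32{0} ** 4,

    // Mutates the pool, so it takes the pool by mutable pointer.
    // The handle is passed by value and never changes.
    fn set(self: *Pool, h: Handle, v: u32) void {
        self.values[h.index] = v;
    }

    fn get(self: *const Pool, h: Handle) u32 {
        return self.values[h.index];
    }
};

test "a const handle can refer to mutable state" {
    var pool = Pool{};
    const h = Handle{ .index = 2 };
    pool.set(h, 100);
    try std.testing.expectEqual(@as(u32, 100), pool.get(h));
}
```

In this sketch mutability is expressed on the thing that owns the state (the pool), not on the value that names it, which mirrors how a slice and an index are treated.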
I generally agree with this statement, but it is a somewhat simplified model. I would take it even further and make the claim that handles are the abstraction over pointers, indices, etc. First of all let me define what I mean when I use these terms:
Pointers are handles, file descriptors are handles, etc. But the key difference is that pointers are implicitly paired with their context (memory) and once you have one, you can access the resource the pointer refers to freely. Other handles do not have this property. For example an array index is useless without a base pointer, a file descriptor is useless without an OS coordinating access to the filesystem, and so on. Because pointers can be freely accessed, it makes sense to put a Anyways bringing this back from the abstract and into Zig-land... I admit there is not a clean 1:1 mapping of this idea. |
I'm also in the camp of However, that doesn't actually prevent us from invalidating in
fn deinit(self: *const @This()) void {
    self.allocator.free(self.items);
    discardConst(self).* = undefined;
}
This nearly gets us the best of both worlds. Nearly all structures, that you can
const stdout: fs.File = ...;
fn main() void {
    // Do all the io
    // I am now done with all the io. I can close stdout to communicate with the receiving process that I am done.
    stdout.close();
    // Do the rest
}
Here, if |
I'm not entirely confident either way and I'm willing to walk back the current convention, which is leaning towards mutable self pointer for deinit. I do think it makes a lot of sense in a lot of cases but I can see the value in e.g. std lib File being immutable. Note that for File in particular, we might want mutability for event loop purposes but that remains to be seen for certain. |
Reading the standard library is, in my opinion, the best way to learn Zig currently. However, I have noticed that a couple of similar APIs differ in how self is passed to methods. This has caused me significant confusion when it comes to designing my own APIs, because it's not clear whether this is a semantic choice, a practical one, or neither.
In my mind there are two types of state:
Zig's notion of const only enforces physical immutability; to achieve the latter you must design it into your API.
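A minimal sketch of that distinction, assuming "physical" refers to the bytes of the value itself and "logical" refers to state that is observable through it (my reading of the terms used here). Counter is a hypothetical type made up for the illustration.

```zig
const std = @import("std");

// Hypothetical type: the struct only holds a pointer, so its own bytes
// never change even when the state it exposes does.
const Counter = struct {
    value: *u32,

    // Takes self by value (immutable), yet changes observable state
    // through the pointer field.
    fn bump(self: Counter) u32 {
        self.value.* += 1;
        return self.value.*;
    }
};

test "const protects the struct's bytes, not what it points to" {
    var storage: u32 = 0;
    const c = Counter{ .value = &storage };
    try std.testing.expectEqual(@as(u32, 1), c.bump());
    try std.testing.expectEqual(@as(u32, 1), storage);
}
```

The compiler accepts bump on a const Counter because nothing inside c is written; any promise that c "does not change" has to come from the API design.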
Consider:
std.fs.Dir.close(self: *Dir): closes the fd and sets self.* = undefined. This physically and logically mutates self.
std.fs.File.close(self: File): closes the fd and does nothing else. This is physically immutable, but logically mutates self.
Both of these functions do essentially the same thing: the fd is closed and the object goes into an unusable state afterwards, which implies mutation. File.close does not communicate this to the caller, but Dir.close does. Should it? One might argue that for a well-designed API, if any observable property of an object changes, you should pass *T, and T/*const T otherwise. It signals an intent to callers and can reduce the likelihood of API misuse. For example, you can freely pass around Files and know that they won't suddenly be closed in a far away part of your codebase (though I admit you cannot guarantee this in Zig).
Designing around logical state in Zig is hard. For one, we cannot ensure that a Dir or File isn't copied, so it doesn't matter whether we have a *File or a File: closing the fd closes it for all Files that share that fd. Another problem is that there's no way to prevent consumers of an API from inspecting changes in object state, since all struct fields are public. Since we cannot make guarantees about logical state, it's reasonable to concede that it's simply more practical to pass T even for operations that logically mutate an object.
Regardless of which approach is correct, the current state of things causes some differences in usability. Dir cannot be closed as easily:
I have not scanned the stdlib to find every example of this, however I have noticed that ArrayListAligned and ArrayListAlignedUnmanaged have the same by-value/by-pointer difference in their deinit methods.
EDIT: Also consider the issue of const propagation. Imagine that you have a type that contains an ArrayList, and you depend on ArrayList.deinit() taking itself by-value (so your type's deinit() method also takes itself by-value). It is entirely reasonable that at some point, you may want to change your type to use ArrayListUnmanaged internally. Suddenly, your deinit() method also needs to take self by-mutable-pointer. This propagates through your codebase; everywhere you're calling MyType.deinit() you need to make sure you have a non-const MyType. Types that contain MyType and deinit by-value will now also need to deinit by-mutable-pointer, which propagates to types containing those types and so on. You can see how this can quickly become a maintenance nightmare.
Should these APIs be consistent in the way they take self, and if so, which style is preferred?
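The by-value versus by-pointer difference, and the way it propagates to containing types, can be sketched as follows. ByValue, ByPointer and Wrapper are hypothetical stand-ins written for this example; they are not the real std.fs or ArrayList types, and a bool flag stands in for the underlying resource.

```zig
const std = @import("std");

// Hypothetical stand-in for the File.close style: self by value.
const ByValue = struct {
    open: *bool,

    fn close(self: ByValue) void {
        self.open.* = false;
    }
};

// Hypothetical stand-in for the Dir.close style: self by mutable pointer,
// so the handle itself can be invalidated.
const ByPointer = struct {
    open: *bool,

    fn close(self: *ByPointer) void {
        self.open.* = false;
        self.* = undefined;
    }
};

// Containing a ByPointer forces deinit to take *Wrapper as well:
// with `self: Wrapper`, self.inner would be const and close() would not compile.
const Wrapper = struct {
    inner: ByPointer,

    fn deinit(self: *Wrapper) void {
        self.inner.close();
    }
};

test "by-value close works on a const" {
    var flag = true;
    const f = ByValue{ .open = &flag };
    f.close();
    try std.testing.expect(!flag);
}

test "by-pointer close propagates the need for var" {
    var flag = true;
    var w = Wrapper{ .inner = .{ .open = &flag } };
    w.deinit(); // declaring w as const would be a compile error here
    try std.testing.expect(!flag);
}
```

Swapping the inner field from a by-value type to a by-pointer one changes the signature of Wrapper.deinit, and every holder of a Wrapper then needs a var, which is the propagation described in the EDIT above.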