change pointers to zero-bit types to be actual pointers with addresses #6706
Comments
Regarding the actual value of the pointer, it should be legal to have any number but 0, since the compiler won't let you actually dereference it. To get a pointer to one, you could use |
I would also like to note that this optimization makes Zig more complex, and makes it harder to communicate intent. |
I totally agree with this. It's such an edge-case optimization that reduces the general consistency of the language. Also, it forces the compiler to do a lot of special cases for whether or not a pointer is zero sized. It would make sense for any type |
Another case where this optimization is fairly annoying is generic code that performs type erasure and requires an opaque "context pointer." In such a case, the 8 bytes are already "wasted", so the optimization does not help. In practice, it simply leads to special-cased boilerplate involving I argue that, in practice, this optimization is so rare that it causes more special casing to sidestep it than we'd have to apply to generic code upon removal of the optimization. By the way, what examples do we even have of this optimization generating better code in the standard library? |
What does this mean for addresses of zero sized struct members? Are
|
Some notes on this decision:

The original motivation behind zero-sized pointers was not so much an optimization as the result of an observation: a pointer is how you find a thing, and since zero-sized values can be anywhere, you don't need any information to find one. The insight that we found most persuasive in reversing this is that pointers are also how you find a part of something larger, and you do need information to find a zero-sized piece of a larger structure or array.

Since this is a big change for the language, @andrewrk has suggested that work on this issue be put in a branch to make sure there aren't any major unforeseen issues. For now I'm going to close or accept any design issues related to this, but leave any stage 1 bugs involving zero-sized types open until the temporary branch is merged into the main line. We can probably close all of these when that happens:

Multiple zero-sized values may share the same address. In particular, this means that the following loop is not safe with zero-sized types:

    var curr = slice.ptr;
    const end = slice.ptr + slice.len;
    while (curr != end) : (curr += 1) { ... }

This will execute zero times with a zero-sized type, because all elements in the slice share the same address.

No guarantees are made about the address of globals or stack variables that are zero-sized, except that their values are nonzero and have the correct alignment.

Zero-sized variables that are fields or array items are pointers to their parent with the appropriate offset. So for @N00byEdge's example:

    pub const S = struct {
        a: u0,
        b: u8,
    };

Zig makes no guarantees about the order of fields in a bare struct, but the address of a is guaranteed to be within the range of the instance of S. It may be at the beginning or end of S, or between fields if there were more nonzero-sized fields. But it can't be inside of a larger field. If S were a packed struct, where ordering is guaranteed, a and b would both have the same address as the parent instance. |
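One way to sidestep the pointer-comparison problem described in the notes above is to iterate by index, since `slice.len` is tracked independently of element addresses. The following is only a minimal sketch: `countByIndex` is a hypothetical helper name, not something from the standard library, and the code assumes a reasonably recent self-hosted Zig compiler.

```zig
const std = @import("std");

// Hypothetical helper (not part of std): walks a slice by index instead of
// comparing element addresses, so it does not depend on each element having
// a distinct address.
fn countByIndex(comptime T: type, slice: []const T) usize {
    var count: usize = 0;
    var i: usize = 0;
    while (i < slice.len) : (i += 1) {
        _ = slice[i]; // element access is fine even when T is zero-sized
        count += 1;
    }
    return count;
}

test "index loop visits every zero-sized element" {
    const items = [_]void{ {}, {}, {} };
    // All three elements may share one address, but the length is still 3.
    try std.testing.expectEqual(@as(usize, 3), countByIndex(void, &items));
}
```

Running this with `zig test` on the file should exercise the loop. The design choice is simply that the loop bound comes from the length, which stays meaningful for zero-sized element types.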
The important observation here is that:
and
This proposal considers all pointers to fall into category (b), but what if we made a distinction in the type system between these two categories? Assuming attribute
While it is most evident with zero-sized types, this distinction is also important for non-zero-sized types. Consider this example:

    fn foo(str: *const [4]u8) void { ... }
    fn bar() void {
        foo("somelongstring"[0..4]);
    }

Is the compiler here allowed to perform the following optimization?

    fn bar() void {
        foo("some");
    }

Currently, no, since the compiler can't know if With these attributes, we could declare |
Sounds like this issue would close #2325 as well. |
I think what you're proposing would work, but it'd still keep a lot of the complexity of zero-sized pointers in the language. |
* LLVM backend: The `alloc` AIR instruction as well as pointer constants which point to a 0-bit element type now call a common codepath to produce a `*const llvm.Value` which is a non-zero pointer with a bogus-but-properly-aligned address.
* LLVM backend: improve the lowering of optional types.
* Type: `hasCodeGenBits()` now returns `true` for pointers even when it returns `false` for their element types.

Effectively, #6706 is now implemented in stage2 but not stage1.
This is implemented in the self-hosted compiler, which is now the default. |
I don't think this optimization pulls its weight.
#6239 #1553 #4537 #6700
With #6432, you can't cast `*[1]u8` to `*[0]u8`. I think that's a reasonable cast, and the standard library relies on this working.
There are special cases that happen for all zero-bit types and should still be handled by generic code, but when some generic code is using a type solely as a pointer, it still needs to deal with them.
I think the main user of the `void` type in generic code is HashMap, and it doesn't seem to use `*V` anywhere.