change pointers to zero-bit types to be actual pointers with addresses #6706

tadeokondrak · 2020-10-17T02:06:06Z

I don't think this optimization pulls its weight.

With #6432, you can't cast *[1]u8 to *[0]u8.
I think that's a reasonable cast, and the standard library relies on this working.

Some special cases apply to all zero-bit types and should still be handled by generic code, but generic code that uses a type solely through a pointer currently has to deal with them as well.

I think the main user of the void type in generic code is HashMap, and it doesn't seem to use *V anywhere.

tadeokondrak · 2020-10-17T02:09:45Z

Regarding the actual value of the pointer, it should be legal to have any number but 0, since the compiler won't let you actually dereference it. To get a pointer to one, you could use @intToPtr(*T, 1).
As far as I know this is what Rust does.
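
A minimal sketch of that idea, assuming a recent Zig release in which `@intToPtr` is spelled `@ptrFromInt` and pointers to zero-bit types carry real addresses. The helper name `danglingPtr` is hypothetical and is not part of the standard library or this proposal; it uses the type's alignment as the address so the result is both nonzero and properly aligned.

```zig
const std = @import("std");

// Hypothetical helper (not from the standard library): returns a non-null,
// properly aligned pointer to a zero-sized type. The address is meaningless;
// it only has to be nonzero and a multiple of the alignment.
fn danglingPtr(comptime T: type) *T {
    // Reject types that actually occupy memory at compile time.
    comptime std.debug.assert(@sizeOf(T) == 0);
    return @ptrFromInt(@alignOf(T));
}

test "dangling pointer to a zero-bit type is non-null and aligned" {
    const p = danglingPtr(void);
    try std.testing.expect(@intFromPtr(p) != 0);
    try std.testing.expect(@intFromPtr(p) % @alignOf(void) == 0);
}
```

Using the alignment rather than the literal 1 keeps the sketch valid for zero-sized types whose alignment is greater than 1.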

Snektron · 2020-10-17T02:21:00Z

I would also like to note that this optimization makes Zig more complex, and makes it harder to communicate intent.

foobles · 2020-11-06T04:43:35Z

I totally agree with this. It's such an edge-case optimization, and it reduces the general consistency of the language. It also forces the compiler to handle a lot of special cases depending on whether or not a pointer is zero-sized.

It would make sense that, for any type T with size 0, any non-null pointer with proper alignment is a valid pointer to that type (this is what Rust does).

katesuyu · 2020-11-06T21:50:16Z

Another case where this optimization is fairly annoying is generic code that performs type erasure and requires an opaque "context pointer." In such a case, the 8 bytes are already "wasted", so the optimization does not help. In practice, it simply leads to special-cased boilerplate involving @bitSizeOf(T) == 0.

I argue that, in practice, this optimization is so rare that it causes more special casing to sidestep it than we'd have to apply to generic code upon removal of the optimization. By the way, what examples do we even have of this optimization generating better code in the standard library?

N00byEdge · 2020-11-20T17:12:49Z

What does this mean for addresses of zero-sized struct members? Are a and b defined to have the same address here, or will that be undefined?

pub const S = struct {
    a: u0,
    b: u8,
};

SpexGuy · 2020-11-21T21:22:27Z

Some notes on this decision:

The original motivation behind zero-sized pointers was not so much an optimization, but the result of an observation that a pointer is how you find a thing, and since zero-sized values can be anywhere, you don't need any information to find one. The insight that we found most persuasive in reversing this is that pointers are also how you find a part of something larger, and you do need information to find a zero-sized piece of a larger structure or array.

Since this is a big change for the language, @andrewrk has suggested that work on this issue be put in a branch to make sure there aren't any major unforeseen issues. For now I'm going to close or accept any design issues related to this, but leave any stage 1 bugs involving zero-sized types open until the temporary branch is merged into the main line. We can probably close all of these when that happens:
#6983, #6951, #6947, #6937, #6936, #6861, #4282, #4246, and #3610.

Multiple zero-sized values may share the same address. In particular, this means that the following loop is not safe with zero sized types:

var curr = slice.ptr;
const end = slice.ptr + slice.len;
while (curr != end) : (curr += 1) { ... }

This will execute zero times with a zero sized type, because all elements in the slice share the same address.
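
One way around this is to iterate by index instead of comparing addresses. The following is only a sketch: `countByIndex` is a hypothetical helper written for illustration, not something from the standard library or from this proposal. Because the loop bound comes from `slice.len`, it visits every element even when the element type is zero-sized.

```zig
const std = @import("std");

// Hypothetical helper: counts the elements of a slice by index.
// The loop condition never looks at element addresses, so it does not
// matter whether all elements share one address.
fn countByIndex(comptime T: type, slice: []const T) usize {
    var n: usize = 0;
    var i: usize = 0;
    while (i < slice.len) : (i += 1) {
        n += 1;
    }
    return n;
}

test "index-based loop visits every zero-sized element" {
    const zeros = [_]u0{ 0, 0, 0 };
    try std.testing.expectEqual(@as(usize, 3), countByIndex(u0, &zeros));
}
```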

No guarantees are made about the addresses of globals or stack variables that are zero-sized, except that those addresses are nonzero and have the correct alignment. Pointers to zero-sized fields or array items are pointers to their parent with the appropriate offset. So for @N00byEdge's example:

pub const S = struct {
    a: u0,
    b: u8,
};

Zig makes no guarantees about the order of fields in a bare struct, but the address of a is guaranteed to be within the range of the instance of S. It may be at the beginning or end of S, or between fields if there were more nonzero-sized fields. But it can't be inside of a larger field. If S were a packed struct, where ordering is guaranteed, a and b would both have the same address as the parent instance.

zigazeljko · 2020-11-28T17:56:29Z

The important observation here is that:

(a) a pointer is how you find a thing, and since zero-sized values can be anywhere, you don't need any information to find one.

and

(b) pointers are also how you find a part of something larger, and you do need information to find a zero-sized piece of a larger structure or array.

This proposal considers all pointers to fall into category (b), but what if we made a distinction in the type system between these two categories?

Assuming attribute noexpand stands for (a) and expand for (b), we would have:

@sizeOf(*noexpand u0) == 0
@sizeOf(*expand u0) != 0

While it is most evident with zero-sized types, this distinction is also important for non-zero-sized types. Consider this example:

fn foo(str: *const [4]u8) void { ... }

fn bar() void {
    foo("somelongstring"[0..4]);
}

Is the compiler here allowed to perform the following optimization?

fn bar() void {
    foo("some");
}

Currently, no, since the compiler can't know if str falls into category (a) or (b).

With these attributes, we could declare foo as fn foo(str: *noexpand const [4]u8) void, which would allow the optimization above to happen.

ghost · 2020-12-11T10:08:20Z

Sounds like this issue would close #2325 as well.

tadeokondrak · 2020-12-14T17:37:21Z

@zigazeljko,

I think what you're proposing would work, but it'd still keep a lot of the complexity of zero-sized pointers in the language.
I haven't seen any real code where zero-sized pointers would provide a benefit, so I think the optimization isn't worth it even if it's turned off by default.

* LLVM backend: The `alloc` AIR instruction as well as pointer constants which point to a 0-bit element type now call a common codepath to produce a `*const llvm.Value` which is a non-zero pointer with a bogus-but-properly-aligned address.
* LLVM backend: improve the lowering of optional types.
* Type: `hasCodeGenBits()` now returns `true` for pointers even when it returns `false` for their element types.

Effectively, #6706 is now implemented in stage2 but not stage1.

andrewrk · 2022-08-23T05:19:59Z

This is implemented in self-hosted which is now the default compiler.

tadeokondrak changed the title ~~Reconsider whether pointers to zero-bit types should be zero-bits~~ Reconsider whether pointers to zero-bit types should be zero bits Oct 17, 2020

andrewrk added the proposal This issue suggests modifications. If it also has the "accepted" label then it is planned. label Oct 17, 2020

andrewrk added this to the 0.8.0 milestone Oct 17, 2020

andrewrk added the accepted This proposal is planned. label Nov 19, 2020

andrewrk changed the title ~~Reconsider whether pointers to zero-bit types should be zero bits~~ change pointers to zero-bit types to be actual pointers with addresses Nov 19, 2020

foobles mentioned this issue Nov 28, 2020

[incomplete] Remove Zero-Sized Pointers (now all pointers hold addresses) #7246

Closed

SpexGuy mentioned this issue Dec 12, 2020

Zero sized string having address null have unexpected results when slicing and taking the pointer #1831

Closed

SpexGuy mentioned this issue Apr 28, 2021

std/ArrayList: Allow ArrayList(u0) to be created #8632

Merged

andrewrk modified the milestones: 0.8.0, 0.9.0 May 19, 2021

ikskuh mentioned this issue Nov 5, 2021

Special code is needed for zero-sized types. ikskuh/any-pointer#3

Open

andrewrk modified the milestones: 0.9.0, 0.10.0 Nov 20, 2021

andrewrk mentioned this issue Jan 7, 2022

Stage2 bit_shifting.zig passing #10532

Merged

perillo mentioned this issue May 11, 2022

read, etc. should check slice.len == 0 #11604

Closed

This was referenced Jul 5, 2022

Using @TypeOf for peer type resolution can trigger false dependency cycles #12000

Closed

Stage2 validate extern types #12075

Merged

Vexu mentioned this issue Jul 15, 2022

no align available for type '.cimport:1:15.struct_google_protobuf_Timestamp' #12122

Closed

martinhath mentioned this issue Aug 17, 2022

stage2 segfault slicing zero length array field of struct #11787

Closed

andrewrk closed this as completed Aug 23, 2022

ethernetsellout mentioned this issue Oct 30, 2022

stage2: miscompilation w/ pointers to zero sized types #13363

Closed

Vexu mentioned this issue Dec 30, 2022

Assertion failed at analyze.cpp:605 in get_pointer_to_type_extra2 #4246

Closed

nektro mentioned this issue Jan 8, 2023

Comptime slice of undefined zero-sized array pointer causes compile error #6936

Closed

nektro mentioned this issue Jul 19, 2023

Why isn't using *void in a struct field an error? #16444

Closed
