Something is wrong with arrays of weird-length primitives #26026

chethega · 2018-02-13T11:01:21Z

Discussion on https://discourse.julialang.org/t/odd-byte-length-primitive-types-and-reinterpret/9025/3.

Reproduce:

primitive type Int24 24 end
Int24(x::Int) = Core.Intrinsics.trunc_int(Int24, x)
Base.zero(::Type{Int24})=Int24(0)

V=Vector{Int24}(1000);
p=reinterpret(Ptr{UInt8}, pointer(V));
for i=1:sizeof(V) unsafe_store!(p+i, UInt8(i)) end

@show V[1:4];
#V[1:4] = Int24[Int24(0x020100), Int24(0x060504), Int24(0x0a0908), Int24(0x007f37)]

So we see that the arrayref behaves like Int24 had size 4 (tree bytes plus one padding). The same happens for pointerref:

@show unsafe_load(pointer(V,2));
@show unsafe_load(pointer(V)+3);
@show unsafe_load(pointer(V),2);
#unsafe_load(pointer(V, 2)) = Int24(0x050403)
#unsafe_load(pointer(V) + 3) = Int24(0x050403)
#unsafe_load(pointer(V), 2) = Int24(0x060504)

Of course that means that we see oob memory and corrupt on writes:

@show V[length(V)];
# V[length(V)] = Int24(0x007f37)
V=zeros(Int24, 1000)
#signal (11): Segmentation fault

Tested both on 0.62 and current master 0.7.

The text was updated successfully, but these errors were encountered:

elextr · 2018-02-13T13:14:02Z

Is it because Vector has a layout that matches C layout which requires aligments to be a power of 2, so odd element lengths get padded.

chethega · 2018-02-13T14:50:46Z

Yes, something definitely gets confused about alignment. sizeof(Int24)=3 and sizeof(Tuple{Int24})=4 and sizeof(Tuple{Int8,Int8,Int8})=3 and

struct baz
       x::Int8
       y::Int8
       z::Int8
       end
@show sizeof(baz);
#sizeof(baz) = 3

struct foo 
       x::Int24
       end
@show sizeof(foo)
#sizeof(foo) = 4

struct foo2
x::Int24
y::Int8
end
@show sizeof(foo2);
#sizeof(foo2) = 4

So we see that Int24 requires an alignment of 4 bytes and has a size of 3. This means that it should have size 4 in isolation / in arrays, and size 3 when packing structs.

Since weird-size primitives are ultimately only good for alignment/size hacking of structs, I kinda would prefer more explicit placement control for structs (optional syntax to just specify what goes where) and deprecation of primitive type.

StefanKarpinski · 2018-02-13T14:53:26Z

Why is deprecation of primitive type necessary? How else are we going to bootstrap the language?

chethega · 2018-02-13T15:27:33Z

Why is deprecation of primitive type necessary? How else are we going to bootstrap the language?

It is not necessary, but might be convenient. I don't know how to bootstrap instead; use builtins? Just don't export primitive type to users?

I mean: What can users possibly do with primitive type that is impossible with structs?

The only example I see is foo2 given above: The struct wastes no padding and allows aligned access to x. Do you see other advantages?

Disadvantage of primitive type: Complexity in codegen. As long as user-definable primitive types of weird length exist, codegen needs to cope with weird cases where the size in isolation (4 bytes) differs from the size in structs (3 bytes, but 4-byte-aligned).

The simplest possible fix would be to just not export the creation of primitive types; then codegen only has to deal correctly with primitive types that actually exist in base ("it's a feature, not a bug").

As an apology to bit-fiddlers one could allow more low-level struct definitions; I think these could replace all current uses of user-defined primitive types.

elextr · 2018-02-14T01:20:29Z

Yes, something definitely gets confused about alignment. sizeof(Int24)=3 and sizeof(Tuple{Int24})=4 and sizeof(Tuple{Int8,Int8,Int8})=3 and

Chips generally have loads for power of two sizes and load power of two aligned data faster than unaligned data, and in some cases require alignment, so members of composites (in all languages, not just Julia) tend to be aligned by default. Some languages do allow optional packing on unaligned boundaries, but its never the default for those performance reasons. So any composite that is sharable with C/C++/Fortran must be aligned and AFAICT Julia does not support packed composites anyway.

Alignment is rounded up to the next power of two. So in a composite of Int8s each one is on a power of two boundary (1) and in a composite an Int24 is padded to a power of two boundary (4).

Note that although an object of Int24 has a size of 3 it will always be allocated a power of two aligned space in the heap or stack, and so will the next object in memory, so your Int24 still use (at least) 4 bytes, its just you can't see the padding.

As for complicating codegen, the way it is its always aligned, so its actually simpler since it matches the chips.

(cherry picked from commit 350f514)

rfourquet mentioned this issue Sep 5, 2018

Corruption/segfault with Array of primitive type with (some) non-power of 2 length #29053

Closed

JeffBezanson added the bug Indicates an unexpected problem or unintended behavior label Sep 5, 2018

felipenoris mentioned this issue Sep 14, 2018

Segmentation fault when custom primitive type is put into a vector #29193

Closed

chethega mentioned this issue Apr 10, 2019

Add primitive type tutorial blog JuliaLang/www.julialang.org#321

Closed

JeffBezanson added a commit that referenced this issue Sep 17, 2019

Array fixes for non-power-of-2 sized elements, fixes #26026

2b6ddaf

JeffBezanson added a commit that referenced this issue Sep 19, 2019

Array fixes for non-power-of-2 sized elements, fixes #26026

ba2f6cc

JeffBezanson added a commit that referenced this issue Sep 19, 2019

Array fixes for non-power-of-2 sized elements, fixes #26026

e29fd4a

JeffBezanson added a commit that referenced this issue Sep 20, 2019

Array fixes for non-power-of-2 sized elements, fixes #26026

6c13b63

JeffBezanson closed this as completed in 350f514 Oct 1, 2019

JeffBezanson added a commit that referenced this issue Nov 7, 2019

Array fixes for non-power-of-2 sized elements, fixes #26026 (#33283)

676aa50

(cherry picked from commit 350f514)

JeffBezanson added a commit that referenced this issue Nov 8, 2019

Array fixes for non-power-of-2 sized elements, fixes #26026 (#33283)

58a8c4a

(cherry picked from commit 350f514)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Something is wrong with arrays of weird-length primitives #26026

Something is wrong with arrays of weird-length primitives #26026

chethega commented Feb 13, 2018

elextr commented Feb 13, 2018

chethega commented Feb 13, 2018

StefanKarpinski commented Feb 13, 2018

chethega commented Feb 13, 2018

elextr commented Feb 14, 2018

Something is wrong with arrays of weird-length primitives #26026

Something is wrong with arrays of weird-length primitives #26026

Comments

chethega commented Feb 13, 2018

elextr commented Feb 13, 2018

chethega commented Feb 13, 2018

StefanKarpinski commented Feb 13, 2018

chethega commented Feb 13, 2018

elextr commented Feb 14, 2018