Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make an isdense trait #54715

Open
LilithHafner opened this issue Jun 6, 2024 · 8 comments
Open

Make an isdense trait #54715

LilithHafner opened this issue Jun 6, 2024 · 8 comments
Labels
arrays [a, r, r, a, y, s]

Comments

@LilithHafner
Copy link
Member

Density is orthogonal to many types (e.g. StaticArray, CodeUnits, SubArray, etc) so it makes more sense as a trait instead of an abstract type.

As a starting point,

isdense(::DenseArray) = true
isdense(::Array) = false
isdense(::String) = true # Maybe

Maybe also applies to strings?

See also:

@Seelengrab
Copy link
Contributor

Duplicate of #54581?

@jariji
Copy link
Contributor

jariji commented Jun 6, 2024

From Core.DenseArray

The elements of a dense array are stored contiguously in memory.

isdense isn't specific enough a term, imo. I'd consider contiguous_storage or contiguous_memory or is_memory_contiguous.

@jakobnissen
Copy link
Contributor

A few comments off the top of my head:

Denseness should also be documented to say that it has a stride of 1. E.g. dense things should not be stored in reverse order, such that subsequent elements are placed on lower memory addresses.

My proposal in #54581 is related. However, in that proposal, I don't include a trait like isdense. Such a trait would make sense both with and without the existence of a memory view type.
However, I propose the IsMemory trait, which signifies that an object is equal to its memory. For example, a dense view of a Vector is equal to a Memory with the same element type and content. In contrast, a string is not equal to the content of its memory. Which sense of the word "dense" is meant here? Merely that the data of an object is stored contiguous in memory, or also that the object is semantically equal to its own memory?
The distinction is important, because:

  • The former definition includes strings, and as such means something like "I can take a pointer to this object and copy it". However, because it carries little semantic meaning, there isn't much else you can do with this isdense definition.
  • The latter definition allows you to treat the memory of the object as actually being the object (i.e. the object is an AbstractArray and is interchangeable with a Memory object pointing to the same memory), which then again allows you to e.g. pass the object into C functions. This is more useful, but it then leaves strings a little in the dust.

I think the latter, more restrictive, IsMemory-like definition is the most useful, in which case we would also need to say that an object implementing isdense should be an AbstractArray and equal to Memory with the same content.
Can anyone think of a counterexample, where it would be useful to have an object which is not IsMemory, but still isdense, and where such an object wouldn't be handled in a custom function anyway, such that its implementation of isdense is actually used for dispatch?

@nsajko nsajko added the arrays [a, r, r, a, y, s] label Jun 8, 2024
@nhz2
Copy link
Contributor

nhz2 commented Jun 11, 2024

If isdense(A) returns true, I think that should guarantee write(io::IOBuffer, A) writes the same bytes out as GC.@preserve A unsafe_write(io::IOBuffer, pointer(A), UInt(elsize(A)) * UInt(length(A))).

This could include Matrix{Float64}, even though it might not be isequal equal to a Memory{Float64} with the same content, due to the different shape.

Though I think also restricting isdense to only AbstractVector with one-based indexing could be more useful.

@Seelengrab
Copy link
Contributor

I think that should guarantee write(io::IOBuffer, A) writes the same bytes out as GC.@preserve A unsafe_write(io::IOBuffer, pointer(A), UInt(elsize(A)) * UInt(length(A))).

I don't think that's possible, since that would write out SIMD alignment/padding in objects too.

@nhz2
Copy link
Contributor

nhz2 commented Jun 12, 2024

What about if isdense(A) returning true guarantees:

@assert isbitstype(eltype(A))
Base.require_one_based_indexing(A)
B = Memory{eltype(A)}(undef, length(A))
GC.@preserve A B unsafe_copyto!(pointer(B), pointer(A), length(A))
@assert isequal(A, B)

I think this is the same as the IsMemory-like definition.

@oscardssmith
Copy link
Member

The first of these requirements definitely seems wrong.

@nhz2
Copy link
Contributor

nhz2 commented Jun 12, 2024

Without that there is a problem with Union vectors.

A = Union{Float64,Int64}[1,2.5,3]
B = Memory{eltype(A)}(undef, length(A));
GC.@preserve A B unsafe_copyto!(pointer(B), pointer(A), length(A));
isequal(A,B)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arrays [a, r, r, a, y, s]
Projects
None yet
Development

No branches or pull requests

7 participants