Improve Array{Union{T, Missing}} constructors #24939
Comments
I'm surprised to see that […]
AFAIK that behavior is needed because arrays of […] The problem is always the same: unless you return random contents, people will eventually depend on the implemented behavior even if it's not part of the public API. But returning random contents is not always the most efficient solution.
Ok, thanks for the clarification! I'm fine then with having consistent filling for […]
Modulo a few details, option three seems mostly in keeping with the direction in #24595? :)
Agree that option 3 seems like the way to go.
#25054 implements option 3.
I like option 3 because it shares the API of what I feel to be a more general solution here (#24595 (comment)), which works a bit like:

    out = Array{T,N}(input, dims)
    # is equivalent to
    out = Array{T,N}(uninitialized, dims)
    out .= input

Replace […]
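The equivalence described in this comment can be tried out with a small helper. This is only a sketch: `filled_array` is a hypothetical name, not a Base function, and it uses `undef`, the name that `uninitialized` was given in Julia 1.0.

```julia
# Hypothetical helper (not part of Base): emulates the suggested
# `Array{T,N}(input, dims)` by allocating first and broadcasting afterwards.
function filled_array(::Type{T}, input, dims::Int...) where {T}
    out = Array{T}(undef, dims...)  # `undef` replaced `uninitialized` in Julia 1.0
    out .= input                    # scalars are repeated, arrays are copied elementwise
    return out
end

m = filled_array(Union{String, Missing}, missing, 2, 3)  # 2×3, every entry `missing`
z = filled_array(Int, 0, 3)                              # three zeros
f = filled_array(Float64, [1, 2, 3], 3)                  # converted to Float64
```

Because the assignment goes through broadcasting, the same call shape covers both a scalar fill value and an array of matching size.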
Fixed by #25054
We need a way to construct `Array{Union{T, Missing}}` objects filled with `missing` easily. Here are three proposals, which can be seen either as complementary or as alternatives.

1. The Missings package provides the `missings(::Type, dims...)` function for that, which is really convenient since it avoids the need to write `Union` and `Missing`. It could make sense to add it to Base.

2. More generally, it would be nice if `Array{Union{T, Missing}}(uninitialized, dims...)` implemented a simpler behavior. Currently, it returns an array filled with `missing` for `isbits` element types, but an array full of `#undef` for other types.

   There are technical reasons for the first case of course. But it would sound reasonable to apply the same rule for all types, if only to make it easier to explain and remember. For example, in the section of the manual about missing values, it would be annoying to have to present one example for `isbits` types and another for other types.

   A possible rule would be that uninitialized arrays are filled with the first singleton type, "first" being defined not by the user-specified order but by the internal sorting of types. In terms of performance, I'm not sure whether it would have an impact. Maybe the current behavior allows using zeroed memory pages provided by `calloc`, which can be more efficient than filling them with a custom value?

3. Instead of `Array{Union{T, Missing}}(uninitialized, dims...)`, it seems appealing to be able to write `Array{Union{String, Missing}}(missing, dims...)`. If that was supported, it would be less of an issue that the rules determining the contents of the array created using `Array{Union{T, Missing}}(uninitialized, dims...)` are complex: it would only be used when you really want an uninitialized array (as the syntax indeed indicates).

PS: before somebody proposes it, using `fill` is not really an option since `fill(missing, 2)` gives an `Array{Missing}`, and the only alternative is very long: `fill!(Array{Union{T, Missing}}(uninitialized, dims...), missing)`.
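The options discussed in the issue can be compared side by side. The sketch below assumes Julia 1.0 or later, where `uninitialized` is spelled `undef` and the `missing` constructor from option 3 is available. `missings_sketch` is a hypothetical stand-in written for illustration; it is not the Missings package implementation.

```julia
# `fill` infers the element type from the value alone, so the result
# can only ever hold `missing`.
v = fill(missing, 2)

# The long workaround from the PS: allocate with the wanted element type,
# then overwrite every entry.
a = fill!(Array{Union{String, Missing}}(undef, 2, 3), missing)

# Option 3: pass `missing` where `undef` would otherwise go.
b = Array{Union{String, Missing}}(missing, 2, 3)

# Option 1, sketched: a helper so callers need not spell out `Union`/`Missing`.
# Hypothetical name; the real function is `Missings.missings`.
missings_sketch(::Type{T}, dims...) where {T} =
    fill!(Array{Union{T, Missing}}(undef, dims...), missing)

c = missings_sketch(String, 2, 3)

# Unlike `v`, the arrays `a`, `b` and `c` accept a `String` later on.
b[1, 1] = "hello"
```

All three of `a`, `b` and `c` have the same element type and start out entirely `missing`; they differ only in how much the caller has to type.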