var and std do not work for Any[] #8319

Closed
jakebolewski opened this issue Sep 12, 2014 · 32 comments
Labels: bug, needs decision

Comments

@jakebolewski
Member

All other statistics functions in Base work for Any arrays of numeric values.

julia> test = Any[1,2,3]
3-element Array{Any,1}:
 1
 2
 3

julia> std(test)
ERROR: `zero` has no method matching zero(::Type{Any})
 in var at statistics.jl:162

julia> var(test)
ERROR: `zero` has no method matching zero(::Type{Any})
 in var at statistics.jl:162
@jakebolewski changed the title from "var and std do not work for Any arrays" to "var and std do not work for Any[]" on Sep 12, 2014
@eschnett
Contributor

Since var (and std) do not work for empty iterables anyway, the loop in "var" can be unrolled by one to avoid the need for calling zero.

@ivarne added the "help wanted" label on Sep 15, 2014
@timholy
Member

timholy commented Sep 21, 2014

Just to state the obvious, it's not limited to just Any: it's arrays of (non-numeric) immutables, arrays-of-arrays, etc.

@andreasnoack
Member

I like generic code, but how do you define the variance of an array of arrays?

@timholy
Member

timholy commented Sep 21, 2014

I guess it only makes sense for Vectors, not general arrays, where the obvious definition is the covariance.

It's also pretty clear what you want in the case of ColorValues, which is what started my interest in this issue.

@andreasnoack
Member

Yes, variance of a vector of vectors makes sense. What is ColorValue a subtype of?

@timholy
Member

timholy commented Sep 21, 2014

It's an abstract type defined in the Color package. This came up in JuliaImages/Images.jl#187.

@timholy self-assigned this on Sep 22, 2014
@jakebolewski
Member Author

Does it make sense to define this for all possible numeric types? What does the variance of a complex number even mean? I guess this is the tension with allowing the definition to be completely generic.

@StefanKarpinski
Member

According to Wikipedia, the variance of a complex random variable does make sense and is defined as E[(X - µ)(X - µ)'] where ' denotes conjugate transpose. So we could make this correct but I don't think it is right now since we're not taking the conjugate.
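A minimal sketch of why the conjugate matters, using made-up sample values and the n - 1 divisor of a sample variance (both are assumptions for illustration, not taken from the comment):

```julia
# Four complex samples placed symmetrically around the origin.
z = ComplexF64[1 + 1im, 1 - 1im, -1 + 1im, -1 - 1im]
n = length(z)
μ = sum(z) / n

# With the conjugate: each term is |x - µ|^2, which is real and nonnegative.
with_conj = sum((x - μ) * conj(x - μ) for x in z) / (n - 1)

# Without the conjugate: each term is (x - µ)^2, and the terms can cancel.
without_conj = sum((x - μ)^2 for x in z) / (n - 1)

with_conj, without_conj
```

For these samples the conjugated form gives a positive real value, while the unconjugated form sums to zero even though the data are clearly spread out.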

@johnmyleswhite
Member

I think variance is well-defined whenever there's an L2 norm and a definition of expectation.

@eschnett
Contributor

As I mentioned above, the code can easily be rewritten to avoid calling zero by restructuring the loop. This is a mechanical change that does not depend on any properties of the types involved.

The current code expects there to be a neutral element (zero) for the reduction operation, but only works if there is at least one element. In this case, one does not need the neutral element.
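The loop rewrite described above can be sketched as follows. This is not the code in statistics.jl: `var_nozero` is a hypothetical helper written in Julia 1.x syntax, and it assumes the input supports `first`, `length`, and iteration.

```julia
# Hypothetical helper: a two-pass sample variance that seeds each accumulator
# with the first element, so zero(eltype(xs)) is never called.
function var_nozero(xs)
    isempty(xs) && throw(ArgumentError("variance of an empty collection is undefined"))
    n = length(xs)

    # First pass: start the running sum from the first element.
    total = first(xs)
    for x in Iterators.drop(xs, 1)
        total += x
    end
    m = total / n

    # Second pass: start the sum of squared deviations the same way.
    ss = abs2(first(xs) - m)
    for x in Iterators.drop(xs, 1)
        ss += abs2(x - m)
    end
    return ss / (n - 1)
end

var_nozero(Any[1, 2, 3])   # 1.0
```

Because the accumulators take their type from the actual elements, the element type of the container (here `Any`) no longer matters.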

@timholy
Member

timholy commented Sep 22, 2014

It's not just the zero; in the example from Images, there is a zero(RGB{Ufixed8}), but because an RGB is not a Number, the argument-typing means there is no suitable definition of varm.

@StefanKarpinski
Member

What's the inner product on colors?

@timholy
Member

timholy commented Sep 22, 2014

In that case, the user "clearly" wants elementwise. It's basically the diagonal of the covariance=outer product (I would actually say it needs the outer product, not the inner product).

@jiahao
Member

jiahao commented Sep 22, 2014

From a strict, purely mathematical perspective, it is not possible to define a unique generalization of variance to objects with more than one component because there is no unique way to generalize the notion of "squaring" the random variable. So I would lean heavily in favor of defining var, std and the like only for real variables and let users define other methods suitable for their applications.

(Long rambling discussion warning)

For complex numbers the Wikipedia definition, taken literally, is either wrong or incomplete. You could define the variance of z as literally the expectation value E[ (z - E(z)) (z - E(z))* ], and the transposition on a scalar is trivial; this would generate a scalar, not a matrix. However, if we associate with z its Cartesian representation as a 2-vector (x, y), then it is still unclear what this definition means. (x, y) is real valued and the complex conjugation is a no-op, and you would get [ E(xx) E(xy) ; E(xy) E(yy)] as the variance (but that is really the covariance). Or maybe it's actually the diagonal of the covariance, which is yet a different matrix still. (So it's still unclear how you would write down a definition that has a nontrivial "conjugate transpose".) Furthermore you can also define yet more notions of variance that do not violate the basic properties of expectations, like E[(z - E(z))^2] or E[|z - E(z)|^2], which lead to yet different results.

Similarly for vector-valued quantities it is a mistake to assume that the covariance is the only generalization of the variance; you can also have the "inner product" version E(X' X) as opposed to the "outer product" E(X X') which is the covariance. Of course it isn't clear how useful the "inner product variance" is, but it is still a perfectly well-defined cumulant.
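A minimal sketch contrasting the two generalizations for a vector of vectors. The sample data are made up, and the samples are centered on their mean and divided by n - 1, which the comment above does not spell out:

```julia
using LinearAlgebra

samples = [[1.0, 2.0], [3.0, 2.0], [2.0, 5.0]]
n = length(samples)
μ = sum(samples) / n
centered = [x - μ for x in samples]

# "Outer product" version: a matrix, the sample covariance.
outer = sum(d * d' for d in centered) / (n - 1)

# "Inner product" version: a single scalar.
inner = sum(dot(d, d) for d in centered) / (n - 1)

# The scalar equals the trace of the matrix, and the elementwise variance
# is the diagonal of the matrix.
inner ≈ tr(outer), diag(outer)
```

The three candidates (full matrix, its diagonal, its trace) carry different amounts of information, which is one way to see why a single generic `var` method for vectors of vectors is hard to pick.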

@jiahao
Member

jiahao commented Sep 22, 2014

With regard to colors specifically, I'm concerned that applications taking means and variances of color objects may not be taking care of the underlying curvature of the space. (Which is yet another generalization of variance, to something like E[ x_μ x^μ ] = E[ x^μ g_{μν} x^ν].)

Presumably it is desirable for the results of arithmetic on colors to be independent of the working color space. If so, the issue that arises is that the only flat color space is XYZ (i.e. XYZ is a linear vector space), so addition and multiplication in XYZ do not require additional curvature corrections, but they would be required for all other color spaces, which are not linear vector spaces, but manifolds with nontrivial curvature. Otherwise, you would get that taking the mean/variance does not commute with the convert operation going between different color spaces. Presumably ignoring the curvature of the RGB color space is why one gets phenomena like orange being the average color of every picture on the Internet.

@timholy
Member

timholy commented Sep 22, 2014

Good points, @jiahao. I hadn't really even thought of defining all these operations only for XYZ colorspace, but technically you're right. OTOH, I really worry that people will be annoyed if taking the mean over an RGB image is slow because it requires two matrix multiplications per pixel.

Not really sure what to do here.

@jiahao
Member

jiahao commented Sep 22, 2014

Perhaps you can define some tolerance in terms of color differences within which you wouldn't bother to do the transformations back and forth into XYZ. If all you're doing is averaging different colors of (say) bright orange, then for sufficiently similar colors the distance between each sample is going to be small enough that the curvature of the space is not going to change the answer much.

Presumably averaging over many similar images is going to be the more common use case.

@jiahao
Member

jiahao commented Sep 22, 2014

Having said that, I suspect that the error in ignoring curvature could be systematically biased. Might be worth testing.

@timholy
Member

timholy commented Sep 23, 2014

This is beginning to sound a little like another "vectorized functions are evil" (meaning, it's not really clear what the user wants here), and I should just encourage use of mapreduce.

@MarkusQ

MarkusQ commented Sep 26, 2014

From a purist perspective, I suspect it's even worse than @jiahao's take, since human colour perception is decidedly non-linear.

That said, it's not that uncommon for people in a wide range of fields to want to compute statistics on color values:

http://iopscience.iop.org/1538-4357/615/2/L101
http://www.sciencedirect.com/science/article/pii/010956419190038Z
http://ieeexplore.ieee.org/xpls/abs_all.jsp/arnumber/1613079
etc.

@timholy
Member

timholy commented Sep 26, 2014

Yeah, I've decided I'm going to ignore the purists' perspective on this one (aside from a likely note in the documentation).

@rennis250
Contributor

Just to chime in, CIE XYZ is not the only non-curved colour space. In fact, there are quite a few (for a compendium of colour spaces, see "Color Ordered" by Rolf Kuehni). Anyway, that really doesn't change the issue here.

Actually, I'm curious @jiahao, where have you used RGB as a curved colour space? The operation to go from XYZ to RGB is a linear transformation. The curvature you are seeing in it is a by-product of the (admittedly, cheap) representations of RGB space, where every point is coloured in, and your perception imputes a curvature (the coloured representation is inaccurate because it doesn't control for simultaneous contrast effects, nor does it properly represent the influence of adaptation state). If you represent the XYZ space in the same way, you see the same phenomena. The main curved colour space in use is the sRGB space, probably followed by the LAB and LUV colour spaces. (EDIT: My previous statement about the main colour spaces in use was incorrect, as discussed below.) The question is how does changing the tristimulus values correspond to a change in the physical stimulus you are working with; in other words, is the relationship between the space and the stimulus dimensions linear or non-linear. The relationship between the stimulus and the resulting perception is another question. (Also, I'm rather suspicious of that Atlantic article. It would be nice to take a look through his image set for any biases.)

With respect to the variance of colours, in our lab, we have either gone with the spherical (or circular) standard deviation of the hue angles or the standard deviation of a colour distribution after it has been projected onto a given colour axis (all of this is done in a linear colour space). The reason for choosing one over the other has been dependent on the application, so I agree with everyone else that it's best to leave the choice to the user, for colours at least, but providing some recommendations to users would probably be helpful, since it depends on what level of encoding in the visual system you care about or if you even care about the visual system at all (c.f., the galaxy colour distribution paper that @MarkusQ linked to).

To answer the question of @StefanKarpinski, there is no inner product defined for raw tristimulus values, since basic colour matching spaces fall in the class of affine vector spaces. There have been efforts to find transformations of these spaces to provide them with a structure that allows a sensible metric to be applied (two of these are the LAB and LUV spaces), but unfortunately, they don't achieve that goal.

Best,
Rob
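The circular standard deviation of hue angles mentioned in this comment can be sketched as below. It assumes one common definition, sqrt(-2 ln R) with R the mean resultant length, which may not be the exact variant used in the commenter's lab. `circ_std_deg` and the hue values are hypothetical.

```julia
# Hypothetical helper: circular standard deviation of angles given in degrees.
function circ_std_deg(hues)
    # Map each angle to a point on the unit circle and average those points.
    resultant = sum(cis(deg2rad(h)) for h in hues) / length(hues)
    # Mean resultant length; clamped so rounding cannot push it above 1.
    R = min(abs(resultant), 1.0)
    return rad2deg(sqrt(-2 * log(R)))
end

# Reddish hues that straddle the 0/360 degree wrap-around.
hues = [350.0, 355.0, 0.0, 5.0, 10.0]
circ_std_deg(hues)
```

An ordinary standard deviation of the same numbers would treat 350 and 10 as far apart, whereas the circular version treats them as 20 degrees apart and yields a spread of a few degrees.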

@timholy
Member

timholy commented Sep 29, 2014

I suspect RGB/sRGB confusion is at play here.

@rennis250
Contributor

Ah, right, sorry about that @jiahao. Ignore the majority of that paragraph then. Thanks, @timholy.

@timholy
Member

timholy commented Sep 29, 2014

Well, actually, your comment was very informative and helped clarify things, so thanks.

@jiahao
Member

jiahao commented Sep 29, 2014

Fair enough, I am not an expert in color theory, so I'm perfectly willing to admit the existence of other flat color spaces. I didn't know the definition of sRGB offhand, and it does look like a flat color space.

Is LCHuv flat? I don't think so, and we have definitions that assume its flatness.

@rennis250
Contributor

Nema problema. Hope the post came off as more over-caffeinated than anything else. :P No hard feelings meant; I just get too excited about this stuff sometimes!

But, I think the sRGB is non-linear and not flat. Base RGB assumes that your monitor has primaries that have a linear input (voltage) to output (luminance of primary) relationship (if you're working with a CRT for example, but the assumption can be generalised). However, this is not true for any monitor that I know of, so the sRGB space provides an additional non-linear encoding that saves the base RGB values in a gamma-corrected format, allowing you to send those directly to the monitor to get a linearised image (which is hopefully as accurate a reproduction of the original, imaged scene as possible). As far as I understand, the space was developed with cameras in mind, which ideally have an inverse gamma relationship to most monitors (gamma of ideal CRT = 2.2, gamma of ideal camera = 1/2.2), allowing one to use an arbitrary camera to take an image, send it to your buddy on the 'net, and have it reproduced as accurately as possible on his arbitrary monitor. My presentation here is rough however, and doesn't account for some quirks of the space at low light levels, where LAB and LUV also have similar quirks. Plus, there are perceptual benefits to this encoding scheme. There may also be other factors that influenced the gamma choices (e.g., maybe it's just cheaper to produce electronics with these response characteristics), but I never bothered to look into that. Anyway, this goes against my statement that LAB and LUV are the two main non-linear spaces. By far, sRGB is the main one in use. LAB and LUV probably follow second, at least based on my experiences in colour research.

I would be very happy for an industrial colour person to correct me here though, since all of my work involves first undoing all of the automatic corrections that sRGB, monitors, and colour management systems perform, so I've never become too intimate with them. :P Doing some additional reading now and I will report back if I've completely botched my understanding and explanation here.

If I understand correctly, the LCHuv space is just a cylindrical parametrisation of the underlying LUV space? If so, then yes, I would say that it is not flat.

I really need to start getting my hands dirty in the Color.jl package. There's plenty of fun to be had. :)
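The non-linear sRGB encoding described in this comment is one concrete case where taking a mean does not commute with converting between spaces. A minimal sketch, assuming the standard piecewise sRGB transfer function (the comment only quotes the approximate gamma of 2.2) and hypothetical helper names, for a single channel value in [0, 1]:

```julia
# Hypothetical helpers: sRGB decoding (to linear light) and encoding.
srgb_to_linear(c) = c <= 0.04045 ? c / 12.92 : ((c + 0.055) / 1.055)^2.4
linear_to_srgb(c) = c <= 0.0031308 ? 12.92 * c : 1.055 * c^(1 / 2.4) - 0.055

pixels = [0.0, 1.0]   # one black and one white pixel, gamma-encoded

# Average the encoded values directly.
mean_encoded = sum(pixels) / length(pixels)

# Decode to linear light, average there, then re-encode.
mean_linear = linear_to_srgb(sum(srgb_to_linear.(pixels)) / length(pixels))

mean_encoded, mean_linear   # 0.5 versus roughly 0.735
```

Which of the two averages is "right" depends on whether the application cares about physical light or about the encoded values, which is the choice the thread suggests leaving to the user.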

@StefanKarpinski
Member

@rennis250, it would be great to have some more expert input into the package. It seems like there's no obviously correct way to do linear operations on colors. Given that, one option is just not to define such operations. However, I suspect that's going to prove quite annoying. Instead we can just define the linear operations in the naive obvious way but make more sophisticated mechanisms available as well.

@jiahao
Member

jiahao commented Sep 29, 2014

Ok, let's not hijack this thread further for color discussions. We can do it in JuliaAttic/Color.jl#64

@ihnorton added the "needs decision" label and removed the "help wanted" label on Dec 13, 2014
@stevengj
Member

In #4039, I fixed var for complex matrices and I added a test case, but it looks like it got broken again when the version for arbitrary iterables was added (since that case was not tested)?

@jiahao, I don't understand why you think the Wikipedia article is wrong when it defines the covariance matrix of a complex vector (and hence the variance of a complex scalar). That definition is totally standard as far as I can tell, and is the only reasonable definition if you want to do the usual algebra things (SVD etc) with the covariance matrix. Julia should follow it.

(More generally, similar to what @johnmyleswhite wrote above, it seems like the variance could be defined for arbitrary Banach spaces as ⟨|v - ⟨v⟩|²⟩, where |...| is a norm and ⟨...⟩ is expectation. For real and complex numbers over the complex field, there is only one norm, up to an overall scale factor, so the variance is uniquely defined. For other vector fields, there are many choices of norms, but of course we could allow a norm to be passed as a parameter. I've never heard of anyone defining a scalar variance that did not correspond to a norm², have you? I would think that matrix-valued generalizations, i.e. the covariance matrix and friends, should be a different function than var.)
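A minimal sketch of the norm-based definition with the norm passed as a parameter. `normvar` and its keyword `normfn` are hypothetical names, not an existing Base or Statistics API, and the n - 1 divisor is an assumption.

```julia
using LinearAlgebra

# Hypothetical helper: sample variance as the mean squared norm of the
# deviations from the mean, with a caller-supplied norm.
function normvar(vs; normfn = norm)
    n = length(vs)
    μ = sum(vs) / n
    return sum(normfn(v - μ)^2 for v in vs) / (n - 1)
end

reals = [1.0, 2.0, 3.0]
cplx  = ComplexF64[1 + 1im, 1 - 1im, -1 + 1im, -1 - 1im]
vecs  = [[1.0, 2.0], [3.0, 2.0], [2.0, 5.0]]

normvar(reals)                              # norm of a real scalar is abs
normvar(cplx)                               # norm of a complex scalar is the modulus
normvar(vecs)                               # Euclidean norm by default
normvar(vecs; normfn = v -> norm(v, 1))     # a different norm gives a different answer
```

The result is always a real scalar, which keeps this separate from matrix-valued generalizations such as the covariance.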

@jakebolewski added the "bug" label on May 29, 2015
timholy referenced this issue in JuliaLang/METADATA.jl Aug 9, 2015
@oscardssmith
Member

Do the recent changes to var and std make this not an issue?

@fredrikekre
Member

julia> std(Any[1,2,3])
1.0

julia> var(Any[1,2,3])
1.0
