Invariant sizes #85

ivirshup · 2020-02-19T02:46:12Z

Trying to enforce length.(dims(da)) == size(da) == size(data(da)) following up from #83.

This is pretty messy at the moment, but it kept bugging me throughout the day. There are a few cases that need thinking about.

How "unevaluated" dimensions are used. That is, does it make sense to have an instance of X()?
I'm not sure how TransformedGrids are supposed to work. Why are the dims of an array on this grid the transformer, and not the original dimensions with the transform applied?
formatdims could be cleaner
Whyare cov and cor defined the way they are?

codecov-io · 2020-02-19T06:18:41Z

Codecov Report

❗ No coverage uploaded for pull request base (master@4e4a69d). Click here to learn what that means.
The diff coverage is 82.35%.

@@          Coverage Diff           @@
##             master   #85   +/-   ##
======================================
  Coverage          ?   84%           
======================================
  Files             ?    12           
  Lines             ?   650           
  Branches          ?     0           
======================================
  Hits              ?   546           
  Misses            ?   104           
  Partials          ?     0

Impacted Files	Coverage Δ
src/methods.jl	`84.69% <100%> (ø)`
src/primitives.jl	`91.33% <75%> (ø)`
src/array.jl	`66.66% <80%> (ø)`

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4e4a69d...9483a43. Read the comment docs.

rafaqz · 2020-02-19T12:56:57Z

Sorry I wont be able to review this for a couple days.

src/methods.jl

ivirshup · 2020-02-22T07:24:49Z

Not sure if you've started looking at this yet, but do you have a preference on dealing with conflicts? E.g. should I rebase or merge?

rafaqz · 2020-02-22T23:25:37Z

rebase when its easy but I dont mind really.

maybe I should do this merge, but not much time currently so have a go.

This could be done more cleanly. I'm thinking it should probably only do this if the dimension type is passed, not Dim()

This reverts commit c054e5f.

* Now can do `DimensionalArray(a, (X(), Y))` and similar * Made the code paths for different types clearer * Reduced redundant code * Errors for too many and too few dimensions should be similar now

ivirshup · 2020-02-23T01:15:16Z

No worries, it was a pretty easy rebase.

ivirshup · 2020-02-23T03:32:09Z

Oh, and my point about the test on the TransformedGrid. My understanding of TransformedGrid is that it's supposed to allow regridding operations like xESMF.

It doesn't make sense to me that the value of the dimension would be the transformation. It seems more like a property of the dimension's grid. Right now the test fails since the value of the dimension doesn't have length.

Previously this would throw an error due to an intermediate subset DimensionalArray being created with invalid dimensions.

rafaqz · 2020-02-23T05:14:48Z

Sure the transormation can be in a grid field. The choice was pretty arbitrary.

rafaqz · 2020-02-23T09:32:31Z

My understanding of TransformedGrid is that it's supposed to allow regridding operations like xESMF.

Maybe eventually. Currently it's just to be able to index points using a transformed coordinate system, and to be able to actually represent gdal etc files than use a coordinate transform.

It's also not at all my area of expertise, I don't actually use these grids. But we need to be able to represent them somehow and I wanted to be sure that could happen with the existing architecture - and it can. But it's not a complete implementation.

rafaqz · 2020-02-24T04:33:20Z

Also realised I never answered these (too busy...):

How "unevaluated" dimensions are used. That is, does it make sense to have an instance of X()?

The instance of X() is for indexing the x dim with Colon(). Or for passing to mean etc as a dimension.

I'm not sure how TransformedGrids are supposed to work. Why are the dims of an array on this grid the transformer, and not the original dimensions with the transform applied?

Because then you have to compute every transform for every dim combination, and its at minimum 2d, so that also means allocating a matrix.

formatdims could be cleaner

Absolutely. I have a rework of it in progress.

Why are cov and cor defined the way they are?

I have no idea, but also not sure exactly what you mean

ivirshup · 2020-02-24T06:14:38Z

Does anyone currently use the grids? I'm not sure I'm the person to redefine them, since I definitely don't use them. Would it be alright to leave the test disabled and open an issue for fixing this?

About what the dim values for transformed grid are, I was thinking something like:

DimArray A with dims θ, r is on a transformed grid that maps (X, Y) -> (θ, r). So if you access A with θ(Between(τ//4, τ//2)), r(Near(3.)), no transformation occurs. But if indexed with: X(Near(5)), Y(Near(10)) the transformation would be applied to the coordinates being searched for before the lookup happens.

The instance of X() is for indexing the x dim with Colon(). Or for passing to mean etc as a dimension.

I think what I mean by this, is should X() and X mean different things? I've gone ahead and made them equivalent in the cases I've come across, since that fits the tests, but I think it's worth considering.

format dims

Oh, I ended up doing a bit of cleanup here while fixing a bug...

Why are cov and cor defined the way they are?

Basically, if you did cov(DimArray(ones(5, 4), (X, Y)), dims=X) you get an array of size (4, 4) but with two X dimensions of length five. I've changed this, but there's a comment in the tests (tests/methods.jl) that says:

    # Need to think about dims for these, currently (Y, Y) etc.
    # But you can't index (Y, Y) with dims as you get the
    # first Y both times. It will plot correctly at least.

And I wasn't quite sure what this meant.

rafaqz · 2020-02-24T07:57:47Z

No-one uses the TransformedGrid as far as I know, but I use the others every day in GeoData.jl - they are what lets it abstract the array access over different base data types, with different grid orders, projections and range/array indices etc. It's the basis of a lot of my other modelling packages and the use case I wrote DD for.

I think the transform is from some coordinates to the actual index integers, so you would only use Between with the untransformed values A[X(Between(35.9, 47.4)), Y(Near(145))]. Otherwise you can just use a unit range A[θ(1:3), r(7)].

We need both X and X() because X() is more reliably type stable than X. But you should be able to use them interchangeably everywhere. I think the X syntax feels cleaner.

cov/cor just sounds like an error from rushing the implementation. That comment is from before I fixed #46, and can be deleted now.

ivirshup · 2020-02-24T09:00:22Z

No-one uses the TransformedGrid as far as I know, but I use the others every day in GeoData.jl

Oh, I definitely just meant TransformedGrid. So, do you mind if I leave the tests for TransformedGrid commented out with a TODO?

We need both X and X() because X() is more reliably type stable than X.

Yeah, the typeof((X, Y)) === Tuple{UnionAll, UnionAll} had me frustrated for a bit.

I think the X syntax feels cleaner.

Me too, that's what I've been going for. I've got another commit that makes mean(da, dims=(X, Y)) work (for a few other reducers as well). Could I add that here, or should it be another PR?

Apart from deciding on the multidimensional reduction stuff, does this need any changes?

rafaqz · 2020-02-24T11:16:33Z

I guess we can leave TransformedGrid tests out, it feels a bit dirty but at least we'll know if anyone else is using it...

Maybe I'll merge this and you can make a new PR for the reduce methods, we should try to keep these focussed.

Also are there actual tests for invariant size?

test/array.jl

ivirshup · 2020-02-25T12:53:05Z

Travis is just failing due to build errors for the plotting tests. The relevant tests are passing.

rafaqz · 2020-02-26T08:12:50Z

Ok this is a Plots/FFMPEG_jll problem on windows builds. We might need to disable windows testing temporarily to get this passing.

Actually we should put a conditional block around plot tests to avoid windows.

if !Sys.iswindows() 
    include("plotrecipes.jl")
end

ivirshup force-pushed the invariant-sizes branch from 30ec255 to 1ddce11 Compare February 19, 2020 02:47

ivirshup changed the title ~~[WIP] Invariant sizes~~ Invariant sizes Feb 19, 2020

ivirshup requested a review from rafaqz February 19, 2020 12:53

ivirshup commented Feb 19, 2020

View reviewed changes

src/methods.jl Show resolved Hide resolved

ivirshup mentioned this pull request Feb 19, 2020

Documentation of invariants #83

Closed

ivirshup added 9 commits February 23, 2020 11:45

Start enforcing dim sizes

f80295d

Initial (maybe temporary) test modifications

65223c7

More readable error message from checklen

f0ad9d8

Revert Dim definition changes

c485dfa

Add case for automatically creating dimension of the right size

85ca564

This could be done more cleanly. I'm thinking it should probably only do this if the dimension type is passed, not Dim()

Get tests to pass

1a5644a

Revert "Initial (maybe temporary) test modifications"

ea13434

This reverts commit c054e5f.

Clean up formatdims

cedb0e6

* Now can do `DimensionalArray(a, (X(), Y))` and similar * Made the code paths for different types clearer * Reduced redundant code * Errors for too many and too few dimensions should be similar now

Clean up cor/ cov wrapper

ab96b1d

ivirshup force-pushed the invariant-sizes branch from 269bcf0 to ab96b1d Compare February 23, 2020 01:00

Bugfix for indexing into result of dropdims

4cb3bbd

Previously this would throw an error due to an intermediate subset DimensionalArray being created with invalid dimensions.

Remove old comment

fbc8c3d

similar should still give the right size and eltype

f93c390

Add tests for constructor, and improve error message

f1e0629

ivirshup commented Feb 24, 2020

View reviewed changes

test/array.jl Outdated Show resolved Hide resolved

Similar returns wrapped array if the size changes.

9483a43

rafaqz merged commit bcef609 into rafaqz:master Feb 26, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Invariant sizes #85

Invariant sizes #85

ivirshup commented Feb 19, 2020 •

edited

Loading

codecov-io commented Feb 19, 2020 •

edited

Loading

rafaqz commented Feb 19, 2020

ivirshup commented Feb 22, 2020

rafaqz commented Feb 22, 2020 •

edited

Loading

ivirshup commented Feb 23, 2020

ivirshup commented Feb 23, 2020

rafaqz commented Feb 23, 2020 •

edited

Loading

rafaqz commented Feb 23, 2020

rafaqz commented Feb 24, 2020 •

edited

Loading

ivirshup commented Feb 24, 2020

rafaqz commented Feb 24, 2020 •

edited

Loading

ivirshup commented Feb 24, 2020

rafaqz commented Feb 24, 2020

ivirshup commented Feb 25, 2020

rafaqz commented Feb 26, 2020 •

edited

Loading

Invariant sizes #85

Invariant sizes #85

Conversation

ivirshup commented Feb 19, 2020 • edited Loading

codecov-io commented Feb 19, 2020 • edited Loading

Codecov Report

rafaqz commented Feb 19, 2020

ivirshup commented Feb 22, 2020

rafaqz commented Feb 22, 2020 • edited Loading

ivirshup commented Feb 23, 2020

ivirshup commented Feb 23, 2020

rafaqz commented Feb 23, 2020 • edited Loading

rafaqz commented Feb 23, 2020

rafaqz commented Feb 24, 2020 • edited Loading

ivirshup commented Feb 24, 2020

rafaqz commented Feb 24, 2020 • edited Loading

ivirshup commented Feb 24, 2020

rafaqz commented Feb 24, 2020

ivirshup commented Feb 25, 2020

rafaqz commented Feb 26, 2020 • edited Loading

ivirshup commented Feb 19, 2020 •

edited

Loading

codecov-io commented Feb 19, 2020 •

edited

Loading

rafaqz commented Feb 22, 2020 •

edited

Loading

rafaqz commented Feb 23, 2020 •

edited

Loading

rafaqz commented Feb 24, 2020 •

edited

Loading

rafaqz commented Feb 24, 2020 •

edited

Loading

rafaqz commented Feb 26, 2020 •

edited

Loading