length and iteration? #39

Closed
JeffBezanson opened this issue Oct 3, 2017 · 6 comments


@JeffBezanson

I think this package is going very well and I'm on board with most of it (e.g. the behavior of comparisons and logical operators), but I find the definitions of length, iteration, and the array interface methods pretty sketchy. Those would be OK if null always represented a missing number, but if it's going to represent a missing string (or something else) then those will give misleading results. Are we 100% sure we need those methods? The approach I favor is to add no methods to Null until it becomes extremely clear that they're needed to avoid major pain. Are there examples where that threshold has been reached for e.g. length?

@quinnj
Member

quinnj commented Oct 3, 2017

I think length and iteration were carry-overs from porting NAtype, so I'm not too familiar with their use case (my approach in porting was actually as you described: only port if it was needed somewhere). I think @ararslan mentioned at one point that null should behave like a Number in those cases, but I don't remember if there was an explicit reason or not. What are the array interface methods you mentioned?

One way to test some of these out would be to take @nalimilan's branch that ports DataArrays to Nulls and see which tests fail: JuliaStats/DataArrays.jl#288

@nalimilan
Member

Let's remove iteration on Number in Base itself? :-)

More seriously, I agree we should try removing these methods and only reintroduce them if we realize they are really needed. Testing DataArrays first is a good idea, I'll do that after updating it to take into account recent changes in Nulls.jl.
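For context on the remark about Number: in Julia's Base, a number answers the array-style queries and iterates as if it were a zero-dimensional container holding itself. The short sketch below is an illustration added for readers, not code from the thread; the variable names are arbitrary.

```julia
# A plain number behaves like a zero-dimensional container in Base.
x = 5

n = length(x)      # 1
s = size(x)        # ()
d = ndims(x)       # 0
first_elem = x[1]  # indexing with 1 returns the number itself

# Iterating a number visits exactly one element: the number.
visited = Int[]
for v in x
    push!(visited, v)
end
```

This is the behaviour the discussed `null` methods were mirroring, which is why dropping them from `null` raises the half-joking question of dropping them from Number too.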

@nalimilan
Member

Just tried it, there are only a few lines to change. Three of them explicitly tested the removed methods, so that's expected. Two others are of the form:

dvstr = @data ["one", "two", null, "four"]
all([length(x)::Int for x in dvstr] == [3, 3, 1, 4])

I think they also qualify as non-use cases.

So overall I think we should remove them. We could also imagine providing length, size and ndims, but returning null. It's probably better to remove them first, and add the versions returning null only if they turn out to be useful.
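The "return null" alternative mentioned above could look roughly like the following. This is a minimal sketch under stated assumptions: `PlaceholderNull` and `placeholder_null` are hypothetical stand-ins defined here for illustration, not the actual type or value from Nulls.jl, and nothing in the thread confirms that such methods were ever added.

```julia
# Hypothetical stand-in for a null type (NOT the Nulls.jl definition).
struct PlaceholderNull end
const placeholder_null = PlaceholderNull()

# Propagating variants: asking for the shape of a missing value
# yields a missing answer instead of pretending it is a scalar.
Base.length(::PlaceholderNull) = placeholder_null
Base.size(::PlaceholderNull) = placeholder_null
Base.ndims(::PlaceholderNull) = placeholder_null

words = ["one", "two", placeholder_null, "four"]
lengths = [length(w) for w in words]
# The third entry is placeholder_null rather than a misleading 1.
```

With this design the length of a missing string is itself missing, at the cost that `length` no longer always returns an integer.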

@JeffBezanson
Author

For clarity, the full set of methods I'm talking about is: length, size, ndims, getindex, start, next, done
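To make the concern concrete, here is a sketch of what Number-like definitions of those methods amount to. It assumes a hypothetical stand-in type, `ScalarLikeNull`, rather than the real Nulls.jl source, which is not shown in this thread. It also assumes Julia 1.0 or later, where the `start`/`next`/`done` protocol named above was replaced by `iterate`, so the sketch uses `iterate` instead.

```julia
# Hypothetical stand-in for a null type (NOT the Nulls.jl definition).
struct ScalarLikeNull end
const scalar_null = ScalarLikeNull()

# Number-like behaviour: act as a zero-dimensional container of itself.
Base.length(::ScalarLikeNull) = 1
Base.size(::ScalarLikeNull) = ()
Base.ndims(::ScalarLikeNull) = 0
Base.getindex(x::ScalarLikeNull, i::Integer) =
    i == 1 ? x : throw(BoundsError(x, i))

# Modern replacement for start/next/done: yield the value once, then stop.
Base.iterate(x::ScalarLikeNull) = (x, nothing)
Base.iterate(::ScalarLikeNull, ::Nothing) = nothing

# A missing string silently reports a length of 1.
words = ["one", "two", scalar_null, "four"]
lengths = [length(w) for w in words]
```

Here `lengths` comes out as `[3, 3, 1, 4]`: the `1` says nothing true about the missing string, which is the misleading result the issue is about.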

@ararslan
Member

ararslan commented Oct 3, 2017

> I think @ararslan mentioned at one point that null should behave like a Number in those cases

I did? o_O

I think that null should be able to represent missing data of any kind, as it does in SQL, but I don't think it should silently behave like a number in all cases.

@nalimilan
Member

See #40.

@quinnj quinnj closed this as completed in #40 Oct 3, 2017