Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Future-proofing 0.3? #7774

Closed
timholy opened this issue Jul 29, 2014 · 25 comments
Closed

Future-proofing 0.3? #7774

timholy opened this issue Jul 29, 2014 · 25 comments
Labels
needs decision A decision on this change is needed
Milestone

Comments

@timholy
Copy link
Member

timholy commented Jul 29, 2014

With 0.3 around the corner, it belatedly occurred to me to wonder whether there's anything we should do to help users write code that won't break with some of the likely changes in 0.4. I'm not worried about error messages, but I am concerned about changes in behavior that result in bugs. On balance I'm not convinced that we should do anything, but I thought I'd share the thought and ask if anyone else can think of compelling examples.

Possible breaking changes:

  • b = a[2:n-1] returns a view rather than a copy: will break code that assumes b can be modified without affecting a. Could be resolved by introducing into 0.3 something like b = getcopy(a, 2:n-1). In 0.3 getcopy would just be an alias for getindex, but that could change to copy(getindex(a, 2:n-1)) once the change is made.
  • Changes in promotion behavior, like Uint8+Uint8->Uint8 rather than the current Uint. This will cause overflows particularly in code that accumulates over multiple elements of an array. However, we already have widen, so I doubt this requires any changes to julia. EDIT: one way to test this hypothesis is to introduce widen in reduce.jl, reducedim.jl and see what breaks.
  • drop dimensions indexed with a scalar? #5949. Since we don't yet know what we'll do there (if anything), I don't think there's anything that can be done.
@timholy timholy added this to the 0.3 milestone Jul 29, 2014
@catawbasam
Copy link
Contributor

Thank you for raising this.
You have already helped raise (my) awareness by pointing out some of the key changes.
The 0.4 milestone labels in the issues are also somewhat helpful.

Are there other documents available regarding future changes we should be aware of?

@lindahua
Copy link
Contributor

Thanks for raising this.

Changing getindex to return a view is a real concern. One case I just encountered yesterday is:

centers = X[:, iseeds]
... then update centers as using K-means ...

What I am now doing as a safe guard is to do copy(view(X, :, iseeds)).

@lindahua
Copy link
Contributor

My concern is that adding getcopy won't help a lot.

A lot of such codes have been buried deep in current codebase and packages, people will virtually have to examine line by line to spot them. Also, I am not sure how many people will pay attention to getcopy and functions like these that are added right before release.

I would say just change the behavior (if we decide to do so), announce loudly & broadly to notify people, and give them a period to fix all breakages (before we release 0.4).

That being said, getcopy is still very useful, when we want to make sure that it returns a copy but not making copies twice:

x[:, i]  # make a copy in 0.3, return a view in 0.4
copy(x[:,i]) # make copies twice in 0.3, make the copy once in 0.4
getcopy(x, :, i) # always make exactly one copy (in both versions)

@tkelman
Copy link
Contributor

tkelman commented Jul 29, 2014

How different will slices-as-views in 0.4 end up relative to what's available now in the ArrayViews package? If @lindahua's approach with copy(view(x, :, i)) is as fast and functionally identical to both x[:, i] in 0.3 and copy(x[:, i]) in 0.4, maybe we just encourage more packages to start using ArrayViews?

@lindahua
Copy link
Contributor

I have a PR a few months ago to incorporate view in Julia Base. The decision was that this should be done together with changing the behavior of getindex during 0.4 cycle. I think in 0.4 cycle, the new behavior of getindex will be based on view.

ArrayViews have been used in several important packages for months, and I think performance and reliability (in typical cases) has been tested.

@tkelman
Copy link
Contributor

tkelman commented Jul 29, 2014

That's basically my point. Is the package solution good enough for that bullet item then?

(Oh and continued thanks to you, @timholy, and everyone else for writing awesome fast code that we can all make use of.)

@timholy
Copy link
Member Author

timholy commented Jul 29, 2014

@lindahua, I agree that it won't fix the problem. At best it gives conscientious & aware package developers some mechanism to avoid code-breakages at an inconvenient time, and to avoid having to ask all their users to upgrade their packages if they have also upgraded julia. But it may not be worth introducing the extra name, unless we plan to keep it long-term.

@tkelman, I think the problem is that once a[ind] returns view(a, ind) (most of the time; a[[2,3,5]] will still return a copy in 0.4), some people will have started using views who formerly were not. So many people who may not have caught these bugs when developing the code will get bitten by them.

@tkelman
Copy link
Contributor

tkelman commented Jul 29, 2014

Yes, the change to views for a[ind] has major potential for breakage. Needs loud warnings and lots of testing. I don't see how introducing a new name for performance portability helps with that, when a package-based solution exists now with comparable migration effort in terms of manually modifying syntax.

@JeffBezanson
Copy link
Member

Yesterday I had the idea --- unfortunately I didn't post it at that point --- that the next person to add something to the 0.3 milestone should have to buy everybody dinner.

@ivarne
Copy link
Member

ivarne commented Jul 29, 2014

👍

@timholy
Copy link
Member Author

timholy commented Jul 29, 2014

Indian or Mexican?

@jiahao
Copy link
Member

jiahao commented Jul 29, 2014

craigieonmain.com

@lindahua
Copy link
Contributor

@timholy While ArrayViews hasn't supported this, I think it is reasonable to have a[[2, 3, 5]] to return a view, an entity that would maintain a reference to the index vector.

@JeffBezanson
Copy link
Member

+1 for Jiahao's suggestion.

@timholy
Copy link
Member Author

timholy commented Jul 29, 2014

@tkelman, my suggestion applies for people who are not even using the ArrayViews package. Silly example:

a = rand(5)
b = a[2:4]
if b[2] < 1
    b[2] = 1
end
if sum(b) > sum(a[2:4])
    error("Whoa, your inputs messed up. Please use valid inputs")
end

This code will trigger an error as intended on 0.3 but silently proceed on 0.4.

@timholy
Copy link
Member Author

timholy commented Jul 29, 2014

Boring. How about this one.

@johnmyleswhite
Copy link
Member

FWIW, I think the best solution isn't to modify 0.3, but to make this change in the first few weeks after 0.3 is launched. That gives people a long period of time to get things to start working with views.

@timholy
Copy link
Member Author

timholy commented Jul 29, 2014

Not a bad idea, especially if we don't really want getcopy hanging around long-term. Supporting both 0.3 and 0.4 with the same package may force each package author to define getcopy for 0.3, but that's not horrible.

@timholy
Copy link
Member Author

timholy commented Jul 29, 2014

I'm actually a little more concerned about the breakage from the second bullet item, but I suspect we're largely OK with widen. I imagine that functions that perform reductions should, in the future be written something like this:

function mysum{T}(A::AbstractArray{T})
    Taccum = typeof(one(widen(T)) + one(T))
    s = zero(Taccum)
    for a in A
        s += a
    end
    a
end

There's one problem, however: currently

julia> widen(Array{Int,2})
ERROR: `widen` has no method matching widen(::Type{Array{Int64,2}})

So for writing generic code, maybe we should add a fallback widen(T::Type) = T.

@StefanKarpinski
Copy link
Member

I suspect (especially since my %if proposal was reviled by every single person in the history of the universe) that supporting both 0.3 and 0.4 in the same code base is going to be a lost cause, so I'm against bothering with this since I suspect there's going to have to be a versioning split for almost every significant package. If you look at the 0.3 release notes, it was really a very conservative release. I don't think that 0.4 is going to be nearly so conservative.

@timholy
Copy link
Member Author

timholy commented Jul 29, 2014

OK, I'll close this. Sorry about the dinner.

@timholy timholy closed this as completed Jul 29, 2014
@quinnj
Copy link
Member

quinnj commented Jul 29, 2014

title

@IainNZ
Copy link
Member

IainNZ commented Jul 29, 2014

My concern is that Julia was usable pretty much from the start for real work, so naturally people started using it. Unfortunately that kinda makes it feel bad to break peoples code, but I worry that'll limit what Julia 1.0 can be. I say: break everything, don't hold back. Maybe even do what Linux used to do with alternating releases?

0.3 is going to be pretty pleasant to use for quite a while (although if precompiled packages become a thing... it'll be hard to stick with 0.3 :D).

@StefanKarpinski
Copy link
Member

Since 0.3 is so stable and a number of things are likely to break in 0.4, it may be worth the additional effort to do more maintenance of the 0.3 release branch than we have previously.

@JeffBezanson
Copy link
Member

Yes, I think so. I also think another effect of the long 0.3 release cycle has been to provide more time to think about what might go into 0.4, so I am all charged up to do lots of big changes. This makes me see the value of alternating stable/less-stable releases.

vtjnash referenced this issue Aug 8, 2014
…mat.

ref #7893

- remove unnecessary extra serialization of length(x.names)
- UTF16String and UTF32String now have compact type tags
- don't serialize () for types with no parameters
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs decision A decision on this change is needed
Projects
None yet
Development

No branches or pull requests