Determine the size of field descriptors using a estimated type size #11888

yuyichao · 2015-06-27T03:11:25Z

This fixes #11884 .

Passing too many arguments to a function or constructing a gigabyte-sized structure/tuple is arguably not a good idea. The primary goal of this PR is not to encourage that, but I do feel that the current limit, a 16-bit integer, is a little too small.
This does not solve the problem in the long tuple problems issue #11320 (hopefully GitHub's close-issue hook does not pick this one up), so passing big tuples around or constructing them might still be an issue. On the other hand, this change should be almost orthogonal to that issue and won't make the situation much worse.
With this PR the limit becomes 2^31 - 1, mainly because DataType.size is an Int32. According to @jiahao , that type was chosen to keep unsigned integer types from polluting calculations. It might not be a problem anymore, but I think this limit should be good enough for now and I don't want to risk introducing subtle type stability issues.
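
To make the two limits concrete, here is a minimal C sketch of the range check involved. It is an illustration under stated assumptions, not the runtime's code: the layouts `fielddesc16_t` and `fielddesc32_t` and the helpers `max_unsigned` and `field_fits` are hypothetical names, and the real `jl_fielddesc_t` may pack its bits differently.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical descriptor layouts, for illustration only; the real
 * jl_fielddesc_t in the Julia runtime may be laid out differently. */
typedef struct { uint16_t size; uint16_t offset; } fielddesc16_t;
typedef struct { uint32_t size; uint32_t offset; } fielddesc32_t;

/* Largest value representable in an unsigned field of `bits` bits. */
static uint64_t max_unsigned(unsigned bits)
{
    assert(bits >= 1 && bits <= 32);
    return (UINT64_C(1) << bits) - 1;
}

/* Returns 1 if a field of `size` bytes at byte `offset` can be described
 * with `bits`-bit size/offset entries, and the object it ends still fits
 * in a signed 32-bit total size (the Int32 DataType.size limit). */
static int field_fits(uint64_t offset, uint64_t size, unsigned bits)
{
    uint64_t limit = max_unsigned(bits);
    return size <= limit && offset <= limit &&
           offset + size <= (uint64_t)INT32_MAX;
}
```

For example, the last element of a tuple of 10,000 8-byte integers sits at byte offset 79,992, so `field_fits(79992, 8, 16)` fails while `field_fits(79992, 8, 32)` succeeds.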

carnaval · 2015-06-28T20:14:20Z

I don't remember if you started doing it this way yesterday, but IMO it would make more sense to switch between this 32-bit representation and an 8-bit representation based on a bit flag in the object header. The 8-bit form will cover "most" cases anyway, so we might at least take advantage of the change to save a little space here.
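
A rough C sketch of what such a flag-selected representation could look like follows. All names (`layout_t`, `wide`, `field_offset`, `descs_bytes`) are hypothetical and chosen for illustration; this is not the layout used by the runtime.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Two hypothetical descriptor widths. */
typedef struct { uint8_t size; uint8_t offset; } fielddesc8_t;
typedef struct { uint32_t size; uint32_t offset; } fielddesc32_t;

/* Hypothetical type layout header: `wide` is the flag that says which
 * descriptor width the `descs` array uses. */
typedef struct {
    uint32_t    nfields;
    uint8_t     wide;   /* 0: fielddesc8_t[], 1: fielddesc32_t[] */
    const void *descs;
} layout_t;

/* Read the byte offset of field `i`, dispatching on the flag. */
static uint32_t field_offset(const layout_t *l, uint32_t i)
{
    assert(i < l->nfields);
    if (l->wide)
        return ((const fielddesc32_t *)l->descs)[i].offset;
    return ((const fielddesc8_t *)l->descs)[i].offset;
}

/* Memory used by the descriptor array under each representation. */
static size_t descs_bytes(const layout_t *l)
{
    size_t each = l->wide ? sizeof(fielddesc32_t) : sizeof(fielddesc8_t);
    return (size_t)l->nfields * each;
}
```

In this sketch a two-field type costs 4 bytes of descriptors in the narrow form and 16 bytes in the wide form. Every accessor pays for a branch on the flag.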

yuyichao · 2015-06-28T20:29:10Z

@carnaval I had a look yesterday but I haven't started to fully implement it yet.

The issue is that we don't necessarily know the size of a field when we instantiate the type, and I would have to give up and be conservative in such cases (unless the GC supports moving and resizing :P ). Even so, this can probably already save us a lot of memory; I just need to make sure the correct one is picked everywhere.

carnaval · 2015-06-28T20:32:23Z

Yep, I remember now. So the "only" fully general solution is two passes? I kinda like it but I'm not sure why. I'll probably reserve any opinion until tomorrow :)

yuyichao · 2015-06-28T20:36:55Z

Yes, two passes may do it, although you still need to be careful about what to do with self-referencing (direct or indirect) types and the type cache, as well as where to store the information from the first pass.

IMHO, using two passes to solve this issue is a little too complicated and requires more work than if we only used them to resolve the issue of having invalid types in the type cache (which should arguably be fixed...). After all, this is invisible to the user and is only for saving memory...

carnaval · 2015-06-28T20:45:47Z

Self-referencing is fine for sizing because it's always through a pointer. I agree that it's a bit overly complex, but a 4x (well, 2x for now) waste to cover edge cases bothers me.

yuyichao · 2015-06-28T20:53:47Z

Yes, direct self referencing is fine since it can be special cased. However, I don't think indirect self referencing can be identified without actually trying to instantiate the type.

E.g. (I guess this is your "corner case")

julia> immutable A{T}
       c::T
       end

julia> immutable C{T}
       a::A{C{T}}
       end

julia> xdump(A{C{Int}})
A{C{Int64}}::DataType  <: Any
  c::C{Int64}::DataType  <: Any
    a::A{C{Int64}}::DataType  <: Any
      c::C{Int64}::DataType  <: Any
        a::A{C{Int64}}::DataType  <: Any
          c::C{Int64}::DataType  <: Any

JeffBezanson · 2015-08-24T20:26:02Z

I rebased this. It would be really nice to get some version of this working before 0.4-final. Even a conservative approximation of when to use 16 or 32 bits would be better than the current situation. It's just too easy to write a big array literal and get an overflow.

yuyichao · 2015-08-24T20:43:01Z

OK, I'll try to get a conservative version soon.

JeffBezanson · 2015-08-24T21:07:16Z

Oops, I missed one merge conflict. Should be fixed now.

yuyichao · 2015-09-01T18:02:58Z

I added a conservative estimation of the type size. It was a little hard to find a balance between doing too much work and being too conservative. The current version should handle most cases except parametrized immutable types.

The current version compiles locally, although there might still be issues. The overflow check also needs to be updated.

Let's see what the CI thinks, and I'll also go through this again.
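
As a rough illustration of what a conservative estimate means here, the C sketch below picks a descriptor width from an upper bound on the type size. It is not the implementation in this PR: the names `field_kind_t`, `field_info_t` and `pick_desc_bits` are hypothetical, and `MAX_ALIGN` is an assumed worst-case alignment used to over-approximate padding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical classification of a field at type-instantiation time. */
typedef enum {
    FIELD_POINTER,        /* stored by reference: always pointer-sized */
    FIELD_INLINE_KNOWN,   /* stored inline, size already known */
    FIELD_INLINE_UNKNOWN  /* stored inline, size not yet known */
} field_kind_t;

typedef struct {
    field_kind_t kind;
    uint32_t     size;    /* only meaningful for FIELD_INLINE_KNOWN */
} field_info_t;

#define MAX_ALIGN 16u     /* assumed worst-case field alignment */

/* Returns the descriptor width in bits (16 or 32). The estimate only
 * ever errs toward 32: any unknown inline field, or an upper bound on
 * the total size that exceeds 16 bits, selects the wide form. */
static unsigned pick_desc_bits(const field_info_t *fields, size_t n)
{
    uint64_t bound = 0;
    assert(fields != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        switch (fields[i].kind) {
        case FIELD_POINTER:
            bound += sizeof(void *);
            break;
        case FIELD_INLINE_KNOWN:
            bound += fields[i].size;
            break;
        case FIELD_INLINE_UNKNOWN:
            return 32;              /* give up and be conservative */
        }
        bound += MAX_ALIGN - 1;     /* over-approximate padding */
        if (bound > UINT16_MAX)
            return 32;
    }
    return 16;
}
```

Because every field offset is at most the total size, a bound that fits in 16 bits guarantees that all sizes and offsets do too. The cost is that some types that would fit get the wide form anyway.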

yuyichao · 2015-09-01T19:58:48Z

Surprised that it actually didn't segfault during the test...

yuyichao · 2015-09-08T14:04:46Z

@JeffBezanson Could you please have a look at the implementation of the size estimation?

I think there might be some cases where I'm too conservative, but I believe the estimate is correct (although I might be wrong). It also does not look in the type cache/stack, so it might be doing more passes than necessary.

tkelman · 2015-09-09T17:58:05Z

@JeffBezanson did you switch the 0.4.1 and 0.4.x milestones? I'd really rather have 0.4.x be nonspecific things and leave 0.4.1 for actually tracking what makes it into the first backport.

JeffBezanson · 2015-09-09T19:05:38Z

Yes, let's switch them. This particular issue is a good candidate for 0.4.1 though.

yuyichao · 2015-09-09T19:18:07Z

This breaks the API/ABI for jl_datatype_t, so it would be nice to at least get the first half of this into 0.4.0 if we want this in any of 0.4.x.

tkelman · 2015-09-09T19:23:06Z

Breaking embedding clients between 0.4.0 and 0.4.1 wouldn't be very nice of us. Let's not do that.

yuyichao · 2015-09-09T19:36:53Z

I reordered the commits so that the relatively safe part (which is also the breaking part) is all on the yyc/type_size-0.4.0 branch. This PR only has the size estimation commit on top of that.

tkelman · 2016-05-12T16:39:45Z

@yuyichao is this still relevant? was tagged 0.4.x...

yuyichao · 2016-05-12T16:43:25Z

This is still relevant. It may not be particularly performance critical, and it needs a rewrite.

vtjnash · 2016-07-04T04:34:44Z

superseded by #17231

yuyichao force-pushed the yyc/type_size branch from 43c3f49 to c09508f Compare June 29, 2015 11:04

yuyichao mentioned this pull request Jul 16, 2015

Long array literal leads to OverflowError #12171

Closed

JeffBezanson force-pushed the yyc/type_size branch from c09508f to cda865b Compare August 24, 2015 20:23

JeffBezanson force-pushed the yyc/type_size branch from cda865b to a62a8e5 Compare August 24, 2015 21:06

yuyichao force-pushed the yyc/type_size branch 4 times, most recently from 0531afb to 0551f5e Compare August 31, 2015 20:51

yuyichao force-pushed the yyc/type_size branch from 0551f5e to 3665260 Compare September 1, 2015 17:30

yuyichao force-pushed the yyc/type_size branch 5 times, most recently from 536cd61 to 78d377b Compare September 8, 2015 13:57

JeffBezanson added this to the 0.4.1 milestone Sep 9, 2015

yuyichao force-pushed the yyc/type_size branch from 78d377b to f320223 Compare September 9, 2015 19:19

yuyichao force-pushed the yyc/type_size branch from f320223 to b05590a Compare September 9, 2015 19:32

yuyichao mentioned this pull request Sep 15, 2015

Regression for loading a datafile #13137

Closed

tomasaschan added a commit to JuliaGeometry/Contour.jl that referenced this pull request Sep 15, 2015

Split large literal dataset in smaller chunks

b0ab09a

This avoids the JuliaLang/julia#11884 issue while waiting for JuliaLang/julia#11888 to be merged.

tomasaschan mentioned this pull request Sep 15, 2015

Split large literal dataset in smaller chunks JuliaGeometry/Contour.jl#22

Merged

tomasaschan added a commit to JuliaGeometry/Contour.jl that referenced this pull request Sep 15, 2015

Split large literal dataset in smaller chunks

266ff5e

This avoids the JuliaLang/julia#11884 issue while waiting for JuliaLang/julia#11888 to be merged.

yuyichao force-pushed the yyc/type_size branch from b05590a to f482f16 Compare September 15, 2015 15:50

yuyichao mentioned this pull request Sep 15, 2015

Variable size field descriptors #13147

Merged

yuyichao force-pushed the yyc/type_size branch 3 times, most recently from 767390f to b5c9999 Compare September 16, 2015 11:27

Use a estimated type size to determine the size of field descriptors …

60fe7e1

…to use

yuyichao force-pushed the yyc/type_size branch from b5c9999 to 60fe7e1 Compare September 16, 2015 18:47

yuyichao changed the title ~~Make size and offset in jl_fielddesc_t 32bit length~~ Determine the size of field descriptors using a estimated type size Sep 17, 2015

tkelman removed this from the 0.4.x milestone May 12, 2016

vtjnash closed this Jul 4, 2016

yuyichao deleted the yyc/type_size branch July 4, 2016 04:36
