Implementation-defined limits #335
Comments
I don't think we should limit (4); it's more akin to a limit on the number of runtime memory pages, or an OOM situation. The other examples are limitations on the size of modules the engine will accept. For 1, I suggest we keep the limit of 1 million (total) types and additionally limit the number in a recursion group to 1 million.
FWIW this section of the spec also specifies that at runtime "the maximum number of pages of a memory is 65536" so there is precedent and opportunity to specify runtime limits as well. I don't feel strongly that it's necessary to do so, though.
The memory size limit is the maximum addressable size, so isn't really a policy limit. :) With arrays there are two problems: exhausting the heap with a single object that can't be allocated (especially when using 32-bit heaps such as when using compressed pointers), or allocating and using so much memory that it causes the client machine to swap to death (if the arrays themselves are stored outside the heap, like ArrayBuffers in JS, or with a 64-bit heap). In general browser and node environments handle these cases poorly, but everybody assumes it's fine to crash the tab or hang the computer.
Another limit we'll need to decide is the maximum subtype depth for type definitions. This is useful for preventing excessively deep subtyping hierarchies, which engines have to spend extra time or space to support when implementing casts. The limit currently implemented is 31.
31 seems low. In my brief searching I couldn't find any good measurements of maximum Java class depth seen in the wild, but this post from 2006 suggests that at that point the JVM supported a depth of 60: https://www.javaspecialists.eu/archive/Issue121-How-Deep-is-Your-Hierarchy.html. Would a limit of 64 (or 63) be acceptable?
That seems fine to us. I'm not sure if @jakobkummerow has a different opinion though, as 31 came from V8. Also regarding dynamic array size limits, we have one internally to simplify some code generation logic. It's a byte length limit of ~1.9 GiB (1987654321 bytes). We probably could round that up to 2 GiB, and that would align with our 32-bit platform ArrayBuffer limit.
Regarding recursion group and type definitions limits, I would suggest we limit the total recursion groups to 1 million (effectively re-interpreting the existing limit) and add a limit of 1 million total types (when recursion groups are flattened to form the type index space). This will ensure that either index space (recursion groups or type definitions) has a nice upper bound, which is where these limits help out the most IMO. A pre-declaration of the total type definitions would be nice for being able to pre-size some buffers, but I don't think that's easily addable to the binary format, and so I'm fine with the limit not being triggered eagerly.
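To make the two-limit proposal concrete, here is a minimal validator-style sketch. The names, types, and the check-while-flattening approach are illustrative assumptions rather than code from any engine; the 1,000,000 values are simply the numbers proposed above.

```rust
// Illustrative sketch of enforcing both proposed limits during validation.
const MAX_REC_GROUPS: usize = 1_000_000;
const MAX_TOTAL_TYPES: usize = 1_000_000;

struct TypeDef; // placeholder for a struct/array/func type definition

struct RecGroup {
    type_defs: Vec<TypeDef>,
}

fn validate_type_section(rec_groups: &[RecGroup]) -> Result<(), String> {
    if rec_groups.len() > MAX_REC_GROUPS {
        return Err(format!("too many recursion groups: {}", rec_groups.len()));
    }
    // The type index space is the concatenation of all rec groups' members, so
    // without a pre-declared total the second limit is checked while flattening,
    // not eagerly at the start of the section.
    let mut total_types = 0usize;
    for group in rec_groups {
        total_types += group.type_defs.len();
        if total_types > MAX_TOTAL_TYPES {
            return Err(format!("too many types: more than {}", MAX_TOTAL_TYPES));
        }
    }
    Ok(())
}
```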
We picked 31 as the subtyping depth limit back when we needed to store value kind, heap type index, and subtyping depth in a single 32-bit field (for explicit "rtt with depth" values, which have since been scrapped from the proposal). With the current design there is no shortage of bits, so I have no hard objections to raising the subtyping depth limit to 63 (or 60, if we think the ability to store internal sentinel values might come in handy). That said, I also haven't yet heard of a case where 31 would have been insufficient, so I'm also fine with keeping 31 until someone actually runs into that limit.
If we add explicit rtt's back, will they require the depth to be part of the value type as before? If so, then I'd prefer to start with limiting to 31 and expand if needed in the future. We ran into the bit packing problem as well, and it's nice to spare the bits.
I've scheduled this for discussion at Tuesday's subgroup meeting. It would be great if we could close this issue out at that meeting. Right now the leading proposed limits are:
We have not discussed limits on:
Probably won't be able to attend the next meeting, but I'm generally agnostic to the specifics of limits. Only the 31 subtyping depth limit seems very low -- if I understood @chambart and @zapashcanon correctly, they currently use much deeper hierarchies for their OCaml backend.
Array sizes: I'm fine with not specifying a particular limit. FWIW, V8 currently has a general per-object size limit of 1 GiB on its managed heap.

Struct fields: We initially tried to get by with a limit of 999 struct fields, but got feedback from partners that that wasn't enough. Our currently implemented limit is 2,000, and that appears to be sufficient. I wouldn't object to raising it a little further; that said, for implementing fast trap-handler-based null checks it's convenient if the maximum struct size isn't overly large, so I wouldn't raise the limit much further than necessary.

Subtyping depth: I have nothing to add to the discussion above (the handful of posts dated December 2022).
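For context on the trap-handler point above: such null checks generally work by just performing the load and letting an access at `null + field_offset` fault in the unmapped region near address zero, which only works while field offsets stay small. The sketch below is a hypothetical illustration of that constraint; the guard-region size, field size, and function name are assumptions, not V8's actual implementation.

```rust
// Why a modest struct-size limit helps trap-handler-based null checks:
// a load from (null + field_offset) only reliably faults if the offset
// stays inside the unmapped low region of the address space.
const GUARD_REGION_SIZE: usize = 64 * 1024; // assumed size of the unmapped low region

fn can_elide_explicit_null_check(field_offset: usize) -> bool {
    // If the offset could escape the guard region, the engine must emit an
    // explicit null comparison instead of relying on the hardware fault.
    field_offset < GUARD_REGION_SIZE
}

fn main() {
    // With ~2,000 fields of at most 16 bytes each, every offset fits easily.
    let worst_case_offset = 2_000 * 16;
    assert!(can_elide_explicit_null_check(worst_case_offset));
}
```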
For its closure representation, Dart currently uses a subtyping depth corresponding to the (maximum) number of positional parameters. This could be optimized to skip parameter counts that do not occur in the program. Still, 31 seems low. |
For OCaml we have different representation strategies for structs. One of them builds a subtyping chain in which each struct type extends its supertype with one more field:

```wat
(type $s1 (struct (field (ref eq))))
(type $s2 (sub $s1 (struct (field (ref eq)) (field (ref eq)))))
(type $s3 (sub $s2 (struct (field (ref eq)) (field (ref eq)) (field (ref eq)))))
...
```

And the largest 'structs' in OCaml are compilation units (which are also modules that the user can manipulate). Those tend to be quite large and can have hundreds or thousands of fields. For that compilation strategy we would need long subtyping chains. The alternative would be to compile every struct to an array, which would require bounds checks for every field access and limit the precision of the types we can represent, since every field would have to be of the same type.
Since we have some minor concerns about the struct field limit, we might be incentivised to keep it low. Implementations can always implement larger structs with an additional indirection.
In the subgroup meeting today we discussed these limits:
For subtype depth, there were concerns about allowing extremely deep subtyping hierarchies because the space or time costs of casts would become unreasonable. @chambart, how high would you need the subtype depth limit to be for your compilation scheme to work on real-world code?

I'm also pretty sure it would be possible to implement casting for deep hierarchies with a logarithmic time and space cost. In addition to the array of k immediate supertypes used for constant time casting, each RTT would store an array of exponentially more distant supertypes. For example, each RTT could store up to 32 immediate supertypes, then the 64th, 128th, 256th, etc. supertypes. To find the 100th supertype, it would suffice to find the 4th supertype of the 32nd supertype of the 64th supertype. Obviously this would be more complicated, but it would be a significant improvement over linear search for large hierarchies.
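A rough sketch of that lookup scheme, assuming each RTT keeps its 32 nearest ancestors plus exponentially spaced ones (64th, 128th, ...). All names and the data layout are hypothetical; the point is only the logarithmic number of hops.

```rust
use std::rc::Rc;

const DIRECT: usize = 32; // nearest ancestors stored per RTT

struct Rtt {
    depth: usize,          // distance from the root of the hierarchy
    direct: Vec<Rc<Rtt>>,  // direct[i] is the (i+1)-th supertype, up to DIRECT entries
    skips: Vec<Rc<Rtt>>,   // skips[i] is the supertype at distance DIRECT * 2^(i+1): 64, 128, ...
}

/// Walk `distance` steps up the supertype chain in O(log distance) hops.
fn supertype_at(rtt: &Rc<Rtt>, mut distance: usize) -> Option<Rc<Rtt>> {
    if distance > rtt.depth {
        return None; // no ancestor that far up
    }
    let mut current = Rc::clone(rtt);
    // Take the largest skip hops that still fit, farthest first.
    loop {
        let mut hopped = false;
        for i in (0..current.skips.len()).rev() {
            let step = DIRECT * (1usize << (i + 1));
            if step <= distance {
                let next = Rc::clone(&current.skips[i]);
                current = next;
                distance -= step;
                hopped = true;
                break;
            }
        }
        if !hopped {
            break;
        }
    }
    // Then walk the direct-ancestor table, DIRECT entries at a time.
    while distance > DIRECT {
        let next = Rc::clone(current.direct.last()?); // the DIRECT-th supertype
        current = next;
        distance -= DIRECT;
    }
    if distance == 0 {
        Some(current)
    } else {
        current.direct.get(distance - 1).cloned()
    }
}
// E.g. the 100th supertype resolves as 64 + 32 + 4: one skip hop, one full
// direct-table hop, then the 4th entry of the final direct table.
```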
I don't know which limit exactly would be the point where I consider switching, but if it is in the thousands I would be happy with it.
After talking with the V8 team today, it sounds like we would be ok with having a higher limit (e.g. 1000), but we probably wouldn't spend much time optimizing for deep hierarchies right away. We're most likely to keep constant time casting without any additional casting schemes to start, so deep hierarchies would consume a good deal of memory. Does that sound like an acceptable outcome? Edit: alternatively we might switch over to linear-time casting for large hierarchies (e.g. depth more than about 100), which would have a performance downside rather than the space downside. @chambart, unfortunately the spec doesn't have any precedent or mechanism to specify algorithmic complexity or performance. In principle we could add a non-normative note, but I think typically we try to stay out of the business of specifying performance.
@rossberg, since you have historically been a proponent of predictable performance, I'd be curious to hear your thoughts on how to balance the desire to support deep casting hierarchies with the time/space trade-off that would require.
@tlively I kind of feel that, on the contrary, the cost model has always been obvious for everything. The GC and exception proposals are the first ones where there are real choices in potential implementation strategies that would affect the algorithmic cost. For instance, it has always been quite clear that a cast until now was a cheap constant time operation. I would rather have a somewhat low limit where I know the cost of casting than a high limit where the best compilation strategy depends on the runtime. Of course for my purpose it is simpler to have a high limit and not care about the size of structs, but with short subtyping chains we could still emulate large structs with struct trees. This would have a logarithmic cost like what you are suggesting, with a cast per tree level. But given that the vast majority of structs allocated are small, very few trees would be created and the overall overhead would be quite small.
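To illustrate the struct-tree idea: if a wide logical struct is split into a tree of structs with a fixed fan-out, the access path for a logical field is just its digit expansion in that base, one cast and one `struct.get` per level. The fan-out of 8 and the helper below are hypothetical, purely to show the logarithmic shape of the cost.

```rust
const ARITY: usize = 8; // hypothetical fan-out per tree node

/// Per-level field indices for logical field `index` in a struct
/// compiled as a tree addressing `n_fields` leaves.
fn field_path(mut index: usize, n_fields: usize) -> Vec<usize> {
    // Number of levels needed so that ARITY^levels >= n_fields.
    let mut levels = 1;
    let mut capacity = ARITY;
    while capacity < n_fields {
        capacity *= ARITY;
        levels += 1;
    }
    // One struct.get (plus one cast) per level, outermost index first.
    let mut path = vec![0; levels];
    for slot in path.iter_mut().rev() {
        *slot = index % ARITY;
        index /= ARITY;
    }
    path
}

fn main() {
    // Logical field 1000 of a 5000-field struct: 5 levels, so 5 casts/loads.
    assert_eq!(field_path(1000, 5000), vec![0, 1, 7, 5, 0]);
}
```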
Thanks for the input, @chambart. In that case, it sounds like we'll get the strongest consensus for a depth limit of 63 for now. We can consider raising it in the future and discuss complexity/performance/memory trade-offs at that point.
Yes, I don't think engines should do a linear fallback. Better to have a hard limit. 63 still seems rather low, though. Out of curiosity: all other limits are powers of 10, I believe, so why not this one?
63 comes from a desire to make it possible for the depth to be encoded in some of the representation bits of a future explicit RTT type without an extra indirection. I'm not sure exactly how all the bits are allocated, so I don't know if 127 or 255 would also work, for example. @eqrion and @jakobkummerow, how many bits do we have available, assuming we want to maintain the ability to use bit packing for future RTT types? So far we have not seen any use cases that need a depth greater than 63 but also do not need depths that would require linear or logarithmic fallbacks, so it seems like a fine initial limit. We can always consider raising it later if a need arises.
The bit packing we had (back when we supported RTT-with-depth) squeezed the value kind, the heap type index, and the subtyping depth into a single 32-bit field.
We could probably come up with a different packing scheme if necessary, but why make it necessary without a driving use case? Most but not all other limits are powers of 10. Module size, function body size, number of locals, and number of memory pages are not.
We currently use 64 bits for our packed value type representation, which holds a type code, a nullable flag, and a pointer to the type definition.
That leaves 5 bits free currently, for a 0-31 range of rtt depths. There is some slop though; we probably could rework the type code and nullable to free a bit. We also could free one to three bits from the type definition pointer by taking advantage of alignment and pointer ranges better. So we're comfortable with a limit of 63, as 6 bits for depth seems reasonably future proof, with the ability to be raised later.
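As a purely illustrative sketch of the arithmetic (the field widths, shifts, and names below are assumptions for the example, not SpiderMonkey's or V8's actual layout): a 64-bit packed value type with an 8-bit type code, a nullable bit, and an 8-byte-aligned definition pointer still leaves comfortable room for 6 bits of depth.

```rust
// Hypothetical 64-bit packing: 6 bits of depth (0-63), 1 nullable bit,
// an 8-bit type code, and 49 bits for an 8-byte-aligned definition pointer.
const DEPTH_BITS: u32 = 6;
const NULLABLE_SHIFT: u32 = 6;
const TYPE_CODE_SHIFT: u32 = 7;
const DEF_PTR_SHIFT: u32 = 15;

fn pack(type_code: u8, nullable: bool, depth: u8, def_ptr: u64) -> u64 {
    debug_assert!((depth as u64) < (1u64 << DEPTH_BITS)); // the proposed limit of 63
    debug_assert!(def_ptr % 8 == 0); // alignment frees the 3 low pointer bits
    debug_assert!(def_ptr < (1u64 << 52)); // assumed usable address range
    ((def_ptr >> 3) << DEF_PTR_SHIFT)
        | ((type_code as u64) << TYPE_CODE_SHIFT)
        | ((nullable as u64) << NULLABLE_SHIFT)
        | (depth as u64)
}

fn depth_of(packed: u64) -> u8 {
    (packed & ((1u64 << DEPTH_BITS) - 1)) as u8
}

fn main() {
    let packed = pack(0x5A, true, 63, 0x7fff_ffb0_1000);
    assert_eq!(depth_of(packed), 63);
}
```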
Resolved by #360. If there are any other limits we should specify, please open a new issue.
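For readers skimming the thread, the values that reached consensus above can be summarized as constants; this is only a recap of the discussion, and #360 is the authoritative source for what was actually specified.

```rust
// Recap of the limits discussed in this thread; see #360 for the normative values.
const MAX_RECURSION_GROUPS: usize = 1_000_000; // reinterpretation of the old 1M type limit
const MAX_TOTAL_TYPES: usize = 1_000_000;      // rec groups flattened into the type index space
const MAX_SUBTYPING_DEPTH: usize = 63;         // fits in 6 bits for possible future RTT packing
// Struct field counts, array.new_fixed operand counts, and runtime array sizes
// were still open questions at this point in the thread.
```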
The JS spec has a section on implementation-defined limits. Several new limits will have to be introduced for the GC proposal.
1. The limit on the number of types is currently one million. We should define limits on the number of rec groups and the size of each rec group. Should we reinterpret the current limit as the limit on rec groups, or should we additionally enforce a limit on the total number of types across all rec groups? Should we put the total number of types into the binary format of the type section so engines can continue to check this limit eagerly?
2. We should define a limit on the number of fields allowed in a struct.
3. We should define a limit on the number of arguments `array.new_fixed` can take (keeping in mind that this will be used to initialize very large global arrays, assuming "Initializing array from data or element segment" #260 is not solved for the MVP).
4. We should define runtime limits on the size of arrays.

And I am probably missing additional limits as well.