-
Notifications
You must be signed in to change notification settings - Fork 161
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Restrict capacity of plists to 2^28 resp. 2^60 (revision of #2039) #2064
Restrict capacity of plists to 2^28 resp. 2^60 (revision of #2039) #2064
Conversation
src/plist.c
Outdated
if (!UINT_FITS_IN_INTOBJ(need)) { | ||
ErrorMayQuit("GrowPlist: List size too large", 0, 0); | ||
} | ||
|
||
/* find out how large the plain list should become */ | ||
good = 5 * (SIZE_OBJ(list)/sizeof(Obj)-1) / 4 + 4; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would check the overflow here:
if (! UINT_FITS_IN_INTOBJ(good))
good = (((UInt)1) << NR_SMALL_INT_BITS) - 1;
src/plist.c
Outdated
/* find out how large the plain list should become */ | ||
good = 5 * (SIZE_OBJ(list)/sizeof(Obj)-1) / 4 + 4; | ||
|
||
/* but maybe we need more */ | ||
if ( need < good ) { plen = good; } | ||
else { plen = need; } | ||
if (need < good && UINT_FITS_IN_INTOBJ(good)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then this can be simplified to
if (need < good)
In your version adding and appending to lists of length > 4/5 * maximal length becomes very inefficient.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Excellent point, thanks, I'll update this PR.
I always thought that the length of plists is restricted to 2^28-1 (resp. 2^60-1). I agree that for the reasons given above this is not a serious restriction. I guess a lot of code does actually assume that the length of a plist fits into an immediate integer. Therefore I consider this pull request as a sensible bug fix. Maybe this restriction should be mentioned somewhere in the manual? A possible test on 32 bit systems could be
(Or use Side remark:
On my 64 bit systems I can even call 32-bit gap with |
I'll look into addressing all of that. I'll also replace |
That is, use ObjInt_UInt instead of INTOBJ_INT to convert the object size into a GAP integer object, and handle T_INT and T_FFE inputs explicitly. Also replace the kernel function SHALLOW_SIZE with a synonym pointing to SIZE_OBJ. While the former name may be less misleading (a user might expect SIZE_OBJ to include the sizes of subobjects), the latter name matches the name of the kernel function, just like with TNUM_OBJ and TNAM_OBJ, which simplifies life for kernel developers who want to use these UNDOCUMENTED, INTERNAL and potentially DANGEROUS APIs. As such, clarity of names is not a primary concern.
Also convert as much code as possible to use the new constants, and tweak some additional places using NR_SMALL_INT_BITS: * FindCommonField used it in a bounds check, but not quite right; use INT_INTOBJ_MAX instead * FuncGASMAN_STATS was simplified by using ObjInt_Int * ProdIntObj and PowObjInt needlessly used NR_SMALL_INT_BITS+1, changed to avoid the +1 * ModInt and RemInt included a check 'i <= (1L<<NR_SMALL_INT_BITS)' for an immediate value i, which of course was always true, so removed it
By adding bounds checks to GrowPlist and NEW_PLIST, we restrict the capacity of plain lists to the maximum value of an immediate integer, i.e. 2^28 on a 32 bit system and 2^60 on a 64 bit system. The check in GrowPlist triggers an error, as it can be triggered by the user. The check in NEW_PLIST is a run-time assertion, and should never be triggered by user actions, only by buggy kernel code. This restriction fixes overflows and other problems, which can lead to crashes, corrupt data or nonsense computations. It poses no actual limitation in practice, for the following reasons: First off, many other places already effectively limited the length of a plist they can interact with to the maximum value of an immediate integer. E.g. such limitations exist for sublist access via l{poss}, EmptyPlist(), ASS_PLIST_DEFAULT, and more. Secondly, with this change, the effective size (in bytes) of a plist of this maximal length would be 2^30 resp. 2^63. The latter certainly poses no actual limitation. The former corresponds to 1 GB. While GAP in principle supports slightly larger bags on 32 bit systems (the GASMAN heap can grow up to almost 4GB if the host system supports it),, there is not much you can do with GAP if most of its heap is filled with a gigantic plist, so this restriction seems acceptable.
I have now addressed most of the points @frankluebeck brought up, I hope, plus some more:
I also looked into adding the length restriction to the manual, but couldn't find a good place. We already mention a similar restriction for ranges (were the two ends must be immediate integers). But I couldn't find any section dedicated to plain lists / "internally represented lists", although they are mentioned on and off. Perhaps such a section should be added? Or perhaps it's there and I just missed it? I also noticed that section 21.2 "Basic Operations for Lists" starts out saying "The basic operations for lists are ..." and then provides a set of points which, curiously enough, does not cover creation of lists. I found no place in the whole manual where this is described (I focused on checking chapter 21, as well as chapter 4, "The Programming Language") |
c577b2e
to
2e12aa4
Compare
UInt bound = 1UL << NR_SMALL_INT_BITS; | ||
|
||
if (i < bound) { | ||
if (i <= INT_INTOBJ_MAX) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is relying on how C handles signed / unsigned comparisons. I can't remember his they come out, and would have a small preference for assigning INT_INTOBJ_MAX to a UInt, just in case future people worry.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I briefly thought about adding an UINT_INTOBJ_MAX
, but the C standard is pretty clear on this, and always has been. In a (simplified) nutshell, a signed and unsigned value are compared by casting the signed value to unsigned, which is always what we want for these bounds checks.
In short, INT_INTOBJ_MAX
and UINT_INTOBJ_MAX
would always do exactly the same thing, except when accidentally using UINT_INTOBJ_MAX
to bound check a signed quantity... As such, UINT_INTOBJ_MAX
could be misused by accident. I don't see a way to misuse INT_INTOBJ_MAX
by accident.
@fingolfin I will omit this from release notes: as #2039 sats, it ensures that GrowPlist checks the size we are growing the list to is a valid list size, and stops us then overflowing in ResizeBag later. This is an internal functionality. |
This is a rebased version of PR #2039, with the fixup commit squashed, and a new size restriction assert added to
NEW_PLIST
. As such, it addresses parts of issue #1479 (but does not fully resolve it; at least one overflow bug in GASMAN needs to be fixed, as well a bug in one of the GNU libcmemmove
implementations).I also provided a new, extensive commit message for the second commit, and since that now makes the majority of the work in that commit, took the liberty of changing authorship of that commit to myself. Here is the new commit message:
By adding bounds checks to GrowPlist and NEW_PLIST, we restrict the capacity
of plain lists to the maximum value of an immediate integer, i.e. 2^28 on a
32 bit system and 2^60 on a 64 bit system. The check in GrowPlist triggers
an error, as it can be triggered by the user. The check in NEW_PLIST is a
run-time assertion, and should never be triggered by user actions, only by
buggy kernel code.
This restriction fixes overflows and other problems, which can lead to
crashes, corrupt data or nonsense computations.
It poses no actual limitation in practice, for the following reasons:
First off, many other places already effectively limited the length of a
plist they can interact with to the maximum value of an immediate integer.
E.g. such limitations exist for sublist access via l{poss}, EmptyPlist(),
ASS_PLIST_DEFAULT, and more.
Secondly, with this change, the effective size (in bytes) of a plist of
this maximal length would be 2^30 resp. 2^63. The latter certainly poses no
actual limitation. The former corresponds to 1 GB. Conceivably, GAP could
support slightly larger bags on a 32bit system (the GASMAN heap can grow up
to 3GB if the host system supports it). However, there is not much you can
do with GAP if most of its heap is filled with a single gigantic plist, so
this restriction seems acceptable.