-
Notifications
You must be signed in to change notification settings - Fork 7.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix GH-16878: gmp_fact overflow on memory allocation attempt. #16880
base: PHP-8.2
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The logic makes sense to me, but please address the comment.
ext/gmp/gmp.c
Outdated
RETURN_THROWS(); | ||
} | ||
} else { | ||
mpz_ptr gmpnum; | ||
gmp_temp_t temp_a; | ||
|
||
FETCH_GMP_ZVAL(gmpnum, a_arg, temp_a, 1); | ||
long r = mpz_get_si(gmpnum); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please replace the call to zval_get_long(a_arg)
below to use the new variable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't use mpz_get_si()
here in the first place, since it may overflow for large gmpnum
. Possibly use mpz_sizeinbase(gmpnum, 2)
instead; that might need some fine-tuning of maxbits
, though.
ext/gmp/gmp.c
Outdated
RETURN_THROWS(); | ||
} | ||
} else { | ||
mpz_ptr gmpnum; | ||
gmp_temp_t temp_a; | ||
|
||
FETCH_GMP_ZVAL(gmpnum, a_arg, temp_a, 1); | ||
long r = mpz_get_si(gmpnum); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't use mpz_get_si()
here in the first place, since it may overflow for large gmpnum
. Possibly use mpz_sizeinbase(gmpnum, 2)
instead; that might need some fine-tuning of maxbits
, though.
ext/gmp/gmp.c
Outdated
#if SIZEOF_SIZE_T == 4 | ||
const zend_long maxbits = ULONG_MAX / GMP_NUMB_BITS; | ||
#else | ||
const zend_long maxbits = INT_MAX; | ||
#endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be moved to the top-level, since we're already using it in gmp_random_bits()
, and may want to re-use it elsewhere. (Could also be a macro; doesn't really matter, I think.)
#if SIZEOF_SIZE_T == 4 | ||
#define GMP_ALLOC_MAXBITS (ULONG_MAX / GMP_NUMB_BITS) | ||
#else | ||
#define GMP_ALLOC_MAXBITS INT_MAX | ||
#endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The logic to avoid overflow in _mpz_realloc()
(libgmp 6.3.0) is:
if (sizeof (mp_size_t) == sizeof (int))
{
if (UNLIKELY (new_alloc > ULONG_MAX / GMP_NUMB_BITS))
MPZ_OVERFLOW;
}
else
{
if (UNLIKELY (new_alloc > INT_MAX))
MPZ_OVERFLOW;
}
Assuming that mp_size_t
is actually size_t
, the simplification seems to be correct (we're assuming sizeof(int) == 4
anyway). However, I do not understand how ULONG_MAX
fits into the mix. That only works under the assumption that ULONG_MAX == UINT_MAX
for 32bit platforms (which is what php-src assumes anyway), but why don't they use UINT_MAX
in the first place? I think we should.
if (mpz_sgn(gmpnum) < 0) { | ||
zend_argument_value_error(1, "must be greater than or equal to 0"); | ||
(void)gmpnum; | ||
val = zval_get_long(a_arg); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't quite understand how zval_get_long()
fits into the mix here. At least, its usage makes FETCH_GMP__ZVAL()
above superflous, and we wouldn't even need the special casing for Z_TYPE_P(a_arg) == IS_LONG
(since this is already catered to by zval_get_long()
.
What happens if one calls gmp_fact("18446744073709551617")
? Wouldn't that raise a deprecation notice ('Implicit conversion from float-string "18446744073709551617" to int loses precision') instead of throwing an ValueError
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens if one calls
gmp_fact("18446744073709551617")
? Wouldn't that raise a deprecation notice ('Implicit conversion from float-string "18446744073709551617" to int loses precision') instead of throwing anValueError
.
Nope I get the exception in this case too. I ll see the rest later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, indeed! That is because the conversion happens in coercive mode. And that also shows that we can't easily drop the seemingly superfluous FETCH_GMP_ZVAL()
since only that will report issues regarding strict typing mode (e.g. passing a float). So zval_get_long()
is basically correct here; the saturating behavior is fine since we cannot compute the factorial of PHP_INT_MAX
anyway.
Still, there is an issue unrelated to this PR
E.g. for gmp_fact(new GMP("18446744073709551617"))
zval_get_long()
returns 1 which passes the overflow check, and evaluates to GMP(1)
. That is because the mpz_get_si()
in gmp_cast_object()
overflows, while I would expect saturating behavior. This is not related to this PR, though, but rather a general issue with ext/gmp. From the mpz_get_si()
documentation:
If op fits into a signed long int return the value of op. Otherwise return the least significant part of op, with the same sign as op.
If op is too big to fit in a signed long int, the returned result is probably not very useful. To find out if the value will fit, use the function mpz_fits_slong_p.
So apparently, there is an easy fix for gmp_cast_object()
. However, on LLP64 mpir 3.0.0 mpz_fits_slong_p()
returns false for 2147483648. ;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't quite understand how
zval_get_long()
fits into the mix here. At least, its usage makesFETCH_GMP__ZVAL()
above superflous, and we wouldn't even need the special casing forZ_TYPE_P(a_arg) == IS_LONG
(since this is already catered to byzval_get_long()
.What happens if one calls
gmp_fact("18446744073709551617")
? Wouldn't that raise a deprecation notice ('Implicit conversion from float-string "18446744073709551617" to int loses precision') instead of throwing anValueError
.
The zval_get_long()
is to convert the GMP object back to an int...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The
zval_get_long()
is to convert the GMP object back to an int...
Yeah, I figured that much, but I wondered about values outside of [ZEND_LONG_MIN, ZEND_LONG_MAX]
. However, I found out. :)
On my machine (64bit) the largest factorial I can attempt to compute is |
Well that still OOMs on 48GB of RAM so, that's not the upper boundary then. |
Roughly 4294967296! might fit into 16GB. 16GB == 2^37 bits is still quite excessive. Especially if we revert/do not introduce these bounds check for stable versions, in my opinion, we could go with far less, maybe even only 2^24 bits (in which case |
Yeah, I was just trying to see what was theoretically possible. :) I am happy to restrict it to |
No description provided.