Fix GH-16878: gmp_fact overflow on memory allocation attempt. #16880

devnexen · 2024-11-21T06:36:59Z

No description provided.

Girgias

The logic makes sense to me, but please address the comment.

Girgias · 2024-11-21T11:01:02Z

ext/gmp/gmp.c

 		gmp_temp_t temp_a;

 		FETCH_GMP_ZVAL(gmpnum, a_arg, temp_a, 1);
+		long r = mpz_get_si(gmpnum);


Please replace the call to zval_get_long(a_arg) below with the new variable.


cmb69 · 2024-11-21T12:33:34Z

ext/gmp/gmp.c

 		gmp_temp_t temp_a;

 		FETCH_GMP_ZVAL(gmpnum, a_arg, temp_a, 1);
+		long r = mpz_get_si(gmpnum);


Don't use mpz_get_si() here in the first place, since it may overflow for large gmpnum. Possibly use mpz_sizeinbase(gmpnum, 2) instead; that might need some fine-tuning of maxbits, though.
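To illustrate the suggestion of checking the operand's size instead of converting it, here is a minimal sketch that does not use libgmp at all. It models a non-negative big integer as little-endian 64-bit limbs; `model_bit_length()` and `model_fits_int64()` are hypothetical helpers that only mirror what mpz_sizeinbase(gmpnum, 2) would report, and the 63-bit threshold is an assumption for a 64-bit signed integer, not the maxbits value the PR would need.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model, NOT the GMP API: a non-negative big integer stored as
 * little-endian 64-bit limbs. Returns the number of significant bits, in the
 * same spirit as mpz_sizeinbase(x, 2). */
static size_t model_bit_length(const uint64_t *limbs, size_t n)
{
    assert(limbs != NULL || n == 0);

    /* Ignore leading zero limbs. */
    while (n > 0 && limbs[n - 1] == 0) {
        n--;
    }
    if (n == 0) {
        return 1; /* mpz_sizeinbase() also reports 1 for zero */
    }

    /* Count the bits of the most significant non-zero limb. */
    uint64_t top = limbs[n - 1];
    size_t bits = 0;
    while (top != 0) {
        bits++;
        top >>= 1;
    }
    return (n - 1) * 64 + bits;
}

/* A non-negative value with at most 63 bits always fits a signed 64-bit int,
 * so the size check can reject large operands before any conversion. */
static int model_fits_int64(const uint64_t *limbs, size_t n)
{
    return model_bit_length(limbs, n) <= 63;
}
```

With limbs {1, 1} (that is, 2^64 + 1) the size check reports 65 bits and rejects the value, whereas a plain conversion would have to truncate it.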

cmb69 · 2024-11-21T12:35:56Z

ext/gmp/gmp.c

+#if SIZEOF_SIZE_T == 4
+	const zend_long maxbits = ULONG_MAX / GMP_NUMB_BITS;
+#else
+	const zend_long maxbits = INT_MAX;
+#endif


This should be moved to the top-level, since we're already using it in gmp_random_bits(), and may want to re-use it elsewhere. (Could also be a macro; doesn't really matter, I think.)

cmb69 · 2024-11-21T13:15:46Z

ext/gmp/gmp.c

+#if SIZEOF_SIZE_T == 4
+#define GMP_ALLOC_MAXBITS (ULONG_MAX / GMP_NUMB_BITS)
+#else
+#define GMP_ALLOC_MAXBITS INT_MAX
+#endif


The logic to avoid overflow in _mpz_realloc() (libgmp 6.3.0) is:

if (sizeof (mp_size_t) == sizeof (int)) {
    if (UNLIKELY (new_alloc > ULONG_MAX / GMP_NUMB_BITS))
        MPZ_OVERFLOW;
} else {
    if (UNLIKELY (new_alloc > INT_MAX))
        MPZ_OVERFLOW;
}

Assuming that mp_size_t is actually size_t, the simplification seems to be correct (we're assuming sizeof(int) == 4 anyway). However, I do not understand how ULONG_MAX fits into the mix. That only works under the assumption that ULONG_MAX == UINT_MAX on 32-bit platforms (which is what php-src assumes anyway), but why don't they use UINT_MAX in the first place? I think we should.
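A rough sketch of what the UINT_MAX variant would evaluate to, written as a plain function so both branches can be checked on one machine. `model_alloc_maxbits()` and its parameters are hypothetical stand-ins for SIZEOF_SIZE_T and GMP_NUMB_BITS, and the sketch assumes a 32-bit int, as the comment above does.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper mirroring the proposed macro, with UINT_MAX in place of
 * ULONG_MAX. size_t_bytes stands in for SIZEOF_SIZE_T and numb_bits for
 * GMP_NUMB_BITS; neither is read from libgmp here. */
static unsigned long long model_alloc_maxbits(size_t size_t_bytes, unsigned numb_bits)
{
    assert(numb_bits > 0);

    if (size_t_bytes == 4) {
        /* 32-bit branch: UINT_MAX / GMP_NUMB_BITS */
        return (unsigned long long)UINT_MAX / numb_bits;
    }
    /* 64-bit branch: INT_MAX */
    return (unsigned long long)INT_MAX;
}
```

Under these assumptions, a 32-bit build with 32-bit limbs gets 4294967295 / 32 = 134217727, and a 64-bit build gets INT_MAX, independent of how wide unsigned long happens to be.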

cmb69 · 2024-11-21T13:30:04Z

ext/gmp/gmp.c

-		if (mpz_sgn(gmpnum) < 0) {
-			zend_argument_value_error(1, "must be greater than or equal to 0");
+		(void)gmpnum;
+		val = zval_get_long(a_arg);


I don't quite understand how zval_get_long() fits into the mix here. At least, its usage makes FETCH_GMP_ZVAL() above superfluous, and we wouldn't even need the special casing for Z_TYPE_P(a_arg) == IS_LONG (since this is already catered to by zval_get_long()).

What happens if one calls gmp_fact("18446744073709551617")? Wouldn't that raise a deprecation notice ('Implicit conversion from float-string "18446744073709551617" to int loses precision') instead of throwing a ValueError?


Nope, I get the exception in this case too. I'll look at the rest later.

Ah, indeed! That is because the conversion happens in coercive mode. And that also shows that we can't easily drop the seemingly superfluous FETCH_GMP_ZVAL() since only that will report issues regarding strict typing mode (e.g. passing a float). So zval_get_long() is basically correct here; the saturating behavior is fine since we cannot compute the factorial of PHP_INT_MAX anyway.

Still, there is an issue unrelated to this PR:

E.g. for gmp_fact(new GMP("18446744073709551617")) zval_get_long() returns 1 which passes the overflow check, and evaluates to GMP(1). That is because the mpz_get_si() in gmp_cast_object() overflows, while I would expect saturating behavior. This is not related to this PR, though, but rather a general issue with ext/gmp. From the mpz_get_si() documentation:

If op fits into a signed long int return the value of op. Otherwise return the least significant part of op, with the same sign as op.

If op is too big to fit in a signed long int, the returned result is probably not very useful. To find out if the value will fit, use the function mpz_fits_slong_p.

So apparently, there is an easy fix for gmp_cast_object(). However, on LLP64 mpir 3.0.0 mpz_fits_slong_p() returns false for 2147483648. ;)
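The difference between the truncating result described in the mpz_get_si() documentation and the saturating behavior expected here can be sketched without libgmp. The limb array and both `model_get_si_*()` helpers below are hypothetical; they model a positive value as little-endian 64-bit limbs and are not how gmp_cast_object() is implemented.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the documented mpz_get_si() result for a positive
 * value: only the least significant part survives, higher limbs are dropped. */
static int64_t model_get_si_truncating(const uint64_t *limbs, size_t n)
{
    assert(limbs != NULL && n > 0);
    return (int64_t)(limbs[0] & (uint64_t)INT64_MAX);
}

/* Hypothetical saturating alternative: clamp to INT64_MAX whenever the value
 * does not fit, which is what a "fits" check followed by a clamp would give. */
static int64_t model_get_si_saturating(const uint64_t *limbs, size_t n)
{
    assert(limbs != NULL && n > 0);

    /* Any non-zero higher limb means the value exceeds 64 bits. */
    for (size_t i = 1; i < n; i++) {
        if (limbs[i] != 0) {
            return INT64_MAX;
        }
    }
    if (limbs[0] > (uint64_t)INT64_MAX) {
        return INT64_MAX;
    }
    return (int64_t)limbs[0];
}
```

For 18446744073709551617 = 2^64 + 1, stored as limbs {1, 1}, the truncating model yields 1 (matching the GMP(1) result described above), while the saturating model yields INT64_MAX, which an overflow check would reject.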


The zval_get_long() is to convert the GMP object back to an int...


Yeah, I figured that much, but I wondered about values outside of [ZEND_LONG_MIN, ZEND_LONG_MAX]. However, I found out. :)

Girgias · 2024-11-22T22:54:22Z

On my machine (64-bit), the largest factorial I can attempt to compute is 471778098879; the boundary where it fails is 471778098880 = 64 * 7371532795. So we could compare against this.

Girgias · 2024-11-22T23:00:11Z

Well, that still OOMs on 48GB of RAM, so that's not the upper boundary then.

cmb69 · 2024-11-24T13:33:48Z

Roughly 4294967296! might fit into 16GB. 16GB == 2^37 bits is still quite excessive. Especially if we revert/do not introduce these bounds checks for stable versions, in my opinion, we could go with far less, maybe even only 2^24 bits (in which case 2790877! would be the maximum).
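The 16GB figure can be sanity-checked with integer arithmetic only. The sketch below computes an upper bound (not the exact size) on the bit length of n!, using the fact that a product cannot have more bits than the sum of its factors' bit lengths. `factorial_bits_upper_bound()` is a hypothetical helper, and the cap on n is an arbitrary assumption to keep the arithmetic within 64 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Upper bound on the bit length of n!: the sum of the bit lengths of 1..n.
 * Uses the closed form
 *   sum_{k=1..n} floor(log2 k) = (n + 1) * b - 2^(b+1) + 2,  b = floor(log2 n)
 * and adds n, because the bit length of k is floor(log2 k) + 1. */
static uint64_t factorial_bits_upper_bound(uint64_t n)
{
    /* Arbitrary cap so the intermediate products stay far below 2^64. */
    assert(n >= 1 && n <= ((uint64_t)1 << 40));

    /* b = floor(log2 n) */
    unsigned b = 0;
    while (b < 63 && (((uint64_t)1 << (b + 1)) <= n)) {
        b++;
    }

    /* Unsigned wrap-around in the intermediate steps is harmless because the
     * final value is non-negative and small. */
    uint64_t sum_floor_log2 = (n + 1) * b - ((uint64_t)1 << (b + 1)) + 2;
    return sum_floor_log2 + n;
}
```

For n = 4294967296 the bound is 2^37 - 2^32 + 34 bits, which stays just under 2^37 bits and is consistent with the estimate that this factorial might fit into 16GB.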

Girgias · 2024-11-25T13:40:57Z

Yeah, I was just trying to see what was theoretically possible. :)

I am happy to restrict it to 2790877! :)

Fix phpGH-16878: gmp_fact overflow on memory allocation attempt.

10660d8

github-actions bot added the Extension: gmp label Nov 21, 2024

devnexen marked this pull request as ready for review November 21, 2024 07:09

devnexen requested a review from Girgias as a code owner November 21, 2024 07:09

Girgias reviewed Nov 21, 2024


Girgias mentioned this pull request Nov 21, 2024

gmp_pow(64, 11) throws overflow exception #16870

Closed

cmb69 reviewed Nov 21, 2024


cmb69 mentioned this pull request Nov 21, 2024

Fix GH-16870: gmp_pow(64, 11) throws overflow exception #16884

Closed

changes from feedback

ac20c82

cmb69 reviewed Nov 21, 2024



3 participants