Skip to content

Use low-level mpn_* routines to implement integer arithmetics? #529

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
skirpichev opened this issue Nov 18, 2024 · 3 comments
Open

Use low-level mpn_* routines to implement integer arithmetics? #529

skirpichev opened this issue Nov 18, 2024 · 3 comments

Comments

@skirpichev
Copy link
Contributor

POC pr for CPython: python/cpython#66121

I think we can adopt similar strategy to implement arithmetic operations for mpz/xmpz's. No need to change MPZ_Object structure. All we need: appropriately allocate/reallocate/free memory by hand and use low-level mpn_* functions to operate on arrays of limbs. I don't expect any speed loss.

On pros: arithmetic on mpz's will never crash the interpreter. (That might be the case for some special function, but not for the basic stuff, working on CPython int's.) New mpz's could be used as a drop-in replacement for int's.

@casevh
Copy link
Collaborator

casevh commented Nov 20, 2024

I like this idea. I've thought about it but don't have the time to implement it. I have a system with lots of memory that I can use for stress testing.

@skirpichev
Copy link
Contributor Author

I'm thinking about a separate package (something like gmp-python, as flint-python; better name?), which will provide such mpz type and few integer-related functions (like isqrt). This type will have int-compatible interface. (Revival of python/cpython#66121 patch is also in my short-term plans.)

Later we can include this to gmpy2's mpz. I don't think that testing memory handling will require a machine with a lots of memory (e.g. in Linux you can emulate more tight memory bounds with ulimit). By I would appreciate more testing anyway!

A separate package does make sense for me also because other projects (e.g. mpmath, sympy or diofant) use only integer-related stuff from the gmpy2 (well, maybe mpq's too). Now python ecosystem has much better handling of dependencies. So, eventually it might be useful to split the gmpy2 package to two or three. This is a different issue, however. Let me know if it does make sense for you.

@skirpichev
Copy link
Contributor Author

skirpichev commented Dec 2, 2024

FYI, @casevh, early draft published on https://github.com/diofant/python-gmp

It has a large TODO, but basically this approach does work:

$ ulimit -v $((1024*256))
$ python
Python 3.13.0rc1+ (heads/3.13:bd29ce8509, Aug 29 2024, 15:30:15) [GCC 12.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from gmpy2 import mpz
>>> mpz(-1) ^ mpz(1)
KeyboardInterrupt
>>> mpz(2222222222222211111111122222222222222)**mpz(33322222)
GNU MP: Cannot allocate memory (size=503998640)
Aborted
$ python
Python 3.13.0rc1+ (heads/3.13:bd29ce8509, Aug 29 2024, 15:30:15) [GCC 12.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from gmp import mpz
>>> mpz(2222222222222211111111122222222222222)**mpz(33322222)
Traceback (most recent call last):
  File "<python-input-1>", line 1, in <module>
    mpz(2222222222222211111111122222222222222)**mpz(33322222)
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^~~~~~~~~~~~~~
MemoryError

I run primitive benchmarks. It seems this module is slightly faster than the gmpy2 without cache (e.g. I got ~20% boost for multiplication of two 2-digit integers). Probably, it's a temporary speedup and performance will match gmpy2's when project matures (it lacks support for floats, for example).

Edit:
It was less easy than I expected. Just using mpn_*() functions isn't enough, some functions still do memory allocation for temporary buffers (e.g. inside of mpn_mul()). (See TMP_ALLOC/TMP_BALLOC macros in the gmp-impl.h.) But I think that using setjmp/longjmp should be safe to free such memory: warning from the chapter 13 of the GMP docs seems to be unapplicable in given circumstances.

So, my toy python module now uses custom allocation functions (realloc isn't required) for the GMP. I can't crash it on the CPython so far, but can on PyPy (maybe it's a PyPy issue).

The package is not feature-complete yet, but I still would appreciate testing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants