More GMP work #1075

fingolfin · 2017-01-13T15:34:29Z

More work on the GMP work. Not yet complete

codecov-io · 2017-01-13T20:05:52Z

Current coverage is 54.49% (diff: 96.10%)

Merging #1075 into master will decrease coverage by 0.03%

@@             master      #1075   diff @@
==========================================
  Files           432        432          
  Lines        224765     224546   -219   
  Methods        3431       3442    +11   
  Messages          0          0          
  Branches          0          0          
==========================================
- Hits         122572     122368   -204   
+ Misses       102193     102178    -15   
  Partials          0          0

Diff Coverage	File Path
••••• 50%	lib/integer.gi
••••••••• 96%	src/gmpints.c
•••••••••• 100%	src/vecgf2.c
•••••••••• 100%	lib/numtheor.gi
•••••••••• 100%	src/code.c

Powered by Codecov. Last update 75245e7...cad52ab

fingolfin · 2017-01-16T18:21:25Z

I think this is almost good for merging. What is left is to add (more) test cases for the new functions (in commits marked "HACK"), adding some thoughts into failure modes for them (i.e. make sure there are no crashes) and finally using them in the library methods for Jacobi(), PValuation(), PowerModInt(). Might also change ModRat in the kernel to use InverseModInt.

fingolfin · 2017-01-20T22:46:06Z

src/gmpints.c

+  CHECK_INT(mod);
+
+  if ( mod == INTOBJ_INT(0) ) {
+    ErrorMayQuit( "PowerModInt: <mod> must be nonzero", 0L, 0L  );


Oops, wrong func name

ChrisJefferson · 2017-01-20T22:39:05Z

lib/integer.gi

@@ -1195,7 +1195,11 @@ InstallGlobalFunction(PowerModInt,function ( r, e, m )
    # return the power
    return pow;
 end);
-
+if IsBound(POWERMODINT) then


Is there a good reason to keep POWERMODINT_GAP around? Is POWERMODINT ever not going to be defined?

Similar for other functions that follow a similar pattern. If they are only used for testing (and I do like testing by comparing two implementations), perhaps they could be moved to a file in tst?

The original reasons I kept them are (a) testing, and (b) making it possible to quickly switch between branches without always having to recompile (as warnings would be shown about undeclared functions). Both were mostly useful for development of this PR. I left them in because that way it was clearly visible to everybody that I did indeed verify the new against the old implementation.

I have no problem either removing those, or else moving them to the .tst files, as you suggest.

ChrisJefferson · 2017-01-20T22:41:18Z

src/gmpints.c

+**  The calling code is responsible for any verification.
+**
+**  WARNING: Since this may trigger a garbage collection, make sure to
+**  call this *before* any calls to FAKEMPZ_GMPorINTOBJ.


what is 'this'? or should the function at the end be different?

Good catch, that is an obsolete comment I forget to remove (and somehow it indeed migrated to the "wrong" function -- I think used to be a comment for NEW_FAKEMPZ, or rather its predecessor). With this final implementation, the GC problems are mostly avoided by using the MPZ_FAKEMPZ macro.

ChrisJefferson · 2017-01-20T22:44:46Z

src/gmpints.h

+/****************************************************************************
+**
+** Compute log2 of the absolute value of an Int, i.e. the index of the highest
+** set bit. For 0 it returns -1, for negative values it 


fingolfin · 2017-01-21T10:08:20Z

lib/numtheor.gi

-local a,v;
-  if not IsPrimeInt(p) or not IsRat(n) then
+local v;
+  if not IsPosInt(p) or not IsRat(n) then


Hmm... I change IsPrimeInt to IsPosInt as that avoided a pointless and potentially costly primality test. However, this meant it used to accept negative values for p, too. So perhaps I should change IsPosInt(p) to IsInt(p) and p <> 0 ?
Might just as well insert a better error message too... And finally, is there any reason this very useful function is not documented???

Also rewrite GcdInt, ProdInt, SumOrDiffInt to use fake_mpz_t. The rewritten functions are much simpler than before. This technique also opens up the way to wrapping more mpz_ functions. Finally, this commit speeds up GcdInt if one or both arguments are multiples of a large power of 2. For example: Old code: gap> GcdInt(2^100001*13, 2^100002*17);; time; 135 gap> GcdInt(2^1000001*13, 2^1000002*17);; time; 14122 gap> GcdInt(2^10000001*13, 2^10000002*17);; time; ... did not finish in reasonable time ... New code: gap> GcdInt(2^100001*13, 2^100002*17);; time; 1 gap> GcdInt(2^1000001*13, 2^1000002*17);; time; 13 gap> GcdInt(2^10000001*13, 2^10000002*17);; time; 164

We previously used gmp_snprintf, which needs to parse a format string. It is simpler and more efficient to call mpz_get_str directly.

Both call the same new function StringIntBase(), which we could also make available to the library.

There was only once place using it left, and that was actually better off directly using the INTOBJ (instead of first converting it into a GMP integer).

Added test to verify it does not destroy the input string anymore.

This was used for the old integers code, were the two values differed.

At this point in the computation, it should be impossible for 'mod' to be negative, so remove the code which adjusts the sign. And just in case I am missing something, insert an assert(!IS_NEGATIVE(mod)) (only active when DEBUG_GMP is on)

After all my refactoring, there was only one caller left.

Also convert rest of gmpints.c to use them, and also IS_INTOBJ, IS_INTPOS, IS_INTNEG. There are now far fewer places using TNUM_OBJ directly.

Surely the user does not want to retry as QUO / QuoInt... Note: Just changing QUO to RemInt and the `if` to `while` would not be enough; we'd also have to verify that opR stays an integer. And then adjust the error message. And so on... Since I am doubtful regarding the usefulness of ErrorReturnObj to start with, I chose to keep my life simple and use ErrorMayQuit instead.

Use GMP's mpz_kronecker to compute the Kronecker-Jacobi symbol J(n,m). We retain the old GAP implementation for reference, and also to facilitate testing (we want to make sure that old and new code return identical results). Note that the new implementation accepts some more inputs, namely it allows negative values for m. This is because GMP allows this, and it seems pointless to restrict this again. The new implementation is considerably faster. Old code: gap> for a in [1..1000] do for b in [1..1000] do Jacob(a,b);od;od;time; 1979 New code: gap> for a in [1..1000] do for b in [1..1000] do Jacob(a,b);od;od;time; 85

fingolfin · 2017-01-21T15:27:53Z

@ChrisJefferson I updated the PR to hopefully address all your remarks. Thanks for the review!

This adds a GMP based implementation of PowerModInt. The new kernel function INVMODINT computes modular inverse, and is not currently exposed to the library in a documented function. As usual, we get a speedup. Old code: gap> for b in [-100..100] do for e in [1..100] do for m in [1..100] do > PowerModInt(b,e,m);od;od;od; time; 2270 gap> for b in [-100..100] do for e in [1..100] do for m in [1..100] do > PowerModInt(b,2^e,m);od;od;od; time; 11354 New code: gap> for b in [-100..100] do for e in [1..100] do for m in [1..100] do > PowerModInt(b,e,m);od;od;od; time; 625 gap> for b in [-100..100] do for e in [1..100] do for m in [1..100] do > PowerModInt(b,2^e,m);od;od;od; time; 1920

It uses mpz_remove to compute the p-valuation. In fact, it computes arbitrary g-valuations, i.e. the second argument can be an arbitrary nonzero integer. We also adapt PValuation to use PVALUATION_INT, and the Valuation method for integers. For small inputs, there is little difference, but for larger ones we get a noticeable speedup. To illustrate this, here are some examples, where ps:=[2,3,5,7,251,65537]; lwi:=ListWithIdenticalEntries;; Old code: gap> x:=2^20;; ListX(lwi(10000,x),ps,PValuation);; time; 90 gap> x:=2^1000-1;; ListX(lwi(10000,x),ps,PValuation);; time; 79 gap> x:=2^1000;; ListX(lwi(10000,x),ps,PValuation);; time; 2608 gap> x:=2^1000+1;; ListX(lwi(10000,x),ps,PValuation);; time; 56 gap> x:=2^1000+2^100;; ListX(lwi(10000,x),ps,PValuation);; time; 379 New code: gap> x:=2^20;; ListX(lwi(10000,x),ps,PValuation);; time; 40 gap> x:=2^1000-1;; ListX(lwi(10000,x),ps,PValuation);; time; 86 gap> x:=2^1000;; ListX(lwi(10000,x),ps,PValuation);; time; 64 gap> x:=2^1000+1;; ListX(lwi(10000,x),ps,PValuation);; time; 71 gap> x:=2^1000+2^100;; ListX(lwi(10000,x),ps,PValuation);; time; 67 Using PVALUATION_INT directly (returns 0 for n=0, and only supports integer inputs): gap> x:=2^20;; ListX(lwi(10000,x),ps,PVALUATION_INT);; time; 26 gap> x:=2^1000-1;; ListX(lwi(10000,x),ps,PVALUATION_INT);; time; 67 gap> x:=2^1000;; ListX(lwi(10000,x),ps,PVALUATION_INT);; time; 47 gap> x:=2^1000+1;; ListX(lwi(10000,x),ps,PVALUATION_INT);; time; 53 gap> x:=2^1000+2^100;; ListX(lwi(10000,x),ps,PVALUATION_INT);; time; 49

markuspf

There are a few small comments/questions.

This PR makes the GMP code less fragile and more readable, which is great.

A general question is whether these changes have any (notable) performance implications?

markuspf · 2017-01-23T13:56:48Z

lib/integer.gd

@@ -799,7 +799,7 @@ DeclareGlobalFunction( "NextPrimeInt" );
 ##
 ##  <Description>
 ##  returns <M><A>r</A>^{<A>e</A>} \pmod{<A>m</A>}</M> for integers <A>r</A>,
-##  <A>e</A> and <A>m</A> (<M><A>e</A> \geq 0</M>).
+##  <A>e</A> and <A>m</A>.
 ##  <P/>


Is it obvious enough that $e \neq 0$?

Negative values seem to work (now) if $r$ is invertible, so that should maybe be documented explicitly?

But e can (still) be 0. And negative values were supported before, just not documented.
Documenting this more explicitly is fine by me, but I'd prefer to not do this as part of this PR (which has been in the works for very long by now)...

markuspf · 2017-01-23T13:58:33Z

lib/numtheor.gi

-local a,v;
-  if not IsPrimeInt(p) or not IsRat(n) then
+local v;
+  if not IsInt(p) or not IsRat(n) or p = 0 then
    Error("wrong parameters");
  fi;


This error message could be more helpful.

Yes. I hope somebody will make a PR improving it after this PR has been merged :-)

markuspf · 2017-01-23T14:04:27Z

src/gmpints.c

-                 "%Nd", ADDR_INT(op),
-                 (TypGMPSize)SIZE_INT(op));
+    Char buf[20000];
+    mpz_t v;


I don't see that we're checking for overflows here anymore. Maybe I am missing something?

Not sure what you mean with "checking for overflows". You mean: check whether the buffer was exceeded (and we thus will crash momentarily, for smashing the stack) ? The old code had an assert for that, sure.

But given that LogInt( (2^64)^1000, 10) = 19265 < 20000, that simply cannot happen. Hmm, unless we run this on a 128 bit processor, I guess.

Note that the assert was not even doing anything by default.

Anyway, I'll add a comment and think about whether I can make this more clear / pseudo-robust.

markuspf · 2017-01-23T14:14:47Z

lib/numtheor.gi

-
-    return jac;
-end );
+InstallGlobalFunction( Jacobi, JACOBI_INT );



While its great to be able to do all these in GMP, I would quite like to keep GAP functions around as an alternative. The reason is that while implementing GAP in alternative runtimes, the more code one does not have to write in the runtime the better.

I don't know how to do this cleanly right now though.

That is exactly how I did it before @ChrisJefferson asked me to remove the GAP implementation sigh. It is still around in the .tst files, by the way.

I can add it back here again, but I don't want to rewrite the code a third time, so I'll wait until at least @ChrisJefferson signs off on it, too.

Caveat: the new Jacobi accepts more inputs than the old (n negative).

As discussed in the chat, I am happy with moving the function to tests, and I will attempt to make a structured attempt at detecting which functions are delegated to C code (and collect/provide GAP implementations where appropriate).

markuspf · 2017-01-23T14:17:38Z

src/gmpints.c

+#else
+#define CHECK_FAKEMPZ(fake)  do { } while(0);
+#endif
+


Also, ensure it works correctly on 128bit systems ;-)

This is largely based on the old src/integer.c file, with some edits to make it valid for non-32 bit systems.

fingolfin added the do not merge PRs which are not yet ready to be merged (e.g. submitted for discussion, or test results) label Jan 13, 2017

fingolfin force-pushed the mh/gmp2 branch from b4c4428 to b45f1b8 Compare January 13, 2017 18:20

fingolfin force-pushed the mh/gmp2 branch 3 times, most recently from 3549c0b to d59fc61 Compare January 16, 2017 18:17

fingolfin force-pushed the mh/gmp2 branch 2 times, most recently from ace78ca to a629a1f Compare January 20, 2017 19:36

fingolfin removed the do not merge PRs which are not yet ready to be merged (e.g. submitted for discussion, or test results) label Jan 20, 2017

fingolfin requested review from ChrisJefferson, markuspf and frankluebeck January 20, 2017 22:21

fingolfin commented Jan 20, 2017

View reviewed changes

ChrisJefferson reviewed Jan 20, 2017

View reviewed changes

fingolfin commented Jan 21, 2017

View reviewed changes

fingolfin force-pushed the mh/gmp2 branch from a629a1f to 85d6304 Compare January 21, 2017 14:47

fingolfin added 13 commits January 21, 2017 15:47

gmpints: merge GMPorINTOBJ_INT and ObjInt_Int

8c018e0

gmpints: rewrite PrintInt using mpz_get_str

e918054

We previously used gmp_snprintf, which needs to parse a format string. It is simpler and more efficient to call mpz_get_str directly.

gmpints: rewrite HexStringInt and STRING_INT using fake_mpz_t

1c6d8ee

Both call the same new function StringIntBase(), which we could also make available to the library.

gmpints: remove GMP_INTOBJ

b591686

There was only once place using it left, and that was actually better off directly using the INTOBJ (instead of first converting it into a GMP integer).

gmpints: rewrite FuncIntHexString

9a3689f

Added test to verify it does not destroy the input string anymore.

gmpints: replace TypLimb -> mp_limb_t, TypGMPSize -> mp_size_t

99418d3

gmpints: replace INTEGER_ALLOCATION_SIZE by INTEGER_UNIT_SIZE

b700ea4

This was used for the old integers code, were the two values differed.

gmpints: cleanup gmpints.h

af567b1

gmpints: simplify ModInt

6a5815f

At this point in the computation, it should be impossible for 'mod' to be negative, so remove the code which adjusts the sign. And just in case I am missing something, insert an assert(!IS_NEGATIVE(mod)) (only active when DEBUG_GMP is on)

gmpints: remove NEW_INTPOS

b8a188c

After all my refactoring, there was only one caller left.

gmpints: add IS_INT, IS_POSITIVE, REQUIRE_INT_ARG

8454a6f

Also convert rest of gmpints.c to use them, and also IS_INTOBJ, IS_INTPOS, IS_INTNEG. There are now far fewer places using TNUM_OBJ directly.

gmpints: change REQUIRE_INT_ARG to use ErrorMayQuit

1c8ae24

fingolfin added 3 commits January 21, 2017 15:47

gmpints: use REQUIRE_INT_ARG in two more places

cdba5d4

fingolfin force-pushed the mh/gmp2 branch from 85d6304 to 0c864a6 Compare January 21, 2017 15:24

fingolfin force-pushed the mh/gmp2 branch 2 times, most recently from 35be8e2 to fbf656a Compare January 22, 2017 00:09

fingolfin added 3 commits January 22, 2017 01:09

tests: replace legal header in intarith.tst

fbf656a

markuspf reviewed Jan 23, 2017

View reviewed changes

fingolfin added 2 commits January 23, 2017 15:50

gmpints: explain the buffer in PrintInt

9bc7f34

Also, ensure it works correctly on 128bit systems ;-)

gmpints: improve initial documentation comment

cad52ab

This is largely based on the old src/integer.c file, with some edits to make it valid for non-32 bit systems.

markuspf approved these changes Jan 23, 2017

View reviewed changes

markuspf merged commit be7677f into gap-system:master Jan 23, 2017

fingolfin deleted the mh/gmp2 branch January 23, 2017 16:58

fingolfin mentioned this pull request Sep 7, 2017

Release notes for GAP 4.9 #1699

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

More GMP work #1075

More GMP work #1075

fingolfin commented Jan 13, 2017

codecov-io commented Jan 13, 2017 •

edited

Loading

fingolfin commented Jan 16, 2017

fingolfin Jan 20, 2017

ChrisJefferson Jan 20, 2017

ChrisJefferson Jan 20, 2017

fingolfin Jan 21, 2017

ChrisJefferson Jan 20, 2017

fingolfin Jan 21, 2017

ChrisJefferson Jan 20, 2017

fingolfin Jan 21, 2017

fingolfin commented Jan 21, 2017

markuspf left a comment

markuspf Jan 23, 2017

fingolfin Jan 23, 2017

markuspf Jan 23, 2017

fingolfin Jan 23, 2017

markuspf Jan 23, 2017

fingolfin Jan 23, 2017

fingolfin Jan 23, 2017

markuspf Jan 23, 2017

fingolfin Jan 23, 2017

fingolfin Jan 23, 2017

markuspf Jan 23, 2017

markuspf Jan 23, 2017

More GMP work #1075

More GMP work #1075

Conversation

fingolfin commented Jan 13, 2017

codecov-io commented Jan 13, 2017 • edited Loading

Current coverage is 54.49% (diff: 96.10%)

fingolfin commented Jan 16, 2017

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

fingolfin commented Jan 21, 2017

markuspf left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

codecov-io commented Jan 13, 2017 •

edited

Loading