Add nmod_vec_invert: invert an array of nmod coefficients by vneiger · Pull Request #2432 · flintlib/flint

vneiger · 2025-10-27T16:11:20Z

This inverts each entry in an nmod_vec, doing mostly multiplications instead of inversions.

See the table below for efficiency comparisons.

In short, this is in most cases significantly faster than the naive approach which uses nmod_inv repeatedly. When the vector length grows, the speed-up factor becomes close to the ratio between the time spent in a modular inversion vs. the time spent in a modular multiplication. The table below includes cases with factors beyond 20.
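The description above does not spell out the algorithm. One classical way to invert many entries with mostly multiplications is the batch-inversion trick (often attributed to Montgomery): multiply everything together, invert once, then peel off the individual inverses. The sketch below assumes that approach and may differ from what this PR implements. It uses plain `uint64_t` arithmetic with a modulus below 2^32 rather than FLINT's `nmod_t`, and the names `mulmod`, `invmod` and `batch_invert` are hypothetical helpers, not FLINT functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* a*b mod n; assumes a, b < n < 2^32 so the product fits in 64 bits */
static uint64_t mulmod(uint64_t a, uint64_t b, uint64_t n)
{
    return (a * b) % n;
}

/* inverse of a mod n by the extended Euclidean algorithm;
   assumes 0 < a < n < 2^32 and gcd(a, n) == 1 */
static uint64_t invmod(uint64_t a, uint64_t n)
{
    int64_t t = 0, new_t = 1;
    int64_t r = (int64_t) n, new_r = (int64_t) a;
    while (new_r != 0)
    {
        int64_t q = r / new_r;
        int64_t tmp = t - q * new_t; t = new_t; new_t = tmp;
        tmp = r - q * new_r; r = new_r; new_r = tmp;
    }
    assert(r == 1); /* a must be invertible modulo n */
    return (uint64_t) (t < 0 ? t + (int64_t) n : t);
}

/* res[i] = 1/vec[i] mod n for 0 <= i < len, using a single inversion.
   res and vec must not alias; every vec[i] must be invertible. */
static void batch_invert(uint64_t *res, const uint64_t *vec, size_t len, uint64_t n)
{
    if (len == 0)
        return;

    /* forward pass: res[i] = vec[0] * ... * vec[i] */
    res[0] = vec[0];
    for (size_t i = 1; i < len; i++)
        res[i] = mulmod(res[i - 1], vec[i], n);

    /* the only inversion: inv = 1 / (vec[0] * ... * vec[len-1]) */
    uint64_t inv = invmod(res[len - 1], n);

    /* backward pass: at the start of step i, inv = 1 / (vec[0] * ... * vec[i]) */
    for (size_t i = len - 1; i > 0; i--)
    {
        res[i] = mulmod(inv, res[i - 1], n); /* 1 / vec[i] */
        inv = mulmod(inv, vec[i], n);        /* 1 / (vec[0] * ... * vec[i-1]) */
    }
    res[0] = inv;
}
```

In this sketch, a vector of length `len` costs one inversion and `3 * (len - 1)` multiplications, so the per-entry cost for long vectors is about three multiplications.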

The only cases where this is not faster are very small moduli (bitsize 2 or 3), since inversion is then quite fast, and very small vector lengths. But already for bitsize 3 and length 5, this starts to be beneficial.

In fact, for moduli that are not too small (say bitsize 16 or more), there is already a notable speed-up at very small lengths: more than 1.5x for length 2 and more than 2x for length 3. If inverting 2 or 3 (or some other small number of) elements at once is not a rare operation (I'm not sure about this), it could make sense to add dedicated functions for it in nmod. Something like:

void nmod_inv2(ulong * inv_a1, ulong * inv_a2, ulong a1, ulong a2, nmod_t mod)
void nmod_inv3(ulong * inv_a1, ulong * inv_a2, ulong * inv_a3, ulong a1, ulong a2, ulong a3, nmod_t mod)

Any insight about this is welcome before this PR gets finalized.
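For illustration, the two-element case could look like the sketch below: one inversion of the product `a1 * a2`, then two multiplications to recover the individual inverses. This is only a guess at what such a function might do, not code from the PR. It uses plain `uint64_t` arithmetic with a modulus below 2^32 instead of `nmod_t`, and `mulmod`, `invmod` and `inv2_sketch` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* a*b mod n; assumes a, b < n < 2^32 so the product fits in 64 bits */
static uint64_t mulmod(uint64_t a, uint64_t b, uint64_t n)
{
    return (a * b) % n;
}

/* inverse of a mod n by the extended Euclidean algorithm;
   assumes 0 < a < n < 2^32 and gcd(a, n) == 1 */
static uint64_t invmod(uint64_t a, uint64_t n)
{
    int64_t t = 0, new_t = 1;
    int64_t r = (int64_t) n, new_r = (int64_t) a;
    while (new_r != 0)
    {
        int64_t q = r / new_r;
        int64_t tmp = t - q * new_t; t = new_t; new_t = tmp;
        tmp = r - q * new_r; r = new_r; new_r = tmp;
    }
    assert(r == 1); /* a must be invertible modulo n */
    return (uint64_t) (t < 0 ? t + (int64_t) n : t);
}

/* hypothetical analogue of the proposed nmod_inv2:
   one inversion and three multiplications instead of two inversions */
static void inv2_sketch(uint64_t *inv_a1, uint64_t *inv_a2,
                        uint64_t a1, uint64_t a2, uint64_t n)
{
    uint64_t inv = invmod(mulmod(a1, a2, n), n); /* 1 / (a1 * a2) */
    *inv_a1 = mulmod(inv, a2, n);                /* 1 / a1 */
    *inv_a2 = mulmod(inv, a1, n);                /* 1 / a2 */
}
```

Whether this beats two separate inversions depends on the relative cost of inversion and multiplication for the modulus at hand, which is what the timings in this thread measure.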

╰─ ./p-invert
unit: all measurements in c/l (up to constant multiplicative factor)
profiled: naive | precomp shoup | generic
bit/len 1               2               3               4               5               6               7               8               9               10              11              12              13              14              15              16              1024            65536
2       0.26|0.56|0.45  0.22|0.36|0.33  0.22|0.30|0.30  0.20|0.28|0.29  0.20|0.25|0.30  0.19|0.24|0.30  0.19|0.23|0.29  0.18|0.23|0.29  0.18|0.21|0.28  0.18|0.21|0.29  0.18|0.22|0.30  0.19|0.21|0.28  0.18|0.21|0.29  0.18|0.20|0.28  0.17|0.20|0.28  0.19|0.21|0.29  0.17|0.18|0.28  0.17|0.18|0.28
3       0.33|0.72|0.53  0.32|0.45|0.40  0.31|0.36|0.34  0.29|0.32|0.34  0.28|0.28|0.33  0.27|0.27|0.34  0.27|0.25|0.31  0.27|0.25|0.31  0.28|0.22|0.30  0.27|0.22|0.31  0.27|0.23|0.31  0.27|0.22|0.31  0.27|0.22|0.30  0.27|0.21|0.29  0.27|0.21|0.31  0.27|0.21|0.30  0.25|0.18|0.28  0.24|0.18|0.28
4       0.38|0.80|0.60  0.37|0.49|0.43  0.36|0.37|0.37  0.36|0.33|0.35  0.34|0.29|0.34  0.33|0.27|0.35  0.33|0.26|0.32  0.34|0.25|0.33  0.33|0.23|0.33  0.33|0.23|0.31  0.34|0.23|0.31  0.33|0.24|0.31  0.32|0.23|0.30  0.32|0.22|0.30  0.32|0.22|0.30  0.32|0.22|0.30  0.30|0.18|0.28  0.30|0.18|0.28
5       0.48|0.89|0.68  0.48|0.55|0.48  0.45|0.40|0.41  0.44|0.35|0.37  0.43|0.31|0.36  0.42|0.29|0.38  0.42|0.27|0.34  0.43|0.26|0.34  0.43|0.25|0.33  0.42|0.24|0.33  0.42|0.25|0.32  0.42|0.24|0.32  0.41|0.23|0.31  0.41|0.23|0.31  0.42|0.22|0.31  0.43|0.22|0.31  0.39|0.18|0.29  0.39|0.18|0.28
6       0.50|0.91|0.70  0.48|0.56|0.49  0.47|0.41|0.40  0.46|0.36|0.38  0.47|0.31|0.37  0.44|0.29|0.37  0.44|0.29|0.35  0.46|0.26|0.34  0.44|0.25|0.33  0.44|0.24|0.34  0.44|0.24|0.32  0.45|0.24|0.32  0.44|0.24|0.32  0.43|0.23|0.31  0.43|0.23|0.31  0.43|0.23|0.32  0.41|0.18|0.28  0.41|0.18|0.28
7       0.61|1.02|0.79  0.59|0.59|0.53  0.57|0.45|0.44  0.57|0.38|0.41  0.55|0.33|0.39  0.54|0.30|0.38  0.54|0.29|0.35  0.56|0.28|0.35  0.55|0.26|0.35  0.54|0.25|0.33  0.54|0.25|0.33  0.54|0.25|0.34  0.56|0.24|0.32  0.53|0.23|0.32  0.53|0.24|0.32  0.53|0.24|0.32  0.51|0.19|0.28  0.52|0.18|0.28
8       0.67|1.06|0.85  0.65|0.61|0.56  0.63|0.46|0.45  0.64|0.40|0.42  0.63|0.34|0.40  0.61|0.32|0.40  0.61|0.29|0.36  0.61|0.29|0.36  0.64|0.27|0.36  0.61|0.26|0.34  0.60|0.26|0.34  0.60|0.25|0.34  0.60|0.25|0.32  0.61|0.24|0.32  0.60|0.24|0.33  0.59|0.24|0.32  0.57|0.18|0.28  0.57|0.19|0.29
9       0.79|1.19|0.97  0.76|0.67|0.61  0.76|0.50|0.49  0.73|0.42|0.45  0.73|0.37|0.42  0.73|0.33|0.41  0.71|0.31|0.37  0.72|0.30|0.37  0.73|0.28|0.35  0.71|0.28|0.36  0.72|0.27|0.36  0.72|0.26|0.34  0.71|0.25|0.34  0.71|0.24|0.34  0.72|0.24|0.33  0.71|0.24|0.33  0.69|0.18|0.29  0.68|0.18|0.28
10      0.85|1.25|1.09  0.83|0.73|0.65  0.83|0.52|0.51  0.82|0.44|0.46  0.81|0.39|0.43  0.80|0.33|0.44  0.81|0.32|0.39  0.79|0.31|0.38  0.78|0.29|0.36  0.78|0.28|0.36  0.79|0.28|0.35  0.79|0.27|0.35  0.78|0.26|0.34  0.78|0.26|0.34  0.78|0.25|0.33  0.77|0.25|0.34  0.78|0.18|0.28  0.75|0.18|0.28
11      0.93|1.32|1.11  0.92|0.73|0.69  0.92|0.55|0.54  0.96|0.47|0.48  0.90|0.39|0.45  0.89|0.35|0.44  0.94|0.33|0.40  0.91|0.32|0.39  0.91|0.31|0.37  0.94|0.30|0.37  0.90|0.28|0.36  0.90|0.28|0.36  0.93|0.28|0.36  0.89|0.26|0.34  0.89|0.25|0.34  0.91|0.25|0.34  0.91|0.19|0.29  0.88|0.18|0.28
12      0.92|1.35|1.18  0.99|0.77|0.72  0.99|0.57|0.57  0.99|0.48|0.51  0.94|0.40|0.45  0.96|0.36|0.44  0.92|0.33|0.42  0.96|0.32|0.39  0.92|0.31|0.38  0.93|0.29|0.38  0.92|0.28|0.36  0.92|0.27|0.36  0.92|0.27|0.36  0.93|0.26|0.36  0.94|0.25|0.34  0.92|0.25|0.35  0.89|0.18|0.28  0.89|0.18|0.28
13      1.09|1.59|1.35  1.14|0.83|0.78  1.13|0.61|0.60  1.12|0.52|0.53  1.10|0.43|0.48  1.09|0.38|0.46  1.10|0.37|0.42  1.10|0.34|0.42  1.09|0.32|0.40  1.09|0.30|0.38  1.13|0.32|0.39  1.10|0.29|0.38  1.09|0.28|0.36  1.10|0.27|0.35  1.08|0.26|0.35  1.07|0.26|0.37  1.09|0.18|0.28  1.06|0.18|0.28
14      1.30|1.73|1.58  1.32|0.94|0.89  1.29|0.70|0.68  1.29|0.55|0.59  1.29|0.48|0.53  1.25|0.42|0.50  1.25|0.39|0.46  1.30|0.38|0.46  1.32|0.35|0.42  1.25|0.33|0.41  1.29|0.32|0.40  1.25|0.31|0.39  1.24|0.31|0.38  1.30|0.28|0.37  1.26|0.28|0.37  1.26|0.28|0.36  1.27|0.18|0.28  1.23|0.19|0.29
15      1.36|1.79|1.58  1.37|0.97|0.92  1.34|0.69|0.69  1.33|0.58|0.60  1.35|0.52|0.54  1.32|0.42|0.51  1.30|0.40|0.46  1.36|0.38|0.45  1.30|0.36|0.43  1.30|0.35|0.42  1.36|0.32|0.40  1.30|0.32|0.39  1.29|0.30|0.38  1.33|0.29|0.38  1.29|0.28|0.39  1.32|0.28|0.37  1.31|0.18|0.28  1.27|0.18|0.28
16      1.49|1.93|1.72  1.50|1.09|1.00  1.47|0.74|0.74  1.45|0.60|0.64  1.47|0.52|0.57  1.43|0.45|0.54  1.42|0.42|0.49  1.51|0.39|0.47  1.43|0.37|0.45  1.42|0.35|0.42  1.42|0.34|0.41  1.42|0.32|0.42  1.45|0.31|0.40  1.42|0.31|0.39  1.41|0.29|0.38  1.41|0.29|0.38  1.39|0.18|0.29  1.41|0.18|0.28
17      1.53|1.97|1.76  1.51|1.05|1.01  1.49|0.76|0.75  1.48|0.64|0.67  1.53|0.53|0.58  1.47|0.46|0.54  1.46|0.42|0.50  1.47|0.39|0.47  1.47|0.38|0.47  1.53|0.36|0.43  1.46|0.34|0.42  1.46|0.33|0.41  1.53|0.32|0.40  1.46|0.30|0.39  1.45|0.30|0.38  1.51|0.29|0.38  1.43|0.18|0.28  1.42|0.18|0.28
18      1.64|2.07|1.81  1.57|1.08|1.04  1.53|0.77|0.77  1.57|0.63|0.65  1.50|0.54|0.57  1.48|0.46|0.57  1.56|0.43|0.48  1.51|0.40|0.47  1.49|0.38|0.45  1.49|0.36|0.43  1.48|0.34|0.42  1.57|0.34|0.41  1.49|0.32|0.40  1.48|0.30|0.39  1.48|0.30|0.38  1.47|0.30|0.39  1.48|0.18|0.28  1.46|0.18|0.28
19      1.70|2.10|1.87  1.62|1.13|1.08  1.62|0.80|0.86  1.69|0.67|0.68  1.60|0.55|0.60  1.59|0.48|0.57  1.58|0.45|0.51  1.59|0.42|0.49  1.63|0.39|0.47  1.59|0.38|0.45  1.58|0.35|0.43  1.58|0.34|0.42  1.58|0.33|0.41  1.59|0.31|0.40  1.56|0.30|0.39  1.56|0.30|0.39  1.54|0.18|0.28  1.54|0.18|0.28
20      1.76|2.21|1.94  1.68|1.15|1.12  1.69|0.83|0.83  1.67|0.67|0.69  1.66|0.56|0.61  1.68|0.50|0.56  1.65|0.46|0.52  1.66|0.42|0.49  1.65|0.39|0.47  1.64|0.38|0.46  1.70|0.36|0.43  1.66|0.35|0.43  1.65|0.34|0.41  1.66|0.32|0.40  1.64|0.31|0.39  1.68|0.31|0.38  1.62|0.18|0.28  1.62|0.18|0.28
21      1.85|2.25|2.03  1.77|1.19|1.16  1.86|0.86|0.86  1.75|0.68|0.71  1.73|0.58|0.62  1.72|0.50|0.58  1.72|0.47|0.53  1.73|0.45|0.52  1.75|0.39|0.47  1.72|0.39|0.47  1.72|0.36|0.44  1.71|0.35|0.43  1.71|0.34|0.42  1.72|0.33|0.42  1.72|0.32|0.40  1.71|0.32|0.39  1.70|0.18|0.28  1.68|0.20|0.28
22      1.98|2.56|2.16  1.91|1.27|1.22  1.88|0.90|0.89  1.87|0.75|0.75  1.86|0.63|0.66  1.86|0.52|0.61  1.85|0.48|0.57  1.90|0.45|0.52  1.85|0.41|0.49  1.85|0.39|0.49  1.86|0.37|0.45  1.83|0.36|0.44  1.82|0.34|0.44  1.85|0.33|0.42  1.81|0.32|0.42  1.92|0.31|0.39  1.79|0.18|0.28  1.80|0.18|0.28
23      2.03|2.48|2.27  1.96|1.27|1.27  1.95|0.91|0.92  1.96|0.73|0.76  1.90|0.61|0.66  1.90|0.54|0.61  1.89|0.49|0.57  1.96|0.46|0.52  1.91|0.41|0.49  1.89|0.39|0.48  1.95|0.37|0.45  1.88|0.37|0.45  1.93|0.34|0.43  1.95|0.34|0.42  1.88|0.33|0.40  1.87|0.32|0.40  1.92|0.18|0.30  1.90|0.18|0.28
24      2.12|2.53|2.29  2.09|1.31|1.28  2.00|0.93|0.93  1.98|0.74|0.78  2.11|0.65|0.68  1.97|0.54|0.62  1.97|0.50|0.57  1.98|0.46|0.53  1.97|0.42|0.50  2.10|0.40|0.49  1.97|0.38|0.46  1.97|0.37|0.46  1.96|0.35|0.44  1.98|0.34|0.43  1.98|0.33|0.41  1.96|0.33|0.41  1.94|0.18|0.28  1.93|0.18|0.28
25      2.15|2.55|2.31  2.05|1.36|1.31  2.05|0.95|0.95  2.02|0.77|0.78  2.01|0.63|0.68  2.00|0.56|0.63  2.00|0.52|0.58  2.05|0.47|0.54  2.01|0.43|0.51  2.01|0.41|0.49  2.01|0.38|0.46  1.99|0.38|0.48  2.11|0.36|0.45  1.99|0.35|0.43  1.99|0.33|0.41  1.99|0.33|0.43  2.02|0.18|0.30  2.01|0.18|0.28
26      2.27|2.66|2.49  2.18|1.38|1.35  2.13|0.98|0.97  2.12|0.78|0.81  2.18|0.66|0.71  2.11|0.58|0.65  2.11|0.52|0.59  2.18|0.49|0.54  2.11|0.44|0.52  2.24|0.42|0.50  2.21|0.39|0.47  2.11|0.39|0.46  2.11|0.36|0.44  2.21|0.35|0.43  2.09|0.34|0.43  2.11|0.33|0.41  2.07|0.18|0.28  2.07|0.18|0.28
27      2.34|2.73|2.48  2.34|1.48|1.42  2.24|1.00|1.00  2.20|0.79|0.82  2.19|0.68|0.72  2.16|0.58|0.66  2.17|0.52|0.60  2.20|0.51|0.54  2.18|0.45|0.53  2.18|0.43|0.50  2.17|0.40|0.49  2.18|0.39|0.48  2.20|0.37|0.45  2.17|0.36|0.45  2.18|0.34|0.43  2.16|0.33|0.42  2.14|0.18|0.29  2.17|0.18|0.28
28      2.41|2.79|2.55  2.28|1.50|1.43  2.27|1.01|1.01  2.25|0.82|0.89  2.29|0.68|0.73  2.24|0.60|0.67  2.23|0.54|0.60  2.24|0.50|0.56  2.24|0.46|0.53  2.38|0.45|0.52  2.26|0.41|0.49  2.26|0.40|0.47  2.26|0.38|0.46  2.25|0.37|0.45  2.30|0.35|0.44  2.26|0.34|0.43  2.23|0.18|0.28  2.22|0.18|0.28
29      2.49|2.85|2.64  2.41|1.50|1.46  2.36|1.07|1.05  2.45|0.85|0.86  2.33|0.69|0.75  2.32|0.61|0.69  2.46|0.57|0.62  2.33|0.52|0.58  2.33|0.48|0.54  2.33|0.46|0.53  2.32|0.41|0.49  2.32|0.41|0.48  2.33|0.38|0.48  2.32|0.37|0.45  2.30|0.35|0.44  2.31|0.35|0.44  2.30|0.19|0.30  2.32|0.18|0.28
30      2.59|2.99|2.77  2.47|1.54|1.50  2.44|1.10|1.08  2.42|0.86|0.93  2.48|0.71|0.77  2.42|0.62|0.71  2.42|0.57|0.63  2.48|0.51|0.59  2.41|0.48|0.58  2.45|0.44|0.53  2.49|0.41|0.50  2.49|0.41|0.49  2.46|0.40|0.48  2.51|0.38|0.46  2.45|0.36|0.45  2.41|0.36|0.44  2.49|0.19|0.28  2.36|0.18|0.28
31      2.62|2.99|2.87  2.60|1.59|1.53  2.48|1.08|1.09  2.45|0.87|0.90  2.45|0.73|0.77  2.43|0.63|0.69  2.55|0.59|0.64  2.46|0.54|0.60  2.44|0.49|0.56  2.44|0.45|0.53  2.44|0.43|0.51  2.55|0.41|0.50  2.44|0.39|0.47  2.43|0.39|0.47  2.44|0.36|0.45  2.42|0.36|0.44  2.40|0.19|0.29  2.38|0.18|0.28
32      2.72|3.10|2.86  2.60|1.68|1.60  2.58|1.15|1.13  2.57|0.93|0.93  2.59|0.74|0.80  2.56|0.65|0.73  2.55|0.59|0.65  2.56|0.55|0.61  2.56|0.51|0.58  2.60|0.47|0.55  2.55|0.43|0.52  2.55|0.42|0.49  2.55|0.40|0.48  2.54|0.39|0.49  2.67|0.38|0.46  2.54|0.36|0.45  2.52|0.18|0.29  2.58|0.18|0.29
33      2.79|3.17|2.94  2.68|1.64|1.59  2.63|1.17|1.14  2.61|0.93|0.94  2.59|0.75|0.81  2.58|0.65|0.74  2.68|0.60|0.66  2.59|0.54|0.61  2.59|0.51|0.57  2.59|0.46|0.56  2.59|0.43|0.52  2.74|0.43|0.51  2.60|0.40|0.50  2.60|0.39|0.47  2.57|0.37|0.46  2.57|0.36|0.46  2.60|0.18|0.28  2.55|0.18|0.28
34      2.87|3.29|3.03  2.73|1.68|1.64  2.71|1.20|1.17  2.70|0.96|1.00  2.74|0.77|0.82  2.67|0.67|0.75  2.67|0.61|0.67  2.75|0.54|0.63  2.69|0.54|0.61  2.82|0.48|0.56  2.79|0.44|0.53  2.70|0.43|0.51  2.70|0.41|0.49  2.79|0.40|0.49  2.72|0.38|0.47  2.67|0.37|0.46  2.75|0.18|0.28  2.63|0.18|0.28
35      2.91|3.41|3.09  2.85|1.68|1.66  2.74|1.18|1.18  2.73|0.93|0.96  2.83|0.77|0.82  2.71|0.68|0.76  2.76|0.61|0.67  2.86|0.57|0.63  2.72|0.51|0.59  2.71|0.48|0.56  2.71|0.46|0.56  2.87|0.44|0.51  2.72|0.41|0.49  2.73|0.41|0.49  2.74|0.38|0.46  2.73|0.37|0.46  2.76|0.19|0.28  2.71|0.18|0.28
36      2.98|3.33|3.08  2.82|1.74|1.68  2.79|1.21|1.26  2.83|0.96|0.99  2.82|0.79|0.85  2.80|0.70|0.77  2.79|0.63|0.68  2.80|0.57|0.64  2.95|0.53|0.59  2.78|0.48|0.56  2.76|0.45|0.53  2.77|0.43|0.51  2.77|0.41|0.49  2.76|0.41|0.49  2.77|0.38|0.47  2.76|0.37|0.46  2.73|0.18|0.28  2.73|0.18|0.28
37      3.10|3.51|3.25  2.92|1.76|1.72  2.90|1.25|1.22  2.87|0.98|1.00  2.86|0.80|0.85  2.85|0.72|0.79  2.89|0.64|0.69  2.86|0.57|0.63  2.86|0.53|0.60  2.85|0.48|0.59  2.88|0.46|0.55  2.90|0.44|0.53  2.87|0.42|0.53  2.91|0.41|0.49  2.84|0.39|0.47  2.84|0.38|0.49  2.90|0.18|0.28  2.82|0.18|0.28
38      3.14|3.57|3.34  3.03|1.82|1.77  3.00|1.30|1.27  3.00|0.99|1.06  3.05|0.81|0.87  2.93|0.71|0.78  2.93|0.63|0.70  3.08|0.59|0.65  3.14|0.56|0.63  2.97|0.50|0.59  2.97|0.48|0.56  2.96|0.45|0.54  2.96|0.43|0.51  2.96|0.43|0.51  2.96|0.40|0.48  2.96|0.39|0.47  2.94|0.19|0.29  2.93|0.18|0.28
39      3.27|3.63|3.38  3.22|1.85|1.82  3.08|1.33|1.29  3.07|1.02|1.05  3.06|0.86|0.90  3.04|0.73|0.82  3.06|0.66|0.73  3.06|0.60|0.67  3.05|0.55|0.62  3.04|0.52|0.58  3.04|0.48|0.59  3.14|0.46|0.54  3.04|0.44|0.52  3.04|0.43|0.52  3.08|0.40|0.49  3.03|0.40|0.50  3.13|0.18|0.28  2.98|0.18|0.28
40      3.29|3.66|3.39  3.12|1.88|1.88  3.14|1.32|1.34  3.16|1.03|1.06  3.10|0.86|0.91  3.09|0.74|0.81  3.09|0.66|0.72  3.09|0.60|0.68  3.17|0.56|0.63  3.10|0.51|0.60  3.09|0.48|0.56  3.08|0.45|0.54  3.09|0.44|0.52  3.27|0.44|0.50  3.08|0.40|0.49  3.07|0.40|0.48  3.05|0.18|0.28  3.05|0.18|0.28
41      3.41|3.73|3.48  3.21|1.92|1.88  3.20|1.35|1.33  3.18|1.05|1.07  3.18|0.86|0.92  3.16|0.78|0.86  3.22|0.70|0.73  3.17|0.62|0.69  3.22|0.56|0.64  3.18|0.53|0.63  3.21|0.49|0.58  3.21|0.46|0.55  3.16|0.44|0.55  3.25|0.43|0.51  3.15|0.41|0.49  3.15|0.40|0.51  3.31|0.19|0.28  3.13|0.18|0.28
42      3.45|3.80|3.69  3.35|1.94|1.90  3.25|1.35|1.42  3.31|1.06|1.09  3.40|0.88|0.93  3.22|0.76|0.84  3.22|0.67|0.73  3.23|0.64|0.68  3.49|0.58|0.66  3.31|0.53|0.61  3.29|0.51|0.59  3.28|0.47|0.56  3.29|0.45|0.54  3.30|0.46|0.54  3.32|0.42|0.50  3.28|0.40|0.49  3.27|0.19|0.29  3.26|0.18|0.28
43      3.53|3.79|3.58  3.30|2.01|1.91  3.26|1.38|1.35  3.25|1.05|1.09  3.25|0.88|0.96  3.29|0.79|0.85  3.29|0.69|0.75  3.26|0.62|0.73  3.33|0.58|0.65  3.31|0.53|0.61  3.30|0.50|0.61  3.42|0.47|0.56  3.29|0.46|0.54  3.26|0.44|0.52  3.37|0.41|0.50  3.26|0.41|0.52  3.27|0.19|0.28  3.29|0.18|0.28
44      3.61|3.98|3.70  3.44|2.03|1.99  3.53|1.39|1.49  3.60|1.12|1.14  3.41|0.92|0.97  3.41|0.79|0.87  3.39|0.71|0.77  3.41|0.65|0.72  3.63|0.59|0.66  3.39|0.54|0.62  3.35|0.51|0.59  3.40|0.49|0.57  3.39|0.46|0.55  3.39|0.45|0.54  3.41|0.43|0.51  3.39|0.41|0.50  3.37|0.19|0.28  3.35|0.18|0.28
45      3.73|4.07|3.82  3.54|2.08|2.04  3.50|1.46|1.43  3.49|1.14|1.15  3.47|0.93|0.98  3.46|0.84|0.88  3.52|0.75|0.76  3.47|0.65|0.73  3.48|0.60|0.67  3.46|0.55|0.65  3.48|0.52|0.63  3.65|0.50|0.58  3.45|0.47|0.57  3.51|0.45|0.53  3.44|0.43|0.52  3.45|0.42|0.50  3.53|0.19|0.28  3.43|0.18|0.28
46      3.73|4.15|3.91  3.57|2.09|2.05  3.54|1.43|1.45  3.58|1.13|1.18  3.62|0.93|0.99  3.51|0.81|0.88  3.50|0.72|0.79  3.67|0.67|0.73  3.49|0.61|0.69  3.52|0.55|0.64  3.50|0.54|0.61  3.47|0.49|0.57  3.46|0.46|0.55  3.45|0.47|0.54  3.48|0.43|0.51  3.45|0.41|0.50  3.43|0.18|0.29  3.43|0.18|0.28
47      3.76|4.27|4.04  3.71|2.18|2.06  3.61|1.46|1.45  3.54|1.13|1.17  3.53|0.97|1.01  3.54|0.80|0.90  3.56|0.73|0.79  3.56|0.66|0.76  3.59|0.60|0.68  3.52|0.55|0.64  3.52|0.53|0.64  3.63|0.50|0.58  3.52|0.47|0.55  3.52|0.46|0.54  3.61|0.43|0.52  3.52|0.42|0.51  3.62|0.19|0.28  3.50|0.18|0.28
48      3.90|4.24|3.96  3.69|2.17|2.24  3.85|1.52|1.66  3.82|1.18|1.20  3.68|0.96|1.02  3.65|0.83|0.91  3.65|0.74|0.80  3.65|0.67|0.74  3.91|0.64|0.71  3.69|0.56|0.65  3.65|0.54|0.61  3.64|0.52|0.59  3.64|0.48|0.56  3.64|0.48|0.55  3.69|0.44|0.53  3.65|0.43|0.51  3.62|0.18|0.28  3.62|0.18|0.28
49      4.02|4.29|4.03  3.76|2.18|2.13  3.71|1.51|1.51  3.70|1.20|1.20  3.68|0.97|1.02  3.68|0.87|0.92  3.71|0.78|0.80  3.67|0.68|0.75  3.72|0.61|0.70  3.67|0.58|0.68  3.72|0.54|0.65  3.75|0.51|0.60  3.68|0.48|0.59  3.75|0.47|0.54  3.66|0.44|0.53  3.66|0.43|0.53  3.88|0.19|0.28  3.68|0.18|0.28
50      4.04|4.42|4.22  3.88|2.23|2.20  3.84|1.55|1.55  3.88|1.21|1.29  3.79|0.99|1.05  3.80|0.86|0.94  3.80|0.77|0.83  3.97|0.69|0.76  3.98|0.63|0.72  3.82|0.58|0.67  3.83|0.56|0.64  3.80|0.53|0.61  3.80|0.49|0.57  3.79|0.50|0.57  3.85|0.45|0.54  3.80|0.43|0.53  3.76|0.19|0.29  3.74|0.18|0.28
51      4.17|4.51|4.23  3.96|2.25|2.24  3.91|1.59|1.57  3.90|1.22|1.26  3.89|1.03|1.07  3.86|0.90|0.99  3.98|0.78|0.84  3.88|0.70|0.79  3.89|0.63|0.71  3.86|0.59|0.68  3.86|0.56|0.67  4.07|0.55|0.62  3.85|0.50|0.58  3.86|0.48|0.59  3.95|0.45|0.54  3.85|0.44|0.53  3.98|0.19|0.29  3.84|0.18|0.28
52      4.22|4.57|4.28  4.03|2.31|2.34  4.05|1.60|1.69  4.07|1.26|1.29  3.98|1.03|1.09  3.97|0.89|0.96  3.98|0.79|0.86  3.97|0.72|0.78  3.98|0.65|0.74  3.99|0.60|0.69  3.97|0.57|0.65  3.96|0.54|0.62  3.96|0.51|0.59  3.96|0.51|0.60  4.08|0.46|0.55  3.97|0.45|0.54  3.93|0.19|0.28  3.93|0.18|0.28
53      4.32|4.56|4.38  4.09|2.36|2.31  4.05|1.62|1.62  4.03|1.28|1.30  4.00|1.03|1.09  4.00|0.90|0.98  4.06|0.80|0.85  4.02|0.72|0.79  4.00|0.65|0.73  4.01|0.61|0.72  4.05|0.57|0.68  4.08|0.54|0.63  4.01|0.51|0.59  4.14|0.49|0.57  4.00|0.47|0.55  4.00|0.46|0.54  4.18|0.19|0.29  3.97|0.18|0.28
54      4.34|4.70|4.52  4.18|2.37|2.33  4.11|1.63|1.64  4.32|1.30|1.32  4.21|1.05|1.10  4.08|0.91|0.96  4.07|0.80|0.87  4.27|0.75|0.80  4.05|0.67|0.75  4.10|0.62|0.70  4.08|0.59|0.66  4.06|0.54|0.63  4.06|0.51|0.60  4.07|0.52|0.59  4.10|0.47|0.56  4.07|0.46|0.54  4.04|0.19|0.29  4.03|0.18|0.28
55      4.51|4.78|4.47  4.22|2.39|2.35  4.17|1.65|1.66  4.19|1.28|1.32  4.17|1.06|1.14  4.19|0.90|1.04  4.19|0.82|0.89  4.19|0.75|0.85  4.27|0.66|0.74  4.16|0.63|0.71  4.18|0.58|0.67  4.39|0.58|0.65  4.17|0.52|0.61  4.17|0.51|0.58  4.35|0.48|0.56  4.15|0.47|0.55  4.26|0.19|0.29  4.15|0.18|0.29
56      4.52|4.84|4.55  4.29|2.46|2.44  4.36|1.68|1.69  4.34|1.32|1.36  4.27|1.09|1.14  4.25|0.93|1.01  4.25|0.83|0.89  4.24|0.75|0.82  4.24|0.68|0.75  4.22|0.63|0.71  4.20|0.58|0.67  4.20|0.55|0.63  4.18|0.52|0.60  4.19|0.49|0.59  4.24|0.48|0.56  4.19|0.46|0.55  4.17|0.18|0.28  4.17|0.18|0.28
57      4.57|5.03|4.77  4.39|2.46|2.43  4.30|1.71|1.69  4.30|1.35|1.36  4.28|1.09|1.14  4.26|0.93|1.03  4.31|0.84|0.91  4.32|0.77|0.83  4.32|0.69|0.77  4.30|0.64|0.75  4.43|0.58|0.67  4.39|0.56|0.64  4.27|0.53|0.61  4.41|0.50|0.58  4.24|0.49|0.58  4.32|0.46|0.56  4.51|0.19|0.29  4.29|0.18|0.28
58      4.64|5.01|4.79  4.46|2.52|2.48  4.40|1.75|1.74  4.55|1.34|1.40  4.36|1.11|1.16  4.38|0.95|1.03  4.38|0.85|0.91  4.59|0.78|0.84  4.36|0.73|0.81  4.56|0.65|0.74  4.44|0.62|0.71  4.43|0.57|0.66  4.46|0.55|0.62  4.44|0.51|0.62  4.48|0.50|0.59  4.42|0.47|0.57  4.39|0.19|0.29  4.37|0.18|0.28
59      4.69|5.14|4.79  4.51|2.61|2.48  4.44|1.77|1.76  4.44|1.35|1.40  4.42|1.12|1.19  4.42|0.96|1.08  4.49|0.86|0.92  4.43|0.78|0.87  4.48|0.69|0.77  4.40|0.64|0.73  4.41|0.60|0.69  4.51|0.57|0.65  4.42|0.54|0.62  4.42|0.51|0.60  4.55|0.49|0.58  4.50|0.49|0.57  4.40|0.19|0.30  4.37|0.18|0.28
60      4.82|5.14|4.83  4.57|2.60|2.65  4.62|1.78|1.79  4.65|1.39|1.44  4.56|1.15|1.19  4.53|0.98|1.06  4.53|0.86|0.92  4.53|0.79|0.86  4.55|0.71|0.79  4.58|0.67|0.75  4.54|0.61|0.70  4.54|0.58|0.66  4.53|0.55|0.63  4.52|0.54|0.62  4.56|0.51|0.59  4.53|0.49|0.57  4.51|0.19|0.28  4.49|0.18|0.28
61      4.90|5.15|4.98  4.68|2.65|2.61  4.64|1.83|1.82  4.63|1.44|1.45  4.59|1.15|1.21  4.60|1.03|1.12  4.76|0.88|0.92  4.60|0.80|0.87  4.61|0.71|0.79  4.59|0.67|0.78  4.69|0.62|0.74  4.67|0.59|0.67  4.60|0.56|0.64  4.78|0.54|0.61  4.58|0.51|0.59  4.58|0.48|0.58  4.78|0.19|0.29  4.58|0.18|0.28
62      4.93|5.31|5.12  4.75|2.65|2.62  4.70|1.82|1.91  4.72|1.42|1.47  4.67|1.17|1.22  4.67|1.00|1.08  4.66|0.88|0.95  4.88|0.82|0.87  4.68|0.73|0.81  4.68|0.68|0.76  4.67|0.64|0.72  4.64|0.59|0.68  4.65|0.56|0.64  4.65|0.55|0.65  4.79|0.51|0.60  4.68|0.50|0.58  4.62|0.19|0.29  4.61|0.18|0.28
63      5.02|5.25|5.11  4.85|2.69|2.67  4.79|1.85|1.87  4.78|1.43|1.48  4.77|1.23|1.26  4.74|1.05|1.11  4.83|0.89|0.97  4.78|0.82|0.91  4.82|0.73|0.81  4.76|0.69|0.77  4.75|0.63|0.76  4.93|0.60|0.68  4.78|0.57|0.65  4.75|0.54|0.62  4.91|0.52|0.60  4.73|0.49|0.59  4.69|0.19|0.28  4.68|0.18|0.28
64      5.10| na |5.10  4.81| na |2.79  4.90| na |1.96  5.06| na |1.50  4.84| na |1.25  4.82| na |1.11  4.80| na |0.96  4.80| na |0.89  5.06| na |0.82  4.83| na |0.78  4.83| na |0.72  4.80| na |0.69  4.79| na |0.66  4.79| na |0.64  4.83| na |0.61  4.80| na |0.59  4.76| na |0.29  4.76| na |0.28

vneiger · 2025-10-27T16:18:31Z

On my list to finalize this:

- add documentation
- see if memory consumption of the Shoup variant can be reduced to 2*len instead of 4*len
- add case for doing the naive approach when len == 1
- check performance a bit more carefully on a few machines, and add some logic in the main function to refine the choice of variant depending on length/bitsize (notably for very small lengths and very small bitsize)
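The "Shoup variant" presumably refers to modular multiplication by a fixed operand with a precomputed quotient, which replaces the division in each product by a multiplication and a shift, at the price of storing one extra word per fixed operand. The sketch below shows that technique in isolation, assuming a modulus below 2^32 and plain `uint64_t` arithmetic. How the PR uses it inside the inversion loop, and how it lays out its temporary storage, is not shown here. The names `shoup_precomp` and `shoup_mulmod` are hypothetical, not FLINT functions.

```c
#include <assert.h>
#include <stdint.h>

/* precomputation for a fixed multiplier w: floor(w * 2^32 / n);
   assumes w < n < 2^32 */
static uint64_t shoup_precomp(uint64_t w, uint64_t n)
{
    assert(w < n && n < (UINT64_C(1) << 32));
    return (w << 32) / n;
}

/* a*w mod n for a < 2^32, using the precomputed w_pre = shoup_precomp(w, n) */
static uint64_t shoup_mulmod(uint64_t a, uint64_t w, uint64_t w_pre, uint64_t n)
{
    uint64_t q = (w_pre * a) >> 32; /* quotient estimate: exact or one too small */
    uint64_t r = w * a - q * n;     /* therefore 0 <= r < 2n */
    return (r >= n) ? r - n : r;    /* single conditional correction */
}
```

The extra word `w_pre` per multiplier is the kind of storage overhead that a precomputation-based variant has to budget for.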

vneiger · 2025-10-27T21:32:39Z

This PR is ready for review. Performance on three machines is attached. On recent-ish machines, the non-naive approach is almost always worthwhile.

On older machines with a slower integer division, the gain is less significant, and the naive approach would actually be faster for moduli of very small bitsizes (up to bitsize 5 or so). This was not enough to convince me that having thresholds for this could be useful, but this can be discussed.

profile.txt

albinahlback · 2025-10-27T22:16:10Z

> On older machines with a slower integer division, the gain is less significant, and the naive approach would actually be faster for moduli of very small bitsizes (up to bitsize 5 or so). This was not enough to convince me that having thresholds for this could be useful, but this can be discussed.

Cool! I think it is okay, Cascade Lake is 6 years old anyway.


fredrik-johansson · 2025-10-28T12:22:50Z

This is a nice speedup. Do you have an application?

I know some functions where we need to construct [1, 1/2, 1/3, 1/4, ...], but this is a special case where one should be able to do a bit better than a general algorithm (there is a slightly less naive _gr_nmod_vec_reciprocals, which isn't really optimized).
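As an illustration of why consecutive reciprocals are a special case: for a prime modulus p, writing p = q*i + r with q = p / i and r = p % i gives 1/i = -q * (1/r) mod p, so each reciprocal costs one multiplication and no inversion. The sketch below assumes a prime modulus below 2^32 and is not taken from `_gr_nmod_vec_reciprocals`. The name `reciprocals_sketch` is hypothetical, and the output is indexed so that `inv[i]` holds 1/i.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* inv[i] = 1/i mod p for 1 <= i < len; inv[0] is unused and set to 0.
   Assumes p is prime, p < 2^32 and len <= p. */
static void reciprocals_sketch(uint64_t *inv, size_t len, uint64_t p)
{
    assert(p < (UINT64_C(1) << 32) && len <= p);
    if (len > 0)
        inv[0] = 0;
    if (len > 1)
        inv[1] = 1;
    for (size_t i = 2; i < len; i++)
    {
        /* p = q*i + r with 0 < r < i, hence 1/i = -q * (1/r) mod p;
           p - q is the representative of -q in [1, p) */
        inv[i] = (p - p / i) * inv[p % i] % p;
    }
}
```

The recurrence relies on `p % i` being nonzero for every 1 < i < p, which holds for prime p; a composite modulus would need a different approach.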

vneiger · 2025-10-28T13:27:08Z

> This is a nice speedup. Do you have an application?

This appears in functions for Cauchy / Cauchy-like matrices, but I'm not sure about any plans for such structured linear algebra being in FLINT in the near future.

Recently I needed this when writing draft code for rational reconstruction, which I would like to add to FLINT in the reasonably near future. More precisely, this was the Cauchy interpolation case of rational reconstruction (as in [von zur Gathen and Gerhard, Section 5.8]), where I had to invert a bunch of evaluations of a polynomial. This might also be useful more generally for related algorithms that compute with 2 x 2 univariate polynomial matrices, like the half-gcd or Padé approximation, when multiplications are done through FFT evaluation-interpolation.

edgarcosta · 2025-10-28T13:43:04Z

This is also useful for BSGS algorithms. Many times one needs to patch the output of the main loop with an inverse coming from a sequence known a priori.

vneiger · 2025-10-28T14:01:41Z

> This is also useful for BSGS algorithms. Many times one needs to patch the output of the main loop with an inverse coming from a sequence known a priori.

Thanks, that's good to know.

I don't have more to add to this PR, and regarding the initial PR message, I am not sure about adding a function like

void nmod_inv2(ulong * inv_a1, ulong * inv_a2, ulong a1, ulong a2, nmod_t mod)

because I don't have a use case in mind. Unless someone has suggestions, for me this is ready for merge.

fredrik-johansson · 2025-10-28T17:38:40Z

Thanks!

first version of nmod_vec_invert, with profile and tests

31533cb

vneiger added 3 commits October 27, 2025 17:20

two small fixes

65487ea

- add documentation

8790d47

- reduce length of temporary vector to 3*n instead of 4*n - check if len == 1 and do naive approach in that case

remove fixme

858cd21

vneiger marked this pull request as ready for review October 27, 2025 21:32

albinahlback reviewed Oct 27, 2025

doc/source/nmod_vec.rst (outdated, resolved)

src/nmod_vec/test/t-invert.c (resolved)

vneiger and others added 3 commits October 27, 2025 23:57

Update doc/source/nmod_vec.rst

dc1772a

Co-authored-by: Albin Ahlbäck <albin.ahlback@gmail.com>

fix copyright

3f99421

Merge branch 'add_nmod_vec_inv' of github.com:vneiger/flint into add_…

504023d

…nmod_vec_inv

vneiger added workshop 2025v2 performance new feature labels Oct 28, 2025

vneiger merged commit c6147f4 into flintlib:main Oct 28, 2025
14 checks passed

vneiger mentioned this pull request Oct 30, 2025

Port PML's geometric evaluation / interpolation integration for univariate polynomial #2449

Merged

vneiger deleted the add_nmod_vec_inv branch November 4, 2025 00:00

vneiger merged 7 commits into flintlib:main from vneiger:add_nmod_vec_inv
