Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize lagrange_poly() #432

Merged
merged 6 commits into from
Nov 10, 2022
Merged

Optimize lagrange_poly() #432

merged 6 commits into from
Nov 10, 2022

Conversation

mhostetter
Copy link
Owner

Fixes #427.

@Popeyef5 I don't think it is possible to significantly beat the "vanilla" implementation you have for your case in #427 of a very large prime. The reason is that I'm doing everything you're doing (and in the same way, ie in pure Python). But I'm also doing more -- argument verification, bounds checking in FieldArray, and other overhead when creating the FieldArray and Poly classes -- that I can't remove. However, I should be able to come close to the "vanilla" implementation when ufunc_mode="python-calculate".

Now in cases where I can use Numba and JIT compile (ie ufunc_mode="jit-lookup" or ufunc_mode="jit-calculate"), I do outperform the "vanilla" version.

I agree galois is not currently close in your use case. I found a few issues:

Below are some performance examples.

Note lagrange_poly() is the old version and lagrange_poly2() is the new version. Once done testing, the old will be replaced with the new.

Your example modified `test_427.py`
import galois
import numpy as np


def mod_inv(x, p):
  x = x % p
  t = 0
  nt = 1
  r = p
  nr = x
  while nr != 0:
    q = r // nr
    t, nt = nt, t - q * nt
    r, nr = nr, r - q * nr
  if r != 1:
    raise ValeError("number has no inverse")
  return t % p


def polydivmod(u, v, p):
  m = len(u)
  n = len(v)
  s = mod_inv(v[0], p)
  q = np.array([0]*max(m-n+1, 1), dtype=object)
  r = np.copy(u)

  for i in range(m - n +1):
    d = s * r[i] % p
    q[i] = d
    r[i:i+n] = np.mod(r[i:i+n] - v * d, p)

  first = 0
  for i in range(len(r)-1):
    if r[i] != 0:
      break
    first += 1

  r = r[first:]

  return q, r


class fpoly1d:

  def __init__(self, c, prime):
    self.prime = prime

    if isinstance(c, fpoly1d):
      self.coeffs = c.coeffs
      return

    c = np.atleast_1d(c).astype(object)
    c = np.trim_zeros(c, 'f')
    self.coeffs = np.mod(c, self.prime)

  def __mul__(self, other):
    if isinstance(other, fpoly1d):
      return fpoly1d(np.convolve(self.coeffs, other.coeffs), self.prime)
    else:
      return fpoly1d(self.coeffs * other, self.prime)

  def __rmul__(self, other):
    if np.isscalar(other):
      return fpoly1d(other * self.coeffs, self.prime)
    else:
      other = fpoly1d(other, self.prime)
      return fpoly1d(np.convolve(self.coeffs, other.coeffs), self.prime)

  def __add__(self, other):
    other = fpoly1d(other, self.prime)
    return fpoly1d(np.polyadd(self.coeffs, other.coeffs), self.prime)

  def __radd__(self, other):
    other = fpoly1d(other, self.prime)
    return fpoly1d(np.polyadd(self.coeffs, other.coeffs), self.prime)

  def __div__(self, other):
    if np.isscalar(other):
      other_inv = mod_inv(other, self.prime)
      return fpoly1d(self.coeffs * other_inv, self.prime)
    q, r = polydivmod(self.coeffs, other.coeffs, self.prime)
    return fpoly1d(q, self.prime), fpoly1d(r, self.prime)

  __truediv__ = __div__

  def __rdiv__(self, other):
    if np.isscalar(other):
      other_inv = mod_inv(other, self.prime)
      return fpoly1d(self.coeffs * other_inv, self.prime)
    q, r = polydivmod(self.coeffs, other.coeffs, self.prime)
    return fpoly1d(q, self.prime), fpoly1d(r, self.prime)


def vanilla_interpolate(x, y, prime):
  # Emulating SciPy's Lagrange implementation
  poly = fpoly1d(0, prime)
  for i in np.flatnonzero(y):
    pt = fpoly1d(y[i], prime)
    for j in range(len(x)):
      if i == j:
        continue
      fac = x[i] - x[j]
      pt *= fpoly1d([1, -x[j]], prime) / fac
    poly += pt
  return poly

Example 1: JIT + lookup tables -- 50x

In [1]: %run test_427

In [2]: GF = galois.GF(13693)

In [3]: GF.ufunc_mode
Out[3]: 'jit-lookup'

In [4]: X = GF.Random(100, seed=1)

In [5]: Y = GF.Random(100, seed=2)

In [6]: %timeit p_g = galois.lagrange_poly(X, Y)
2.77 s ± 6.75 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [7]: %timeit p_g2 = galois.lagrange_poly2(X, Y)
4.64 ms ± 76.6 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [8]: %timeit p_v = vanilla_interpolate(X.view(np.ndarray).astype(object), Y.view(np.ndarray).astype(object), GF.order)
235 ms ± 3.07 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

Example 2: JIT + explicit calculation -- 43x

In [1]: %run test_427

In [2]: GF = galois.GF(2424113)

In [3]: GF.ufunc_mode
Out[3]: 'jit-calculate'

In [4]: X = GF.Random(100, seed=1)

In [5]: Y = GF.Random(100, seed=2)

In [6]: %timeit p_g = galois.lagrange_poly(X, Y)
2.55 s ± 12.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [7]: %timeit p_g2 = galois.lagrange_poly2(X, Y)
5.35 ms ± 34.4 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [8]: %timeit p_v = vanilla_interpolate(X.view(np.ndarray).astype(object), Y.view(np.ndarray).astype(object), GF.order)
254 ms ± 1.76 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

Example 3: Pure Python calculation -- 1.13x

In [1]: %run test_427

In [2]: GF = galois.GF(21888242871839275222246405745257275088696311157297823662689037894645226208583, primitive_element=3, verify=False)

In [3]: GF.ufunc_mode
Out[3]: 'python-calculate'

In [4]: X = GF.Random(100, seed=1)

In [5]: Y = GF.Random(100, seed=2)

In [6]: %timeit p_g = galois.lagrange_poly(X, Y)
3.9 s ± 6.96 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [7]: %timeit p_g2 = galois.lagrange_poly2(X, Y)
709 ms ± 8.83 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [8]: %timeit p_v = vanilla_interpolate(X.view(np.ndarray).astype(object), Y.view(np.ndarray).astype(object), GF.order)
802 ms ± 4.01 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

@mhostetter mhostetter added performance Affects speed/performance poly Related to polynomials labels Nov 8, 2022
@codecov
Copy link

codecov bot commented Nov 8, 2022

Codecov Report

Base: 97.06% // Head: 96.97% // Decreases project coverage by -0.08% ⚠️

Coverage data is based on head (65d4d75) compared to base (8b8a40e).
Patch coverage: 100.00% of modified lines in pull request are covered.

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #432      +/-   ##
==========================================
- Coverage   97.06%   96.97%   -0.09%     
==========================================
  Files          42       42              
  Lines        5346     5396      +50     
==========================================
+ Hits         5189     5233      +44     
- Misses        157      163       +6     
Impacted Files Coverage Δ
src/galois/_fields/_array.py 98.90% <100.00%> (-0.01%) ⬇️
src/galois/_polys/_dense.py 96.79% <100.00%> (-1.15%) ⬇️
src/galois/_polys/_functions.py 100.00% <100.00%> (ø)
src/galois/_polys/_poly.py 96.65% <100.00%> (ø)
src/galois/_polys/_conversions.py 98.19% <0.00%> (-0.91%) ⬇️

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

Fixes #434. This is done to be more consistent with NumPy's repr()/str() styling.
@mhostetter mhostetter merged commit e07461b into master Nov 10, 2022
@mhostetter mhostetter deleted the optimize-lagrange-poly branch November 10, 2022 01:08
@Popeyef5
Copy link

Popeyef5 commented Jan 3, 2023

Awesome, thank you for replying so quickly and sorry for the late reply. I got caight up with work and had to put the project on hold for a bit.

Regarding this, my issue is that I'm working with large prime finite fields, so JIT compilation isn't an option. I'm looking into different alternatives because it's taking prohibitely long to interpolate.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Affects speed/performance poly Related to polynomials
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Lagrange interpolation speed problem
2 participants