add docstrings and tests, bump version number
m-stclair committed May 13, 2024
1 parent 5f18b6b commit de61e55
Showing 9 changed files with 216 additions and 71 deletions.
73 changes: 42 additions & 31 deletions README.md
@@ -10,7 +10,7 @@ parameters) to ~100x (complicated functions, some parameter tuning).
Install from source using `pip install .`. Dependencies are also described
in a Conda `environment.yml` file.

Examples and tips follow. Further documentation forthcoming.
The minimum supported version of Python is *3.11*.

## example of use

@@ -34,36 +34,8 @@ approx runtime:
325 µs ± 3.89 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
```

## limitations

* `quickseries` only works for functions ℝ<sup>_n_</sup>🡒ℝ for finite _n_. In
programming terms, this means it will only produce functions that accept a
fixed number of floating-point or integer arguments (which may be 'arraylike'
objects such as pandas `Series` or numpy `ndarrays`) and return a single
floating-point value (or a 1-D floating-point array if passed arraylike
arguments).
* `quickseries` only works consistently on functions that are continuous and
infinitely differentiable within the domain of interest. Specifically, they
should not have singularities, discontinuities, or infinite / undefined
values at `point` or within `bounds`. Failure cases differ:
* `quickseries` will always fail on functions that are infinite/undefined
at `point`, like `quickseries("ln(x)", point=-1)`.
* It will almost always fail on functions with a largeish interval of
infinite/undefined values within `bounds`, such as
`quickseries("gamma(x)", bounds=(-1.1, 0), point=-0.5)`.
* It will usually succeed but produce bad results on functions with
singularities or point discontinuities within `bounds` or
near `point` but not at `point`, such as `quickseries("tan(x)", bounds=(1, 2))`.
* It will often succeed, but usually produce bad results, on univariate
functions that are continuous but not differentiable at `point`, such as
`quickseries("abs(sin(x))", point=0)`. It will always fail on multivariate
functions of this kind.
* Functions given to `quickseries` must be expressed in strict closed form
and include only finite terms. They cannot contain limits, integrals,
derivatives, summations, continued fractions, etc.
* `quickseries` is not guaranteed to work for all such functions.

## tips
## usage notes

* Multivariate `quickseries()`-generated functions always map positional arguments
to variables in the string representation of the input function in alphanumeric
@@ -209,4 +181,43 @@ install it with your preferred package manager.
* If `jit=True`, `quickseries` does _not_ do this by default. The `numba`
compiler implicitly performs a similar optimization, and computing these
terms explicitly tends to be counterproductive. If you want `quickseries`
to do it anyway, you can pass `prefactor=True`.
to do it anyway, you can pass `prefactor=True`.


## tips


## limitations

* `quickseries` only works for functions ℝ<sup>_n_</sup>🡒ℝ for finite _n_. In
programming terms, this means it will only produce functions that accept a
fixed number of floating-point or integer arguments (which may be 'arraylike'
objects such as pandas `Series` or numpy `ndarrays`) and return a single
floating-point value (or a 1-D floating-point array if passed arraylike
arguments).
* `quickseries` only works consistently on functions that are continuous and
infinitely differentiable within the domain of interest. Specifically, they
should not have singularities, discontinuities, or infinite / undefined
values at `point` or within `bounds`. Failure cases differ:
* `quickseries` will always fail on functions that are infinite/undefined
at `point`, like `quickseries("ln(x)", point=-1)`.
* It will almost always fail on functions with a largeish interval of
infinite/undefined values within `bounds`, such as
`quickseries("gamma(x)", bounds=(-1.1, 0), point=-0.5)`.
* It will usually succeed but produce bad results on functions with
singularities or point discontinuities within `bounds` or
near `point` but not at `point`, such as `quickseries("tan(x)", bounds=(1, 2))`.
* It will often succeed, but usually produce bad results, on univariate
functions that are continuous but not differentiable at `point`, such as
`quickseries("abs(sin(x))", point=0)`. It will always fail on multivariate
functions of this kind.
* Functions given to `quickseries` must be expressed in strict closed form
and include only finite terms. They cannot contain limits, integrals,
derivatives, summations, continued fractions, etc.
* `quickseries` is not guaranteed to work for all such functions.
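
The singularity caveat above can be illustrated without `quickseries` itself. The following is a minimal sketch, assuming a plain truncated Taylor polynomial built with `sympy` (this is not the library's own fitting pipeline, and the names `taylor`, `grid`, and `near` are made up for the illustration). It expands `tan(x)` about `1.5` and compares it to the exact function across `(1, 2)`, an interval that contains the pole at π/2.

```python
import numpy as np
import sympy as sp

x = sp.Symbol("x")
func, point, bounds, nterms = sp.tan(x), 1.5, (1.0, 2.0), 6

# Build a truncated Taylor polynomial term by term (illustrative only;
# quickseries may construct and tune its approximation differently).
taylor, deriv = sp.Integer(0), func
for k in range(nterms):
    taylor += deriv.subs(x, point) / sp.factorial(k) * (x - point) ** k
    deriv = sp.diff(deriv, x)

exact = sp.lambdify(x, func, "numpy")
approx = sp.lambdify(x, taylor, "numpy")

grid = np.linspace(*bounds, 101)
error = np.abs(approx(grid) - exact(grid))
near = np.abs(grid - point) < 0.015  # grid points right next to `point`

print("max error near point: ", error[near].max())
print("max error over bounds:", error.max())
```

The polynomial tracks `tan(x)` closely right next to the expansion point, but the error over the full interval is very large, because no polynomial can follow the function through the pole.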

## tests

`quickseries` has a few simple tests. You can run them by executing `pytest`
in the repository's root directory. More comprehensive test coverage is
planned.
2 changes: 1 addition & 1 deletion quickseries/__init__.py
@@ -1,4 +1,4 @@
from quickseries.approximate import quickseries
from quickseries.benchmark import benchmark

__version__ = "0.2.0"
__version__ = "0.2.1"
62 changes: 40 additions & 22 deletions quickseries/approximate.py
@@ -1,7 +1,7 @@
import re
from inspect import getfullargspec, signature
from itertools import chain
from typing import Literal, Optional, Sequence, Union
from typing import Literal, Optional, Sequence, Union, Collection

import numpy as np
import sympy as sp
@@ -31,7 +31,11 @@ def regexponents(text: str) -> tuple[int]:
return tuple(map(int, re.findall(EXP_PATTERN, text)))


def _decompose(remaining, reduced, replacements):
def _decompose(
remaining: tuple[str],
reduced: set[str],
replacements: list[tuple[int, list[int]]]
) -> bool:
if len(remaining) == 1: # trivial case
replacements[0][1][:] = [1 for _ in range(replacements[0][0])]
return True
@@ -154,7 +158,9 @@ def rewrite(
return "\n ".join(lines)


def _rewrite_precomputed(polyexpr, free):
def _rewrite_precomputed(
polyexpr: str, free: Collection[str]
) -> tuple[str, list[str]]:
# replacements: what factors we will decompose each exponent into
# free: which factors we will define as variables, and their
# "building blocks"
@@ -178,15 +184,23 @@ def _rewrite_precomputed(polyexpr, free):
return polyexpr, factorlines


def _pvec(bounds, offset_resolution):
def _pvec(
bounds: Sequence[tuple[float, float]], offset_resolution: int
) -> list[np.ndarray]:
axes = [np.linspace(*b, offset_resolution) for b in bounds]
indices = map(np.ravel, np.indices([offset_resolution for _ in bounds]))
return [j[i] for j, i in zip(axes, indices)]
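
To make the behavior of `_pvec` concrete, here is a standalone sketch that copies the three-line body shown above into a function named `pvec` (the name and the sample bounds are chosen for this example only). It appears to build the flattened coordinates of a regular grid over `bounds`, one 1-D array per variable.

```python
import numpy as np


def pvec(bounds, offset_resolution):
    # One evenly spaced axis per variable.
    axes = [np.linspace(*b, offset_resolution) for b in bounds]
    # Flattened integer grid indices, one array per variable.
    indices = map(np.ravel, np.indices([offset_resolution for _ in bounds]))
    # Index each axis by its flattened grid indices.
    return [j[i] for j, i in zip(axes, indices)]


xs, ys = pvec([(0.0, 1.0), (10.0, 20.0)], 3)
print(xs)  # the first variable changes slowest
print(ys)  # the last variable changes fastest
```

With two variables and a resolution of 3, each returned array has 9 elements, and pairing them elementwise enumerates every grid point once.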


def _perform_series_fit(
func, bounds, nterms, fitres, point, apply_bounds, is_poly
):
func: str | sp.Expr,
bounds: tuple[float, float] | Sequence[tuple[float, float]],
nterms: int,
fitres: int,
point: float | Sequence[float],
apply_bounds: bool,
is_poly: bool
) -> tuple[sp.Expr, np.ndarray]:
if (len(bounds) == 1) and (is_poly is False):
approx, expr = series_lambda(func, point[0], nterms, True)
else:
@@ -212,29 +226,33 @@ def _perform_series_fit(
return expr, params


def _makebounds(bounds, n_free, x0):
def _makebounds(
bounds: Optional[Sequence[tuple[float, float]] | tuple[float, float]],
n_free: int,
point: Optional[Sequence[float] | float]
) -> tuple[list[tuple[float, float]], list[float]]:
bounds = (-1, 1) if bounds is None else bounds
if not isinstance(bounds[0], (list, tuple)):
bounds = [bounds for _ in range(n_free)]
if x0 is None:
x0 = [np.mean(b) for b in bounds]
elif not isinstance(x0, (list, tuple)):
x0 = [x0 for _ in bounds]
return bounds, x0
if point is None:
point = [np.mean(b) for b in bounds]
elif not isinstance(point, (list, tuple)):
point = [point for _ in bounds]
return bounds, point
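
The normalization performed by `_makebounds` can be checked in isolation. The sketch below reproduces the post-commit body under the name `makebounds` (a name chosen for this example) and shows how missing or scalar arguments are expanded to one entry per free variable.

```python
import numpy as np


def makebounds(bounds, n_free, point):
    # Default to (-1, 1) when no bounds are given.
    bounds = (-1, 1) if bounds is None else bounds
    # A single (low, high) pair is repeated for every free variable.
    if not isinstance(bounds[0], (list, tuple)):
        bounds = [bounds for _ in range(n_free)]
    # With no point given, expand about the midpoint of each interval.
    if point is None:
        point = [np.mean(b) for b in bounds]
    # A scalar point is repeated for every variable.
    elif not isinstance(point, (list, tuple)):
        point = [point for _ in bounds]
    return bounds, point


print(makebounds(None, 2, None))
print(makebounds((0, 4), 1, None))
print(makebounds([(0, 2), (1, 3)], 2, 5))
```

Note that the `isinstance` checks only recognize `list` and `tuple`, so, in this sketch at least, a numpy array passed as `point` would be treated as a scalar and repeated.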


def _make_quickseries(
approx_poly,
bound_series_fit,
bounds,
approx_poly: bool,
bound_series_fit: bool,
bounds: Optional[Sequence[tuple[float, float]] | tuple[float, float]],
expr: sp.Expr,
fit_series_expansion,
fitres,
nterms,
point,
precision,
prefactor,
):
fit_series_expansion: bool,
fitres: int,
nterms: int,
point: Optional[Sequence[float] | float],
precision: Optional[Literal[16, 32, 64]],
prefactor: bool,
) -> dict[str, sp.Expr | np.ndarray | str]:
if len(expr.free_symbols) == 0:
raise ValueError("func must have at least one free variable.")
free = sorted(expr.free_symbols, key=lambda s: str(s))
20 changes: 10 additions & 10 deletions quickseries/benchmark.py
@@ -2,25 +2,25 @@
from inspect import getfullargspec
from itertools import product
from time import time
from typing import Union
from typing import Union, Sequence, Optional

import numpy as np
import sympy as sp
from dustgoggles.func import gmap

from quickseries import quickseries
from quickseries.approximate import _makebounds
from quickseries.sputils import lambdify
from quickseries.sputils import lambdify, LmSig


def _offset_check_cycle(
absdiff,
frange,
lamb,
quick,
vecs,
worstpoint,
):
absdiff: float,
frange: tuple[float, float],
lamb: LmSig,
quick: LmSig,
vecs: Sequence[np.ndarray],
worstpoint: Optional[list[float]],
) -> tuple[float, float, float, tuple[float, float], list[float]]:
approx_y, orig_y = quick(*vecs), lamb(*vecs)
frange = (min(orig_y.min(), frange[0]), max(orig_y.max(), frange[1]))
offset = abs(approx_y - orig_y)
@@ -39,7 +39,7 @@ def benchmark(
testbounds="equal",
cache: bool = False,
**quickkwargs
):
) -> dict[str, sp.Expr | float | np.ndarray | str | list[float]]:
lamb = lambdify(func)
compile_start = time()
quick, ext = quickseries(
4 changes: 2 additions & 2 deletions quickseries/expansions.py
@@ -110,8 +110,8 @@ def multivariate_taylor(
[(x - a) ** i for x, a, i in zip(argsyms, pointsyms, ixsyms)]
)
taylor = deriv / fact * err
# TODO, probably: there's a considerably faster way to do this in some cases
# by precomputing partial derivatives
# TODO, probably: there's a considerably faster way to do this in some
# cases by precomputing partial derivatives
decomp = additive_combinations(dimensionality, nterms - 1)
built = reduce(
sp.Add,
9 changes: 7 additions & 2 deletions quickseries/simplefit.py
@@ -7,7 +7,11 @@
from scipy.optimize import curve_fit


def fit_wrap(func, dimensionality, fit_parameters):
def fit_wrap(
func: Callable[[np.ndarray | float, ...], np.ndarray | float],
dimensionality: int,
fit_parameters: Sequence[str]
) -> Callable[[np.ndarray | float, ...], np.ndarray | float]:
@wraps(func)
def wrapped_fit(independent_variable, *params):
variable_components = [
@@ -34,7 +38,7 @@ def fit(
bounds: Optional[
Union[tuple[tuple[float, float]], tuple[float, float]]
] = None
):
) -> tuple[np.ndarray, np.ndarray]:
sig = signature(func)
assert len(vecs) < len(sig.parameters), (
"The model function must have at least one 'free' "
@@ -50,6 +54,7 @@ def fit(
raise ValueError("each input vector must be 1-dimensional")
# TODO: optional goodness-of-fit evaluation
kw = {'bounds': bounds} if bounds is not None else {}
# noinspection PyTypeChecker
return curve_fit(
fit_wrap(func, len(vecs), fit_parameters),
vecs,
2 changes: 1 addition & 1 deletion quickseries/sputils.py
@@ -3,7 +3,7 @@
import numpy as np
import sympy as sp

LmSig = Callable[[Any], Union[np.ndarray, float]]
LmSig = Callable[[np.ndarray | float, ...], np.ndarray | float]


def lambdify(