RFC: special function extension #725
Comments
Really happy to see the discussion get started. One point I'll add is that, of course, there are many other special functions that are widely used. Indeed, many of the ones important to me (like the Bessel functions) are not covered by the above list. Why? We started with a minimal set of functions that are easily implementable everywhere. It's not super helpful to propose a standard for a special function that other array libraries will not implement because it's too much effort. This is why the task of converting SciPy's internal special function implementations into C++, see scipy/scipy#19404, is relevant and important. |
Thanks @mdhaber! Very nice write-up. Looking forward to the discussion. |
Thanks for all the hard work on this @mdhaber, @izaid and @steppi! I'll add a few initial thoughts:
Such "not implemented" status has typically been a blocker for inclusion. For the |
Following the numbers used in #725 (comment):
|
@mdhaber Covered it all super well, but I'll chip in just a little.
As for Generally very happy to discuss! Think it's important to get this right. |
I would stay close to the Digital Library of Mathematical Functions, https://dlmf.nist.gov/6.2, and perhaps name [...] Verbosity is not an issue nowadays with Copilot and IDEs. |
Just curious, but would it make sense to copy or move |
We shouldn't move them. That would be a compatibility break with existing versions of the standard. It wouldn't be a big deal to duplicate them. There's a similar thing for some functions like |
It might also make sense to ask whether there might eventually be a "neural network" extension that reflects the functions in |
@NeilGirdhar Re: neural network extension. See #158, which you previously commented on. |
Great additions! My two cents about some of the points: +1 on supporting both lower and upper bounds of integration. The parameter convention of Python’s [...] Are [...] Where some implementations do not yet support complex arguments for a function, does it make sense to standardize real arguments first so that all implementations become compliant? Each implementation is then free to support complex arguments. Regarding the default value of |
Functions like sum default to |
This RFC proposes adding a special function extension to the array API specification.
Overview
Several array libraries have some support for "special" functions (e.g. `gamma`), that is, mathematical functions that are broadly applicable but not considered to be "elementary" (e.g. `sin`). We[^1] propose adding a `special` sub-namespace to the array API specification, which would contain a number of special functions that are already implemented by many array libraries.
Prior Art
We begin with 25 particularly important special functions that are either already available for NumPy, PyTorch, CuPy, and JAX arrays or are easily implemented. Partial information about their signatures in these libraries is included in the table below; parameters that are less commonly supported/used are omitted.
With the exception of log-sum-exp functions, which reduce along an axis, all functions work elementwise, producing an output whose shape is the broadcasted shape of the arguments. The variable names shown are not necessarily those used by the referenced library; instead they are standardized, with `x`/`z`/`n` denoting an argument of real/complex/integer dtype, respectively.
Further information about these functions in other languages (C++, Julia, Mathematica, Matlab, and R) is available in this spreadsheet.
Proposal
The Array API specification would include the following functions in a `special` sub-namespace.
log_sum_exp(z, /, *, axis=-1, weights=None)
logit(x, /)
expit(x, /)
log_normcdf(a, b=None, /)
normcdf(a, b=None, /)
normcdf_inv(p, /, *, a=None, b=None)
digamma(z, /)
polygamma(n, x)
log_multigamma(x, n)
log_abs_gamma(z, /, *, a=None, b=None, regularized=None)
gamma(z, /, *, a=None, b=None, regularized=None)
log_abs_beta(x1, x2, /, *, a=None, b=None)
beta(x1, x2, /, *, a=None, b=None)
erf(a, b=None, /)
erf_inv(p, /, *, a=None, b=None)
zeta(x1, x2=None, /)
binom(x1, x2, /)
expinti(x, /)
expintv(n, x)
softmax(z, /)
log_softmax(z, /)
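
To make a few of the listed signatures concrete, the following is a minimal scalar-only sketch using Python's standard library. It is not taken from any of the libraries mentioned; an actual implementation would operate elementwise on arrays. The numerical strategies shown (branching on sign in `expit`, using `math.lgamma` in `binom`, and restricting `binom` to `x1 >= x2 >= 0`) are assumptions made for illustration.

```python
import math


def expit(x, /):
    """Logistic sigmoid 1 / (1 + exp(-x)), written to avoid overflow in exp."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)  # x < 0 here, so exp(x) cannot overflow
    return e / (1.0 + e)


def logit(x, /):
    """Inverse of expit: log(x / (1 - x)) for 0 < x < 1."""
    return math.log(x) - math.log1p(-x)


def binom(x1, x2, /):
    """Binomial coefficient "x1 choose x2" via log-gamma.

    Assumption for this sketch: only x1 >= x2 >= 0 is handled.
    """
    if not (x1 >= x2 >= 0):
        raise ValueError("this sketch only handles x1 >= x2 >= 0")
    log_value = (
        math.lgamma(x1 + 1) - math.lgamma(x2 + 1) - math.lgamma(x1 - x2 + 1)
    )
    return math.exp(log_value)
```

Because `binom` goes through `exp` of a difference of log-gamma values, results for integer inputs are only approximately integral (for example, `binom(5, 2)` is close to, but not necessarily exactly, `10.0`).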
A few notes about the interface:
- [...] `ndtr`; we call it `normcdf` (as it is named in Matlab).
- [...] `loggamma` to compute the log of the `gamma` function, `betaln` to compute the log of the `beta` function, and `log_ndtr` to compute the log of the `ndtr` function. For consistency, we form the name of the log-version of a function by prepending `log_` to the original function name.
- [...] `erf` to evaluate a particular definite integral from `-oo` to `x` and `erfc` to evaluate the integral from `y` to `+oo` without the potential for catastrophic cancellation associated with `1 - erf(x)`. A related, unmet need is the ability to evaluate such integrals from `x` to `y` without subtraction, e.g. `erf(y) - erf(x)`. To better meet this need - and to avoid the need for separate "complementary" functions - we provide arguments that allow specification of `a` and `b` limits of integration.
- [...] `ndtri` to compute the inverse of `ndtr` and `erfinv` to compute the inverse of `erf`. For consistency, we form the name of the inverse of a function by appending `_inv` to the name.
- [...] `gamma` for the (unregularized) gamma function and `gammainc` for the regularized incomplete gamma function. In these cases, we have only one function with a keyword argument (e.g. `regularized`). In some cases, this helps to reduce duplication of similar function names and signatures; in others, it allows developers to be more explicit about which variant is being used.
Where applicable, we find that these conventions generalize well to other special functions that might be added in the future.
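
The single-function design can be sketched for `gamma` as follows. This is a scalar-only illustration using the standard library, not an implementation from any existing library. The series helper `_lower_incomplete_gamma` is hypothetical, only the upper limit `b` is supported, and resolving `regularized=None` to "regularize only when a limit is given" is just one possible rule, assumed here for illustration.

```python
import math


def _lower_incomplete_gamma(z, x, terms=200):
    """Integral of t**(z - 1) * exp(-t) from 0 to x, for z > 0 and x >= 0.

    Hypothetical helper: a plain power series, adequate only for moderate x.
    """
    if x == 0.0:
        return 0.0
    term = 1.0 / z
    total = term
    for k in range(1, terms):
        term *= x / (z + k)
        total += term
    return math.exp(z * math.log(x) - x) * total


def gamma(z, /, *, a=None, b=None, regularized=None):
    """Complete or incomplete gamma function selected by keyword arguments."""
    if regularized is None:
        # Assumed rule: regularize only when a limit of integration is given.
        regularized = a is not None or b is not None
    if a is None and b is None:
        value = math.gamma(z)  # complete gamma function
    elif a is None:
        value = _lower_incomplete_gamma(z, b)  # integral from 0 to b
    else:
        raise NotImplementedError("this sketch does not handle a lower limit")
    return value / math.gamma(z) if regularized else value
```

With this rule, `gamma(5.0)` returns the ordinary gamma function value, while `gamma(1.0, b=1.0)` returns the regularized incomplete value `1 - exp(-1)`; passing `regularized` explicitly overrides the default in either case.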
Other notes about function selection:
- [...] (`binom`) does not seem to be implemented for PyTorch, CuPy, or JAX arrays, but the need is so fundamental that we wish to include it in the standard. A moderately robust version of the function can be implemented in terms of the log of the gamma function until a more robust, custom implementation is available.
Questions / Points of Discussion:
- [...] (`z` rather than `x`) even if some libraries are not compliant initially?
- [...] `log_` and `_inv` components in the name, the order of operations is ambiguous. For example, would `log_normcdf_inv` (which would be useful in statistics) be the logarithm of the inverse of `normcdf` or the inverse of the logarithm of `normcdf`?
  - [...] `normcdf`, `normcdf_inv`, `log_normcdf`, and `log_normcdf_inv`, `normcdf` would have attributes `normcdf.log` and `normcdf.inv`, and `normcdf.log` would have an attribute `normcdf.log.inv`.
  - [...] `log_` and `inv_` to both be prefixes. However, `_inv` typically appears as a suffix in existing special function names, perhaps because the superscript [...]
  - [...] `_log` and `_inv` to be suffixes. However, `log` typically appears as a prefix in existing function names, perhaps because this is how the function appears when typeset mathematically, e.g. [...]
- [...] `range` function: it is natural for `range(y)` to denote a range with an upper limit of `y` and for `range(x, y)` to generate a range between `x` and `y`. However, if the arguments were allowed to be specified as keywords, it would be unclear how they should be named. The call `range(y)` suggests that the name of the first argument might be `stop`, but `range(x, y)` suggests that the name of the first argument should be `start`; assigning either name and allowing both positional and keyword specification leads to confusion. To avoid this ambiguity, `range` requires that the arguments be passed as positional-only. We run into a similar situation with our `a` and `b` arguments. After carefully considering many possibilities, we have suggested the following above:
  - [...] `a`/`b` require that these arguments are positional-only.
  - [...] `a`/`b` require that these arguments are keyword-only.
  [...] `a`/`b` is that they are somewhat restrictive. Users cannot call `normcdf(a=x, b=y)` with keywords to be explicit, nor can they call `gamma(z, x, y)` without keywords to be concise. A compromise would be to accept separate positional-only and keyword-only versions of the same argument, and implement logic to resolve the intended use. While this is anticipated to allow for both natural and flexible use, it would be somewhat more cumbersome to document and implement.
- [...] `regularized` argument of `gamma` is challenging to choose.
  - [...] (`gamma(z, upper=y)`) will typically be regularized, suggesting that a `regularized=True` default is more appropriate for this use case.
  - [...] (`gamma(z)`) is identically 1, suggesting that `regularized=False` is more appropriate for this use case.
  - [...] `regularized=None`. When `gamma` is used as the complete gamma function (without `a`/`b`), `regularized` would be set to `False`, and when `gamma` is used as the incomplete gamma function (with `a`/`b`), `regularized` would be set to `True`. However, this is more complex to document than choosing either `True` or `False` as the default.
- [...] `binom` are not interchangeable, suggesting that some users might prefer to pass arguments by keyword. On one hand, `n` and `k` would be reasonable names, since the binomial coefficient is often needed in situations that call for "n choose k". On the other hand, the names `n` and `k` are not entirely universal, and the function is extended for real arguments, whereas the names `n` and `k` are suggestive of integer dtypes. Also, while `a` and `b` are concise names that are commonly used for lower and upper limits of integration, they are not as descriptive as `lower`/`upper`, and might be confused with the symbols commonly used for different arguments of the same function (e.g. `beta`). `low`/`high`, `lo`/`hi`, `ll`/`ul`, `c`/`d` have also been proposed.
Footnotes
[^1]: @steppi, @izaid, @mdhaber, @rgommers
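
As a sketch of the `range`-like, positional-only convention for `a`/`b` discussed under the questions above, the following scalar-only example shows one way `normcdf(a, b=None, /)` might resolve its limits. It uses only the standard library; the helper `_upper_tail` and the choice of which tail to evaluate are assumptions made for illustration, and the approach reduces (but does not eliminate) cancellation compared with subtracting two values that are both close to 1.

```python
import math

_SQRT2 = math.sqrt(2.0)


def _upper_tail(t):
    """P(Z > t) for a standard normal Z; stays accurate for large positive t."""
    return 0.5 * math.erfc(t / _SQRT2)


def normcdf(a, b=None, /):
    """Range-like convention (assumed for this sketch).

    normcdf(y) integrates the standard normal density from -inf to y;
    normcdf(x, y) integrates it from x to y.
    """
    if b is None:
        lower, upper = -math.inf, a
    else:
        lower, upper = a, b
    if lower > 0:
        # Both limits lie in the right tail: subtract small upper-tail values.
        return _upper_tail(lower) - _upper_tail(upper)
    # Otherwise use lower-tail values, P(Z < t) = P(Z > -t).
    return _upper_tail(-upper) - _upper_tail(-lower)
```

Because both parameters are positional-only, a call such as `normcdf(a=1.0)` raises `TypeError`, which mirrors the restriction noted above. For limits far in the right tail, e.g. `normcdf(10.0, 11.0)`, the result remains positive rather than being lost to rounding.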