This section provides a list of examples demonstrating the use of autodiff in common scenarios.
The examples assume that you imported NumPy and the package as follows:

import numpy as np
from autodiff.array import Function, var, d

The elementwise math helpers used in the examples (exp, maximum, norm, matmul) are assumed here to be importable from autodiff.array as well.
# This function can take any combination of
# literals and variables (both scalar and array).
def logistic(x, k):
    return 1 / (1 + exp(-k * x)) # scalars broadcast to arrays

k = var(4)
y = var(logistic(0, k))
print("y =", y()) # y = 0.5

x = var(np.array([-1, 0, 1]))
y = var(logistic(x, 4))
print("y =", y()) # y = [0.01798621 0.5 0.98201379]

k = var(np.array([4, 4, 4]))
y = var(logistic(x, k))
print("y =", y()) # y = [0.01798621 0.5 0.98201379]
initial = var(0)
state = initial
for i in range(10):
    state = var(state + 1) # evaluate to a NEW variable
print("state =", state()) # state = 10.0

f = Function(state)
f.pull_gradient_at(state)
print("∂state/∂initial =", d(initial)) # ∂state/∂initial = 1.0 (state = initial + 10)
For more details on when to use var, see Variables vs expressions.
# Caution: if statements are not differentiable
def bad_relu(x):
    return x if x > 0 else 0 # cannot differentiate this

def good_relu(x):
    return maximum(x, 0)

x = var(np.array([-1, 0, 1]))
y = var(good_relu(x))
print("y =", y()) # y = [0. 0. 1.]
You can, however, use conditionals to decide which expression to evaluate:
# Caution: if statements cannot depend on variables
x = var(1)
if x() > 0: # true
    y = var(x)
else:
    y = var(0) # never evaluated!
# from now on y = var(x)

f = Function(y)
x.set(-1)
f.evaluate()
print("y =", y()) # y = -1.0
Given a function y = m · x of a vector x and a matrix m, the following code computes the Jacobians of y with respect to both arguments:
x = var(np.array([1, 2, 3]))
m = var(np.array([[1, 2, 3], [4, 5, 6]]))
y = var(m @ x) # matrix-vector product
f = Function(y)
f.pull_gradient_at(y)
print("∂f/∂x =\n", d(x)) # ∂f/∂x =
# [[1. 2. 3.]
# [4. 5. 6.]]
print("∂f/∂m =\n", d(m)) # ∂f/∂m =
# [[1. 0. 2. 0. 3. 0.]
# [0. 1. 0. 2. 0. 3.]]
Note
During differentiation, matrices are flattened column-wise.
Therefore, d(m) returns a 2×6 matrix in this example.
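To see concretely what the column-wise flattening means, here is a plain NumPy sketch (independent of autodiff, using x_val for the value of x) that rebuilds ∂f/∂m entry by entry and reproduces the matrix printed above:

# Plain NumPy check: perturb each entry of m in column-major
# (Fortran) order and record the effect on y = m @ x.
x_val = np.array([1.0, 2.0, 3.0])
jac = np.zeros((2, 6))
for col, (i, j) in enumerate((i, j) for j in range(3) for i in range(2)):
    e = np.zeros((2, 3))
    e[i, j] = 1.0           # unit perturbation of m[i, j]
    jac[:, col] = e @ x_val # ∂(m @ x)/∂m[i, j]
print(jac)
# [[1. 0. 2. 0. 3. 0.]
#  [0. 1. 0. 2. 0. 3.]]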
For more details on the pull_gradient_at method, see Reverse-mode differentiation (aka backpropagation).
Given a scalar function f : R³ → R, x ↦ ||x||, the following code computes its gradient:
x = var(np.array([1, 2, 3]))
y = var(norm(x)) # L²-norm
f = Function(y) # f : R³ → R, x ↦ y = ||x||
f.pull_gradient_at(y)
print("∇f =", d(x)) # ∇f = [[0.26726124 0.53452248 0.80178373]]
Note that the gradient of a scalar function is a row vector, i.e. a 1×3 matrix in this example.
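As a quick sanity check, the gradient of the L²-norm is xᵀ/||x||, which you can reproduce with plain NumPy (no autodiff, using x_val for the value of x):

# Plain NumPy check: ∇||x|| = xᵀ / ||x||.
x_val = np.array([1.0, 2.0, 3.0])
print(x_val / np.linalg.norm(x_val)) # [0.26726124 0.53452248 0.80178373]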
For more details, see Reverse-mode differentiation (aka backpropagation).
x = var(np.array([1, 2, 3]))
y = var(np.array([4, 5, 6]))
z = var(x * y) # element-wise multiplication
f = Function(z)
z.set_derivative(np.ones((1, 3))) # seed: row of ones, i.e. the gradient of f = sum(z)
f.pull_gradient()
print("∇_x f =", d(x)) # ∇_x f = [[4. 5. 6.]]
print("∇_y f =", d(y)) # ∇_y f = [[1. 2. 3.]]
If you used pull_gradient_at(z) (as in the previous example), you would instead get the full 3×3 Jacobian matrices ∂z/∂x and ∂z/∂y.
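For reference, those full Jacobians are diagonal, and pulling the row-of-ones seed back through them reproduces the gradients printed above; a plain NumPy illustration (not using autodiff, with x_val and y_val standing in for the values of x and y):

x_val = np.array([1.0, 2.0, 3.0])
y_val = np.array([4.0, 5.0, 6.0])
seed = np.ones((1, 3))
print(np.diag(y_val))        # ∂z/∂x for z = x * y
print(np.diag(x_val))        # ∂z/∂y
print(seed @ np.diag(y_val)) # [[4. 5. 6.]], matches ∇_x f above
print(seed @ np.diag(x_val)) # [[1. 2. 3.]], matches ∇_y f above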
Given a function f and a direction δx, if you only need the Jacobian-vector product (∂f/∂x) · δx, then the following code using the push_tangent method is much more efficient than computing the full Jacobian matrix first and then multiplying it with δx:
# Directional derivative (Jacobian-vector product)
x = var(np.array([1, 2, 3]))
m = np.array([[1, 2, 3], [4, 5, 6]])
y = var(matmul(m, x)) # matrix-vector product
f = Function(y) # f : R³ → R², x ↦ y = m @ x
δx = np.array([1, 1, 1]) # direction vector
x.set_derivative(δx)
f.push_tangent()
print("δy =\n", d(y)) # δy =
# [[ 6.]
# [15.]]
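You can verify the printed tangent with plain NumPy: since f is linear here, the Jacobian-vector product is simply m @ δx (both defined above):

# Plain NumPy check (no autodiff): δy should equal m @ δx.
print(m @ δx) # [ 6 15]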
For more details, see Forward-mode differentiation.