Background
Backpropagation through random variables is no easy task. Two main methods are commonly adopted for derivative estimation: the score function (SF) estimator and the pathwise derivative estimator (see https://arxiv.org/abs/1506.05254 for more details). The former is widely used in reinforcement learning, while the pathwise derivative estimator appears frequently in variational-autoencoder-related models, where it is often referred to as the reparameterization trick. One of the key differences between the two methods is that the pathwise derivative estimator requires the derivative of the density function f(x;θ) with respect to the parameter, which in turn requires the sampling operation itself to have a gradient, whereas the SF estimator can bypass this calculation by using the log-derivative trick.
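For concreteness, here is a minimal NumPy sketch (not part of the proposal; f(x) = x² and the Gaussian mean μ are arbitrary choices) comparing the two estimators on a case where the true gradient 2μ is known in closed form:

```python
# Compare the score function estimator and the pathwise estimator for
# d/d_mu E[x^2] with x ~ N(mu, sigma^2); the true value is 2 * mu.
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, n = 1.5, 2.0, 1_000_000

eps = rng.standard_normal(n)
x = mu + sigma * eps          # reparameterized samples
f = x ** 2                    # function whose expectation we differentiate

# Score function (REINFORCE) estimator: E[f(x) * d/d_mu log p(x; mu, sigma)]
score = (x - mu) / sigma ** 2
sf_grad = np.mean(f * score)

# Pathwise (reparameterization) estimator: E[f'(x) * dx/d_mu] with dx/d_mu = 1
pathwise_grad = np.mean(2 * x)

print(sf_grad, pathwise_grad, 2 * mu)   # both estimates should be close to 3.0
```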
Proposal
I'm planning to prototype the pathwise gradient for some of the sampling methods in Deep Numpy (Gaussian and Gamma for now) by applying the following modifications (a sketch of the equivalent manual reparameterization follows the list):
Add a require_grads parameter to the Python frontend.
Add a backward function in the backend.
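To make the intended behavior concrete, here is a minimal sketch of the manual reparameterization that these changes would fold into the sampling operators themselves (written against today's mx.nd + autograd API, not the proposed interface):

```python
# Manual reparameterization: sample = loc + scale * eps with eps ~ N(0, 1) kept
# outside the gradient path, so gradients reach loc and scale through the sample.
import mxnet as mx
from mxnet import autograd

loc = mx.nd.array([0.5])
scale = mx.nd.array([1.2])
loc.attach_grad()
scale.attach_grad()

with autograd.record():
    eps = mx.nd.random.normal(shape=loc.shape)  # constant; no gradient flows into it
    sample = loc + scale * eps                  # differentiable w.r.t. loc and scale
    loss = (sample ** 2).sum()
loss.backward()

print(loc.grad, scale.grad)  # pathwise gradients: 2 * sample and 2 * sample * eps
```

With require_grads on the sampling call, this eps bookkeeping would happen inside the operator's backward function instead of in user code.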
If my experiment goes well, these enhanced sampling methods could serve as the foundation for the distribution module mentioned in the MXNet 2.0 Roadmap #16167.
Also, differentiable sampling has been available in both TensorFlow (tf.distributions) and PyTorch (torch.distributions) for many years; I think it is necessary for MXNet to have such a feature as well.
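For reference, a short snippet of what that looks like in PyTorch (illustrative only, not part of the proposal):

```python
# torch.distributions.Normal.rsample() draws a reparameterized sample, so the
# backward pass produces gradients for loc and scale.
import torch

loc = torch.zeros(1, requires_grad=True)
scale = torch.ones(1, requires_grad=True)

sample = torch.distributions.Normal(loc, scale).rsample()
loss = (sample ** 2).sum()
loss.backward()

print(loc.grad, scale.grad)
```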
Update:
Gradient for Gaussian sampling has been added and is under review: #16330
Next, I will try to implement a vanilla VAE demo based on it to find out if the interface is easy to use in practice.
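As a rough, hypothetical sketch (assuming #16330 makes mx.np.random.normal differentiable and that it accepts ndarray loc/scale), the part of the VAE demo that exercises the new interface would be the latent sampling step:

```python
# Hypothetical VAE latent-sampling step; mu/sigma stand in for encoder outputs
# and (z ** 2).mean() stands in for the decoder / reconstruction term.
from mxnet import autograd, np, npx
npx.set_np()

mu = np.zeros((4, 8))
sigma = np.ones((4, 8))
mu.attach_grad()
sigma.attach_grad()

with autograd.record():
    z = np.random.normal(mu, sigma)   # differentiable sample (the proposed behavior)
    recon = (z ** 2).mean()           # placeholder reconstruction term
    kl = 0.5 * (sigma ** 2 + mu ** 2 - 2 * np.log(sigma) - 1).mean()
    loss = recon + kl
loss.backward()

print(mu.grad.shape, sigma.grad.shape)
```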
Hey, this is the MXNet Label Bot.
Thank you for submitting the issue! I will try and suggest some labels so that the appropriate MXNet community members can help resolve it.
Here are my recommended label(s): Feature