bayesian_torch.layers module

A set of Bayesian neural network layers for performing stochastic variational inference

Layers

class BaseVariationalLayer_(torch.nn.Module)

Abstract base class that inherits from torch.nn.Module

kl_div(mu_q, sigma_q, mu_p, sigma_p)

Calculates the Kullback-Leibler divergence from normal distribution Q (parametrized by mu_q, sigma_q) to normal distribution P (parametrized by mu_p, sigma_p)

Parameters:
  • mu_q: torch.Tensor -> mu parameter of distribution Q
  • sigma_q: torch.Tensor -> sigma parameter of distribution Q
  • mu_p: float -> mu parameter of distribution P
  • sigma_p: float -> sigma parameter of distribution P
Returns

Scalar torch.Tensor (0-dimensional)
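
For reference, a short sketch of the standard closed-form expression this divergence corresponds to is given below; the element-wise formula is the usual Gaussian KL, and reducing the result to a 0-dimensional tensor with a mean is an assumption inferred from the documented return shape.

```python
import torch

def kl_div_sketch(mu_q, sigma_q, mu_p, sigma_p):
    # Element-wise closed-form KL( N(mu_q, sigma_q^2) || N(mu_p, sigma_p^2) ).
    kl = (torch.log(sigma_p / sigma_q)
          + (sigma_q ** 2 + (mu_q - mu_p) ** 2) / (2 * sigma_p ** 2)
          - 0.5)
    # Reduction to a 0-dimensional tensor is assumed from the documented return shape.
    return kl.mean()
```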

class LinearReparameterization

bayesian_torch.layers.LinearReparameterization(in_features, out_features, prior_mean, prior_variance, posterior_mu_init, posterior_rho_init, bias=True)

Parameters:

  • in_features: int -> size of each input sample,
  • out_features: int -> size of each output sample,
  • prior_mean: float -> mean of the prior arbitrary distribution to be used on the complexity cost,
  • prior_variance: float -> variance of the prior arbitrary distribution to be used on the complexity cost,
  • posterior_mu_init: float -> init trainable mu parameter representing mean of the approximate posterior,
  • posterior_rho_init: float -> init trainable rho parameter representing the sigma of the approximate posterior through softplus σ = log(1 + exp(ρ)),
  • bias: bool -> if set to False, the layer will not learn an additive bias. Default: True,

forward(X)

Samples the weights with reparameterization and performs torch.nn.functional.linear.

Parameters:
  • X: torch.Tensor with shape (batch_size, in_features)
Returns:
  • torch.Tensor with shape = (X.shape[0], out_features), float corresponding to the KL divergence from the sampled weights' distribution to the prior
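
A minimal usage sketch; the hyperparameter values and the ELBO-style loss below are illustrative choices, not prescribed by the layer.

```python
import torch
import torch.nn.functional as F
from bayesian_torch.layers import LinearReparameterization

layer = LinearReparameterization(
    in_features=128,
    out_features=64,
    prior_mean=0.0,
    prior_variance=1.0,
    posterior_mu_init=0.0,
    posterior_rho_init=-3.0,
)

x = torch.randn(32, 128)        # (batch_size, in_features)
out, kl = layer(x)              # out: (32, 64); kl: scalar KL term
# Typical ELBO-style objective: data term plus the KL term scaled by the number of samples.
loss = F.mse_loss(out, torch.zeros(32, 64)) + kl / x.shape[0]
loss.backward()
```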

class Conv1dReparameterization

bayesian_torch.layers.Conv1dReparameterization(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, prior_mean, prior_variance, posterior_mu_init, posterior_rho_init, bias=True)

Parameters:

  • in_channels: int -> number of channels in the input image,
  • out_channels: int -> number of channels produced by the convolution,
  • kernel_size: int -> size of the convolving kernel,
  • stride: int -> stride of the convolution. Default: 1,
  • padding: int -> zero-padding added to both sides of the input. Default: 0,
  • dilation: int -> spacing between kernel elements. Default: 1,
  • groups: int -> number of blocked connections from input channels to output channels,
  • prior_mean: float -> mean of the prior arbitrary distribution to be used on the complexity cost,
  • prior_variance: float -> variance of the prior arbitrary distribution to be used on the complexity cost,
  • posterior_mu_init: float -> init trainable mu parameter representing mean of the approximate posterior,
  • posterior_rho_init: float -> init trainable rho parameter representing the sigma of the approximate posterior through softplus σ = log(1 + exp(ρ)),
  • bias: bool -> if set to False, the layer will not learn an additive bias. Default: True,

forward(X)

Samples the weights with reparameterization and performs torch.nn.functional.conv1d. See the official PyTorch documentation for the output tensor shape.

Parameters:
  • X: torch.Tensor with shape (batch_size, C, H)
Returns:
  • torch.Tensor, float corresponding to the KL divergence from the sampled weights' distribution to the prior

class Conv2dReparameterization

bayesian_torch.layers.Conv2dReparameterization(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, prior_mean, prior_variance, posterior_mu_init, posterior_rho_init, bias=True)

Parameters:

  • in_channels: int -> number of channels in the input image,
  • out_channels: int -> number of channels produced by the convolution,
  • kernel_size: int -> size of the convolving kernel,
  • stride: int -> stride of the convolution. Default: 1,
  • padding: int -> zero-padding added to both sides of the input. Default: 0,
  • dilation: int -> spacing between kernel elements. Default: 1,
  • groups: int -> number of blocked connections from input channels to output channels,
  • prior_mean: float -> mean of the prior arbitrary distribution to be used on the complexity cost,
  • prior_variance: float -> variance of the prior arbitrary distribution to be used on the complexity cost,
  • posterior_mu_init: float -> init trainable mu parameter representing mean of the approximate posterior,
  • posterior_rho_init: float -> init trainable rho parameter representing the sigma of the approximate posterior through softplus σ = log(1 + exp(ρ)),
  • bias: bool -> if set to False, the layer will not learn an additive bias. Default: True,

forward(X)

Samples the weights with reparameterization and performs torch.nn.functional.conv2d. See the official PyTorch documentation for the output tensor shape.

Parameters:
  • X: torch.Tensor with shape (batch_size, C, H, W)
Returns:
  • torch.Tensor, float corresponding to the KL divergence from the sampled weights' distribution to the prior
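
A minimal usage sketch for image-like input; the hyperparameter values are illustrative.

```python
import torch
from bayesian_torch.layers import Conv2dReparameterization

layer = Conv2dReparameterization(
    in_channels=3,
    out_channels=16,
    kernel_size=3,
    stride=1,
    padding=1,
    prior_mean=0.0,
    prior_variance=1.0,
    posterior_mu_init=0.0,
    posterior_rho_init=-3.0,
)

x = torch.randn(8, 3, 32, 32)   # (batch_size, C, H, W)
out, kl = layer(x)              # out: (8, 16, 32, 32); kl: scalar KL term
```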

class Conv3dReparameterization

bayesian_torch.layers.Conv3dReparameterization(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, prior_mean, prior_variance, posterior_mu_init, posterior_rho_init, bias=True)

Parameters:

  • in_channels: int -> number of channels in the input image,
  • out_channels: int -> number of channels produced by the convolution,
  • kernel_size: int -> size of the convolving kernel,
  • stride: int -> stride of the convolution. Default: 1,
  • padding: int -> zero-padding added to both sides of the input. Default: 0,
  • dilation: int -> spacing between kernel elements. Default: 1,
  • groups: int -> number of blocked connections from input channels to output channels,
  • prior_mean: float -> mean of the prior arbitrary distribution to be used on the complexity cost,
  • prior_variance: float -> variance of the prior arbitrary distribution to be used on the complexity cost,
  • posterior_mu_init: float -> init trainable mu parameter representing mean of the approximate posterior,
  • posterior_rho_init: float -> init trainable rho parameter representing the sigma of the approximate posterior through softplus σ = log(1 + exp(ρ)),
  • bias: bool -> if set to False, the layer will not learn an additive bias. Default: True,

forward(X)

Samples the weights with reparameterization and performs torch.nn.functional.conv3d. See the official PyTorch documentation for the output tensor shape.

Parameters:
  • X: torch.Tensor with shape (batch_size, C, H, W, L)
Returns:
  • torch.Tensor, float corresponding to the KL divergence from the sampled weights' distribution to the prior

class ConvTranspose1dReparameterization

bayesian_torch.layers.ConvTranspose1dReparameterization(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, prior_mean, prior_variance, posterior_mu_init, posterior_rho_init, bias=True)

Parameters:

  • in_channels: int -> number of channels in the input image,
  • out_channels: int -> number of channels produced by the convolution,
  • kernel_size: int -> size of the convolving kernel,
  • stride: int -> stride of the convolution. Default: 1,
  • padding: int -> zero-padding added to both sides of the input. Default: 0,
  • dilation: int -> spacing between kernel elements. Default: 1,
  • groups: int -> number of blocked connections from input channels to output channels,
  • prior_mean: float -> mean of the prior arbitrary distribution to be used on the complexity cost,
  • prior_variance: float -> variance of the prior arbitrary distribution to be used on the complexity cost,
  • posterior_mu_init: float -> init trainable mu parameter representing mean of the approximate posterior,
  • posterior_rho_init: float -> init trainable rho parameter representing the sigma of the approximate posterior through softplus σ = log(1 + exp(ρ)),
  • bias: bool -> if set to False, the layer will not learn an additive bias. Default: True,

forward(X)

Samples the weights with reparameterization and performs torch.nn.functional.conv_transpose1d. See the official PyTorch documentation for the output tensor shape.

Parameters:
  • X: torch.Tensor with shape (batch_size, C, H)
Returns:
  • torch.Tensor, float corresponding to the KL divergence from the sampled weights' distribution to the prior

class ConvTranspose2dReparameterization

bayesian_torch.layers.ConvTranspose2dReparameterization(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, prior_mean, prior_variance, posterior_mu_init, posterior_rho_init, bias=True)

Parameters:

  • in_channels: int -> number of channels in the input image,
  • out_channels: int -> number of channels produced by the convolution,
  • kernel_size: int -> size of the convolving kernel,
  • stride: int -> stride of the convolution. Default: 1,
  • padding: int -> zero-padding added to both sides of the input. Default: 0,
  • dilation: int -> spacing between kernel elements. Default: 1,
  • groups: int -> number of blocked connections from input channels to output channels,
  • prior_mean: float -> mean of the prior arbitrary distribution to be used on the complexity cost,
  • prior_variance: float -> variance of the prior arbitrary distribution to be used on the complexity cost,
  • posterior_mu_init: float -> init trainable mu parameter representing mean of the approximate posterior,
  • posterior_rho_init: float -> init trainable rho parameter representing the sigma of the approximate posterior through softplus σ = log(1 + exp(ρ)),
  • bias: bool -> if set to False, the layer will not learn an additive bias. Default: True,

forward(X)

Samples the weights with reparameterization and performs torch.nn.functional.conv_transpose2d. See the official PyTorch documentation for the output tensor shape.

Parameters:
  • X: torch.Tensor with shape (batch_size, C, H, W)
Returns:
  • torch.Tensor, float corresponding to the KL divergence from the sampled weights' distribution to the prior
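
A minimal upsampling sketch; the hyperparameter values are illustrative, and the output spatial size follows the usual conv_transpose2d arithmetic.

```python
import torch
from bayesian_torch.layers import ConvTranspose2dReparameterization

layer = ConvTranspose2dReparameterization(
    in_channels=16,
    out_channels=8,
    kernel_size=2,
    stride=2,
    prior_mean=0.0,
    prior_variance=1.0,
    posterior_mu_init=0.0,
    posterior_rho_init=-3.0,
)

x = torch.randn(4, 16, 14, 14)  # (batch_size, C, H, W)
out, kl = layer(x)              # out: (4, 8, 28, 28); kl: scalar KL term
```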

class ConvTranspose3dReparameterization

bayesian_torch.layers.ConvTranspose3dReparameterization(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, prior_mean, prior_variance, posterior_mu_init, posterior_rho_init, bias=True)

Parameters:

  • in_channels: int -> number of channels in the input image,
  • out_channels: int -> number of channels produced by the convolution,
  • kernel_size: int -> size of the convolving kernel,
  • stride: int -> stride of the convolution. Default: 1,
  • padding: int -> zero-padding added to both sides of the input. Default: 0,
  • dilation: int -> spacing between kernel elements. Default: 1,
  • groups: int -> number of blocked connections from input channels to output channels,
  • prior_mean: float -> mean of the prior arbitrary distribution to be used on the complexity cost,
  • prior_variance: float -> variance of the prior arbitrary distribution to be used on the complexity cost,
  • posterior_mu_init: float -> init trainable mu parameter representing mean of the approximate posterior,
  • posterior_rho_init: float -> init trainable rho parameter representing the sigma of the approximate posterior through softplus σ = log(1 + exp(ρ)),
  • bias: bool -> if set to False, the layer will not learn an additive bias. Default: True,

forward(X)

Samples the weights with reparameterization and performs torch.nn.functional.conv_transpose3d. See the official PyTorch documentation for the output tensor shape.

Parameters:
  • X: torch.Tensor with shape (batch_size, C, H, W, L)
Returns:
  • torch.Tensor, float corresponding to the KL divergence from the sampled weights' distribution to the prior

class LSTMReparameterization

bayesian_torch.layers.LSTMReparameterization(in_features, out_features, prior_mean, prior_variance, posterior_mu_init, posterior_rho_init, bias=True)

Parameters:

  • in_features: int -> size of each input sample,
  • out_features: int -> size of each output sample,
  • prior_mean: float -> mean of the prior arbitrary distribution to be used on the complexity cost,
  • prior_variance: float -> variance of the prior arbitrary distribution to be used on the complexity cost,
  • posterior_mu_init: float -> init trainable mu parameter representing mean of the approximate posterior,
  • posterior_rho_init: float -> init trainable rho parameter representing the sigma of the approximate posterior through softplus σ = log(1 + exp(ρ)),
  • bias: bool -> if set to False, the layer will not learn an additive bias. Default: True,

forward(X, hidden_states=None)

Samples the weights with reparameterization and performs the LSTM feedforward operation.

Parameters:
  • X: torch.Tensor with shape (batch_size, seq_len, in_features)
  • hidden_states: None or tuple (torch.Tensor with shape = (X.shape[0], seq_len, out_features), torch.Tensor with shape = (X.shape[0], seq_len, out_features))
Returns:
  • tuple: (torch.Tensor with shape = (X.shape[0], seq_len, out_features), tuple (torch.Tensor with shape = (X.shape[0], seq_len, out_features), torch.Tensor with shape = (X.shape[0], seq_len, out_features))), float corresponding to the KL divergence from the sampled weights' distribution to the prior
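
A minimal usage sketch; the hyperparameter values are illustrative, and the unpacking of the return value follows the documented return structure.

```python
import torch
from bayesian_torch.layers import LSTMReparameterization

layer = LSTMReparameterization(
    in_features=32,
    out_features=64,
    prior_mean=0.0,
    prior_variance=1.0,
    posterior_mu_init=0.0,
    posterior_rho_init=-3.0,
)

x = torch.randn(8, 10, 32)            # (batch_size, seq_len, in_features)
(output, (h, c)), kl = layer(x)       # output: (8, 10, 64); kl: scalar KL term
```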

class LinearFlipout

bayesian_torch.layers.LinearFlipout(in_features, out_features, prior_mean, prior_variance, posterior_mu_init, posterior_rho_init, bias=True)

Parameters:

  • in_features: int -> size of each input sample,
  • out_features: int -> size of each output sample,
  • prior_mean: float -> mean of the prior arbitrary distribution to be used on the complexity cost,
  • prior_variance: float -> variance of the prior arbitrary distribution to be used on the complexity cost,
  • posterior_mu_init: float -> init trainable mu parameter representing mean of the approximate posterior,
  • posterior_rho_init: float -> init trainable rho parameter representing the sigma of the approximate posterior through softplus σ = log(1 + exp(ρ)),
  • bias: bool -> if set to False, the layer will not learn an additive bias. Default: True,

forward(X)

Samples the weights with flipout reparameterization and performs torch.nn.functional.linear.

Parameters:
  • X: torch.Tensor with shape (batch_size, in_features)
Returns:
  • torch.Tensor with shape = (X.shape[0], out_features), float corresponding to the KL divergence from the sampled weights' distribution to the prior
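
A sketch of drawing several stochastic forward passes to estimate a predictive mean and uncertainty; the hyperparameter values and the number of Monte Carlo samples are illustrative.

```python
import torch
from bayesian_torch.layers import LinearFlipout

layer = LinearFlipout(
    in_features=128,
    out_features=10,
    prior_mean=0.0,
    prior_variance=1.0,
    posterior_mu_init=0.0,
    posterior_rho_init=-3.0,
)

x = torch.randn(32, 128)                                  # (batch_size, in_features)
samples = torch.stack([layer(x)[0] for _ in range(20)])   # 20 stochastic forward passes
pred_mean, pred_std = samples.mean(dim=0), samples.std(dim=0)
```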

class Conv1dFlipout

bayesian_torch.layers.Conv1dFlipout(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, prior_mean, prior_variance, posterior_mu_init, posterior_rho_init, bias=True)

Parameters:

  • in_channels: int -> number of channels in the input image,
  • out_channels: int -> number of channels produced by the convolution,
  • kernel_size: int -> size of the convolving kernel,
  • stride: int -> stride of the convolution. Default: 1,
  • padding: int -> zero-padding added to both sides of the input. Default: 0,
  • dilation: int -> spacing between kernel elements. Default: 1,
  • groups: int -> number of blocked connections from input channels to output channels,
  • prior_mean: float -> mean of the prior arbitrary distribution to be used on the complexity cost,
  • prior_variance: float -> variance of the prior arbitrary distribution to be used on the complexity cost,
  • posterior_mu_init: float -> init trainable mu parameter representing mean of the approximate posterior,
  • posterior_rho_init: float -> init trainable rho parameter representing the sigma of the approximate posterior through softplus σ = log(1 + exp(ρ)),
  • bias: bool -> if set to False, the layer will not learn an additive bias. Default: True,

forward(X)

Samples the weights with flipout reparameterization and performs torch.nn.functional.conv1d. See the official PyTorch documentation for the output tensor shape.

Parameters:
  • X: torch.Tensor with shape (batch_size, C, H)
Returns:
  • torch.Tensor, float corresponding to the KL divergence from the sampled weights' distribution to the prior

class Conv2dFlipout

bayesian_torch.layers.Conv2dFlipout(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, prior_mean, prior_variance, posterior_mu_init, posterior_rho_init, bias=True)

Parameters:

  • in_channels: int -> number of channels in the input image,
  • out_channels: int -> number of channels produced by the convolution,
  • kernel_size: int -> size of the convolving kernel,
  • stride: int -> stride of the convolution. Default: 1,
  • padding: int -> zero-padding added to both sides of the input. Default: 0,
  • dilation: int -> spacing between kernel elements. Default: 1,
  • groups: int -> number of blocked connections from input channels to output channels,
  • prior_mean: float -> mean of the prior arbitrary distribution to be used on the complexity cost,
  • prior_variance: float -> variance of the prior arbitrary distribution to be used on the complexity cost,
  • posterior_mu_init: float -> init trainable mu parameter representing mean of the approximate posterior,
  • posterior_rho_init: float -> init trainable rho parameter representing the sigma of the approximate posterior through softplus σ = log(1 + exp(ρ)),
  • bias: bool -> if set to False, the layer will not learn an additive bias. Default: True,

forward(X)

Samples the weights with flipout reparameterization and performs torch.nn.functional.conv2d. See the official PyTorch documentation for the output tensor shape.

Parameters:
  • X: torch.Tensor with shape (batch_size, C, H, W)
Returns:
  • torch.Tensor, float corresponding to the KL divergence from the sampled weights' distribution to the prior
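
A sketch of composing flipout layers into a small model and accumulating the per-layer KL terms into a single loss; the architecture, hyperparameters, and loss scaling are illustrative choices.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from bayesian_torch.layers import Conv2dFlipout, LinearFlipout

class SmallBayesianNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = Conv2dFlipout(in_channels=3, out_channels=16, kernel_size=3, padding=1,
                                  prior_mean=0.0, prior_variance=1.0,
                                  posterior_mu_init=0.0, posterior_rho_init=-3.0)
        self.fc = LinearFlipout(in_features=16 * 32 * 32, out_features=10,
                                prior_mean=0.0, prior_variance=1.0,
                                posterior_mu_init=0.0, posterior_rho_init=-3.0)

    def forward(self, x):
        x, kl_conv = self.conv(x)
        x = F.relu(x).flatten(1)
        x, kl_fc = self.fc(x)
        return x, kl_conv + kl_fc   # accumulate the KL contributions of all Bayesian layers

model = SmallBayesianNet()
images = torch.randn(8, 3, 32, 32)
labels = torch.randint(0, 10, (8,))
logits, kl = model(images)
loss = F.cross_entropy(logits, labels) + kl / images.shape[0]  # ELBO-style objective
```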

class Conv3dFlipout

bayesian_torch.layers.Conv3dFlipout(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, prior_mean, prior_variance, posterior_mu_init, posterior_rho_init, bias=True)

Parameters:

  • in_channels: int -> number of channels in the input image,
  • out_channels: int -> number of channels produced by the convolution,
  • kernel_size: int -> size of the convolving kernel,
  • stride: int -> stride of the convolution. Default: 1,
  • padding: int -> zero-padding added to both sides of the input. Default: 0,
  • dilation: int -> spacing between kernel elements. Default: 1,
  • groups: int -> number of blocked connections from input channels to output channels,
  • prior_mean: float -> mean of the prior arbitrary distribution to be used on the complexity cost,
  • prior_variance: float -> variance of the prior arbitrary distribution to be used on the complexity cost,
  • posterior_mu_init: float -> init trainable mu parameter representing mean of the approximate posterior,
  • posterior_rho_init: float -> init trainable rho parameter representing the sigma of the approximate posterior through softplus σ = log(1 + exp(ρ)),
  • bias: bool -> if set to False, the layer will not learn an additive bias. Default: True,

forward(X)

Samples the weights with flipout reparameterization and performs torch.nn.functional.conv3d. See the official PyTorch documentation for the output tensor shape.

Parameters:
  • X: torch.Tensor with shape (batch_size, C, H, W, L)
Returns:
  • torch.Tensor, float corresponding to the KL divergence from the sampled weights' distribution to the prior

class ConvTranspose1dFlipout

bayesian_torch.layers.ConvTranspose1dFlipout(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, prior_mean, prior_variance, posterior_mu_init, posterior_rho_init, bias=True)

Parameters:

  • in_channels: int -> number of channels in the input image,
  • out_channels: int -> number of channels produced by the convolution,
  • kernel_size: int -> size of the convolving kernel,
  • stride: int -> stride of the convolution. Default: 1,
  • padding: int -> zero-padding added to both sides of the input. Default: 0,
  • dilation: int -> spacing between kernel elements. Default: 1,
  • groups: int -> number of blocked connections from input channels to output channels,
  • prior_mean: float -> mean of the prior arbitrary distribution to be used on the complexity cost,
  • prior_variance: float -> variance of the prior arbitrary distribution to be used on the complexity cost,
  • posterior_mu_init: float -> init trainable mu parameter representing mean of the approximate posterior,
  • posterior_rho_init: float -> init trainable rho parameter representing the sigma of the approximate posterior through softplus σ = log(1 + exp(ρ)),
  • bias: bool -> if set to False, the layer will not learn an additive bias. Default: True,

forward(X)

Samples the weights with flipout reparameterization and performs torch.nn.functional.conv_transpose1d. See the official PyTorch documentation for the output tensor shape.

Parameters:
  • X: torch.Tensor with shape (batch_size, C, H)
Returns:
  • torch.Tensor, float corresponding to the KL divergence from the sampled weights' distribution to the prior

class ConvTranspose2dFlipout

bayesian_torch.layers.ConvTranspose2dFlipout(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, prior_mean, prior_variance, posterior_mu_init, posterior_rho_init, bias=True)

Parameters:

  • in_channels: int -> number of channels in the input image,
  • out_channels: int -> number of channels produced by the convolution,
  • kernel_size: int -> size of the convolving kernel,
  • stride: int -> stride of the convolution. Default: 1,
  • padding: int -> zero-padding added to both sides of the input. Default: 0,
  • dilation: int -> spacing between kernel elements. Default: 1,
  • groups: int -> number of blocked connections from input channels to output channels,
  • prior_mean: float -> mean of the prior arbitrary distribution to be used on the complexity cost,
  • prior_variance: float -> variance of the prior arbitrary distribution to be used on the complexity cost,
  • posterior_mu_init: float -> init trainable mu parameter representing mean of the approximate posterior,
  • posterior_rho_init: float -> init trainable rho parameter representing the sigma of the approximate posterior through softplus σ = log(1 + exp(ρ)),
  • bias: bool -> if set to False, the layer will not learn an additive bias. Default: True,

forward(X)

Samples the weights with flipout reparameterization and performs torch.nn.functional.conv_transpose2d. See the official PyTorch documentation for the output tensor shape.

Parameters:
  • X: torch.Tensor with shape (batch_size, C, H, W)
Returns:
  • torch.Tensor, float corresponding to the KL divergence from the sampled weights' distribution to the prior

class ConvTranspose3dFlipout

bayesian_torch.layers.ConvTranspose3dFlipout(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, prior_mean, prior_variance, posterior_mu_init, posterior_rho_init, bias=True)

Parameters:

  • in_channels: int -> number of channels in the input image,
  • out_channels: int -> number of channels produced by the convolution,
  • kernel_size: int -> size of the convolving kernel,
  • stride: int -> stride of the convolution. Default: 1,
  • padding: int -> zero-padding added to both sides of the input. Default: 0,
  • dilation: int -> spacing between kernel elements. Default: 1,
  • groups: int -> number of blocked connections from input channels to output channels,
  • prior_mean: float -> mean of the prior arbitrary distribution to be used on the complexity cost,
  • prior_variance: float -> variance of the prior arbitrary distribution to be used on the complexity cost,
  • posterior_mu_init: float -> init trainable mu parameter representing mean of the approximate posterior,
  • posterior_rho_init: float -> init trainable rho parameter representing the sigma of the approximate posterior through softplus σ = log(1 + exp(ρ)),
  • bias: bool -> if set to False, the layer will not learn an additive bias. Default: True,

forward(X)

Samples the weights with flipout reparameterization and performs torch.nn.functional.conv_transpose3d. See the official PyTorch documentation for the output tensor shape.

Parameters:
  • X: torch.Tensor with shape (batch_size, C, H, W, L)
Returns:
  • torch.Tensor, float corresponding to the KL divergence from the sampled weights' distribution to the prior

class LSTMFlipout

bayesian_torch.layers.LSTMFlipout(in_features, out_features, prior_mean, prior_variance, posterior_mu_init, posterior_rho_init, bias=True)

Parameters:

  • in_features: int -> size of each input sample,
  • out_features: int -> size of each output sample,
  • prior_mean: float -> mean of the prior arbitrary distribution to be used on the complexity cost,
  • prior_variance: float -> variance of the prior arbitrary distribution to be used on the complexity cost,
  • posterior_mu_init: float -> init trainable mu parameter representing mean of the approximate posterior,
  • posterior_rho_init: float -> init trainable rho parameter representing the sigma of the approximate posterior through softplus σ = log(1 + exp(ρ)),
  • bias: bool -> if set to False, the layer will not learn an additive bias. Default: True,

forward(X, hidden_states=None)

Samples the weights with flipout reparameterization and performs the LSTM feedforward operation.

Parameters:
  • X: torch.Tensor with shape (batch_size, seq_len, in_features)
  • hidden_states: None or tuple (torch.Tensor with shape = (X.shape[0], seq_len, out_features), torch.Tensor with shape = (X.shape[0], seq_len, out_features))
Returns:
  • tuple: (torch.Tensor with shape = (X.shape[0], seq_len, out_features), tuple (torch.Tensor with shape = (X.shape[0], seq_len, out_features), torch.Tensor with shape = (X.shape[0], seq_len, out_features))), float corresponding to the KL divergence from the sampled weights' distribution to the prior