The initial values of g and b are chosen to keep the pre-activation values normally distributed. After the tf.assign operations on g and b, the output of the current conv2d layer changes, and so does the input to the next layer. I think the initialization of g and b for the next layer should depend on this new conv2d output, not on the output computed before the assignment.
So I think the customized conv2d in nn.py should be modified as follows:
def conv2d(x_, num_filters, filter_size=[3, 3], stride=[1, 1], pad='SAME', nonlinearity=None, init_scale=1., counters={},
           init=False, ema=None, **kwargs):
    ''' convolutional layer '''
    name = get_name('conv2d', counters)
    with tf.variable_scope(name):
        # the layer input is x_, so the channel count must be read from x_ (x is not defined yet)
        V = get_var_maybe_avg('V', ema, shape=filter_size + [int(x_.get_shape()[-1]), num_filters], dtype=tf.float32,
                              initializer=tf.random_normal_initializer(0, 0.05), trainable=True)
        g = get_var_maybe_avg('g', ema, shape=[num_filters], dtype=tf.float32,
                              initializer=tf.constant_initializer(1.), trainable=True)
        b = get_var_maybe_avg('b', ema, shape=[num_filters], dtype=tf.float32,
                              initializer=tf.constant_initializer(0.), trainable=True)

        # use weight normalization (Salimans & Kingma, 2016)
        W = tf.reshape(g, [1, 1, 1, num_filters]) * tf.nn.l2_normalize(V, [0, 1, 2])

        # calculate convolutional layer output
        x = tf.nn.bias_add(tf.nn.conv2d(x_, W, [1] + stride + [1], pad), b)

        if init:  # normalize x
            m_init, v_init = tf.nn.moments(x, [0, 1, 2])
            scale_init = init_scale / tf.sqrt(v_init + 1e-10)
            with tf.control_dependencies([g.assign(g * scale_init), b.assign_add(-m_init * scale_init)]):
                # x = tf.identity(x)  # previous behaviour: forwards the output computed before the assignment
                # proposed change: recompute W and x so the next layer sees the re-initialized output
                W = tf.reshape(g, [1, 1, 1, num_filters]) * tf.nn.l2_normalize(V, [0, 1, 2])
                x = tf.nn.bias_add(tf.nn.conv2d(x_, W, [1] + stride + [1], pad), b)

        # apply nonlinearity
        if nonlinearity is not None:
            x = nonlinearity(x)

        return x
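To illustrate why this matters, here is a minimal pure-Python sketch, not the nn.py code itself. It assumes a toy scalar "layer" with one weight v, one gain g, and one bias b; the helper names wn_layer and data_dependent_init are hypothetical. It compares initializing the second layer from the first layer's output computed before the assignment (stale) with initializing it from the recomputed output (fresh).

```python
import random
import statistics


def wn_layer(xs, v, g, b):
    """Toy scalar weight-normalized layer: y = g * (v / |v|) * x + b."""
    w = g * v / abs(v)
    return [w * x + b for x in xs]


def data_dependent_init(xs, v, init_scale=1.0):
    """Return (g, b) that give the layer output on xs mean 0 and std init_scale."""
    pre = wn_layer(xs, v, g=1.0, b=0.0)      # output with the default g=1, b=0
    m_init = statistics.mean(pre)
    v_init = statistics.pvariance(pre)
    scale_init = init_scale / (v_init + 1e-10) ** 0.5
    return scale_init, -m_init * scale_init  # same update as g*scale, b - m*scale


random.seed(0)
batch = [random.gauss(3.0, 5.0) for _ in range(1000)]  # un-normalized input
v1, v2 = 0.05, -0.02                                   # arbitrary toy weights

# Layer 1: data-dependent init, then two candidate outputs to pass on.
g1, b1 = data_dependent_init(batch, v1)
stale_out1 = wn_layer(batch, v1, 1.0, 0.0)  # computed before g1, b1 were assigned
fresh_out1 = wn_layer(batch, v1, g1, b1)    # recomputed after the assignment

# Layer 2: initialize from the stale output vs. the fresh output.
g2_stale, b2_stale = data_dependent_init(stale_out1, v2)
g2_fresh, b2_fresh = data_dependent_init(fresh_out1, v2)

# After initialization, layer 2 always receives layer 1's re-initialized output.
out_stale = wn_layer(fresh_out1, v2, g2_stale, b2_stale)
out_fresh = wn_layer(fresh_out1, v2, g2_fresh, b2_fresh)

print("stale init: mean=%.3f std=%.3f"
      % (statistics.mean(out_stale), statistics.pstdev(out_stale)))
print("fresh init: mean=%.3f std=%.3f"
      % (statistics.mean(out_fresh), statistics.pstdev(out_fresh)))
```

In this toy setting, only the layer initialized from the recomputed output ends up with zero mean and unit standard deviation on the data it actually receives afterwards; the one initialized from the stale output is scaled and shifted for statistics it never sees again.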