about transformer position_wise_feed_forward #98

Closed · Continue7777 opened this issue Dec 1, 2018 · 1 comment

Comments
@Continue7777

Recently, I did some experiments with BERT and the Transformer on text classification. I noticed that the position-wise feed-forward network always consists of two linear transformations with a ReLU activation in between, but you use conv. Is there some special thinking behind this change?
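
For reference, the formulation in "Attention Is All You Need" is FFN(x) = max(0, xW1 + b1)W2 + b2. A minimal sketch of that dense-layer version in the same TF 1.x style (function and argument names here are illustrative, not from this repo):

import tensorflow as tf

def position_wise_ffn_dense(x, d_model, d_ff):
    # x: [batch, sequence_length, d_model]; tf.layers.dense acts on the last
    # axis, so each position is transformed independently.
    hidden = tf.layers.dense(x, d_ff, activation=tf.nn.relu, name="ffn_hidden")   # d_model -> d_ff
    return tf.layers.dense(hidden, d_model, activation=None, name="ffn_output")  # d_ff -> d_model

The conv version in this repo is quoted below: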

def position_wise_feed_forward_fn(self):
    """
    x: [batch, sequence_length, d_model]
    :return: [batch, sequence_length, d_model]
    """
    # 1. first conv: a conv2d with kernel [1, d_model] over the expanded input
    #    [batch, sequence_length, d_model, 1] acts as a per-position linear map
    #    d_model -> d_ff, followed by ReLU.
    input = tf.expand_dims(self.x, axis=3)  # [batch, sequence_length, d_model, 1]
    output_conv1 = tf.layers.conv2d(
        input, filters=self.d_ff, kernel_size=[1, self.d_model], padding="VALID",
        name='conv1', kernel_initializer=self.initializer, activation=tf.nn.relu
    )  # [batch, sequence_length, 1, d_ff]
    output_conv1 = tf.transpose(output_conv1, [0, 1, 3, 2])  # [batch, sequence_length, d_ff, 1]
    # 2. second conv: per-position linear map d_ff -> d_model, no activation.
    output_conv2 = tf.layers.conv2d(
        output_conv1, filters=self.d_model, kernel_size=[1, self.d_ff], padding="VALID",
        name='conv2', kernel_initializer=self.initializer, activation=None
    )  # [batch, sequence_length, 1, d_model]
    output = tf.squeeze(output_conv2, axis=2)  # squeeze only axis 2, so batch=1 is safe
    return output  # [batch, sequence_length, d_model]

@brightmart
Owner

Hi, I did see the above setting in BERT. I use conv following the Transformer implementation in tensor2tensor. We thought it might have fewer parameters.
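
For comparison, a per-position convolution with kernel size 1 computes exactly the same transformation as the dense version; a minimal sketch, again with illustrative names and assuming TF 1.x:

import tensorflow as tf

def position_wise_ffn_conv(x, d_model, d_ff):
    # x: [batch, sequence_length, d_model]; kernel_size=1 means the convolution
    # only mixes the feature dimension and never looks across positions.
    hidden = tf.layers.conv1d(x, filters=d_ff, kernel_size=1, activation=tf.nn.relu, name="ffn_conv1")
    return tf.layers.conv1d(hidden, filters=d_model, kernel_size=1, activation=None, name="ffn_conv2")

Note that with kernel size 1 the conv and dense variants have identical parameter counts (d_model*d_ff + d_ff for the first layer, d_ff*d_model + d_model for the second); a wider kernel would mix neighboring positions and increase the parameter count.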
