-
Notifications
You must be signed in to change notification settings - Fork 254
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support list/tuple inputs for special tokens in MultiSegmentPacker layer #1046
Support list/tuple inputs for special tokens in MultiSegmentPacker layer #1046
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks great! just minor comments
@@ -53,12 +53,16 @@ class MultiSegmentPacker(keras.layers.Layer): | |||
|
|||
Args: | |||
sequence_length: The desired output length. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's add the type notes you have in the other layer for consistency
dtype of the input tensors to the layer. | ||
end_value: The id(s) or token(s) that is/are to be placed at the end of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is/are -> are
dtype of the input tensors to the layer. | ||
end_value: The id(s) or token(s) that is/are to be placed at the end of | ||
the last input segment (called "[SEP]" for BERT). The dtype much |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
much -> must
sep_value: The id(s) or token(s) that is/are to be placed at the end of | ||
every segment, except the last segment (called "[SEP]" for BERT). | ||
If `None`, `end_value` is used. The dtype much match the dtype of | ||
the input tensors to the layer. | ||
pad_value: The id or token that is to be placed into the unused | ||
positions after the last segment in the sequence | ||
(called "[PAD]" for BERT). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we add an example below? maybe roberta double sep?
start_column = tf.fill((batch_size, 1), start_value) | ||
end_column = tf.fill((batch_size, 1), end_value) | ||
ones_column = tf.ones_like(start_column, dtype=tf.int32) | ||
start_values_tensor = tf.repeat( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these names are a little confusing start_value
is already a tensor. should we co back to _column
naming?
start_column
, end_column
, sep_column
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can, but it isn't exactly a column :P. I'll call it start_columns
( | ||
[ | ||
[ | ||
"[CLS]", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
try to come up with a slightly shorter test case that will format the lists to one line
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whoops!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very nice! Thank you.
/gcbrun |
No description provided.